glencoesoftware / bioformats2raw

Bio-Formats image file format to raw format converter
GNU General Public License v2.0

Batch processing #206

Open Pablo1990 opened 1 year ago

Pablo1990 commented 1 year ago

Hi!

Thank you for this package! Is it possible to batch process a folder from the command line, without having to write a wrapper script in Bash or Python?

I was expecting something like this:

bioformats2raw inputFolder outputFolder

It would then go through all the files in the folder and export each one to a matching Zarr output (e.g., file1.tiff to file1.zarr).

Many thanks, Pablo

melissalinkert commented 1 year ago

We don't have plans to implement batch processing here in the near future. NGFF-Converter may be of interest, as that does allow defining a whole queue of files to convert (including by dragging and dropping folders).

Many of the formats supported by Bio-Formats/bioformats2raw involve multiple files - in the case of HCS data, a single dataset may have thousands of files. Adding batch processing to bioformats2raw would require an additional "scanning" step to build a minimal list of datasets that represent all files in the selected folder. For many input formats, simply iterating over all files in a folder is likely to result in duplicate conversions.
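To illustrate why plain iteration over a folder is not enough, here is a minimal Python sketch of such a "scanning" step. It is not how bioformats2raw, OMERO, or Bio-Formats implement it. The `used_files` callback is hypothetical: it stands in for whatever mechanism reports all files belonging to the dataset that a given file opens. The sketch also assumes the usual `bioformats2raw <input> <output>` invocation and the `file1.tiff` to `file1.zarr` naming from the request above.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def minimal_datasets(
    folder: str,
    used_files: Callable[[Path], Iterable[Path]],
) -> List[Path]:
    """Return one entry file per dataset found directly inside `folder`.

    `used_files` is a hypothetical callback: given one file, it returns every
    file that belongs to the same dataset (including the file itself).
    """
    claimed = set()   # files already covered by an earlier dataset
    datasets = []     # one representative file per dataset
    for path in sorted(Path(folder).resolve().iterdir()):
        if not path.is_file() or path in claimed:
            continue  # skip subfolders and files that would be duplicates
        datasets.append(path)
        claimed.add(path)
        claimed.update(Path(p).resolve() for p in used_files(path))
    return datasets


def plan_conversions(
    folder: str,
    output_folder: str,
    used_files: Callable[[Path], Iterable[Path]],
) -> List[List[str]]:
    """Build one command line per dataset, without running anything."""
    out = Path(output_folder)
    return [
        # Assumed invocation form: bioformats2raw <input file> <output dir>
        ["bioformats2raw", str(entry), str(out / (entry.stem + ".zarr"))]
        for entry in minimal_datasets(folder, used_files)
    ]
```

Each planned command could then be passed to `subprocess.run(cmd, check=True)`. Note that the callback has to open or inspect every unclaimed file, which is where the cost of scanning comes from, and that the result depends on the callback reporting the same file group no matter which member of a dataset is visited first.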

We have implemented a similar "scanning" step for OMERO import and internal Bio-Formats testing, but in both cases there is a noticeable performance impact. We'll need to think about whether it makes sense to add this feature to bioformats2raw as well.