If we have one 'family' of files (all files that must be transferred together such that each file in the group is only filed once during processing), then we want to run all parsers on those files in parallel. Right now the files are transferred in parallel, but the parsers are applied serially. The parsers are lightweight, so the overhead for a single family is small, but it could easily add hours to total processing time when we have tens of millions of groups (e.g., MDF).
Note: this should only be used if Xtract gets a local mode. Currently, via funcX, it makes no sense to do this because each core/worker is already processing a file.
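
To make the proposal concrete, here is a minimal sketch of what parallel parsing of one family could look like in a local mode. It is not Xtract's actual code: `run_parsers_on_family`, the `parsers` dict, and the two toy parsers are hypothetical names made up for illustration, and it assumes each parser is a plain callable that takes a file path and returns metadata.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def extension_parser(path):
    # Hypothetical toy parser: returns the file extension.
    return os.path.splitext(path)[1]


def name_length_parser(path):
    # Hypothetical toy parser: returns the length of the file name.
    return len(os.path.basename(path))


def run_parsers_on_family(family, parsers, max_workers=4):
    """Apply every parser to every file in one family concurrently.

    family:  list of file paths that belong together (assumed already local).
    parsers: dict mapping a parser name to a callable(path) -> metadata.
    Returns {path: {parser_name: metadata}}.
    """
    results = {path: {} for path in family}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per (file, parser) pair instead of looping serially.
        futures = {
            pool.submit(parse, path): (path, name)
            for path in family
            for name, parse in parsers.items()
        }
        for future, (path, name) in futures.items():
            # result() blocks until that task finishes and re-raises its errors.
            results[path][name] = future.result()
    return results


if __name__ == "__main__":
    family = ["data/run1.csv", "data/notes.txt"]
    parsers = {"extension": extension_parser, "name_length": name_length_parser}
    print(run_parsers_on_family(family, parsers))
```

Threads are used here only to keep the sketch short. If the parsers turn out to be CPU-bound, `concurrent.futures.ProcessPoolExecutor` offers the same `submit` interface and would sidestep the GIL, at the cost of requiring picklable parsers and arguments.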