dandi / dandisets-healthstatus

Healthchecks of dandisets and support libraries (pynwb and matnwb)

Add parallelization across dandisets #9

Closed (yarikoptic closed this issue 1 year ago)

yarikoptic commented 1 year ago

Because locking at the fsspec/datalad-fuse level effectively serializes processing within each dandiset, I think we should parallelize across dandisets. That way, even with a per-dandiset lock at the fsspec level, we would still run in parallel across dandisets and keep the overall run time reasonable.
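A minimal sketch of this idea, not the project's actual code: each dandiset is handled by one worker that goes through its assets serially under its own lock, while a thread pool runs several dandisets at once. The names `check_dandiset` and `check_all`, the `"ok"` placeholder result, and the plain `threading.Lock` standing in for the fsspec/datalad-fuse lock are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def check_dandiset(dandiset_id, assets):
    """Hypothetical per-dandiset worker: assets are processed one at a time."""
    # Stand-in for the per-dandiset lock held at the fsspec/datalad-fuse level.
    lock = threading.Lock()
    results = []
    for asset in assets:
        with lock:
            # Placeholder for the real health check of a single asset.
            results.append((dandiset_id, asset, "ok"))
    return results


def check_all(dandisets, workers):
    """Run check_dandiset for several dandisets concurrently.

    `dandisets` maps a dandiset ID to its list of asset paths;
    `workers` is how many dandisets are processed at once.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            dandiset_id: pool.submit(check_dandiset, dandiset_id, assets)
            for dandiset_id, assets in dandisets.items()
        }
        # Collect results per dandiset once each worker finishes.
        return {dandiset_id: fut.result() for dandiset_id, fut in futures.items()}
```

With `workers=1` this degenerates to fully serial processing; a process pool or async tasks would be alternative designs if the checks turn out to be CPU-bound rather than I/O-bound.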

jwodder commented 1 year ago

@yarikoptic How many Dandisets should be processed at once?

jwodder commented 1 year ago

@yarikoptic Ping.

yarikoptic commented 1 year ago

It should be a CLI option. I don't know what the optimal value would be; maybe we could scale to the number of CPUs or above! As for the default, I don't care; it could be 1 (no parallelization).
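One way such an option could look, as a sketch only: the flag name `--dandiset-jobs`, its short form `-J`, and the use of `argparse` are assumptions, since the thread does not say which CLI library or option name the project uses. The default is 1, matching the "no parallelization" suggestion, and `os.cpu_count()` is mentioned in the help text as a starting point for scaling.

```python
import argparse
import os


def positive_int(value):
    """Parse a command-line value as an integer >= 1."""
    n = int(value)
    if n < 1:
        raise argparse.ArgumentTypeError("must be >= 1")
    return n


def build_parser():
    # Hypothetical program and option names, chosen for illustration only.
    parser = argparse.ArgumentParser(prog="healthstatus-sketch")
    cpus = os.cpu_count() or 1
    parser.add_argument(
        "-J",
        "--dandiset-jobs",
        type=positive_int,
        default=1,  # 1 means no parallelization across dandisets
        help=f"number of dandisets to process at once (this machine has {cpus} CPUs)",
    )
    return parser


def parse_jobs(argv):
    """Return the requested number of parallel dandiset jobs."""
    return build_parser().parse_args(argv).dandiset_jobs
```

For example, `parse_jobs([])` yields the default of 1, and `parse_jobs(["--dandiset-jobs", "4"])` yields 4.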