dandi / dandisets-healthstatus

Healthchecks of dandisets and support libraries (pynwb and matnwb)

Operate on one asset per Dandiset at a time #28

Closed jwodder closed 1 year ago

jwodder commented 1 year ago

Closes #24.

This PR includes #26.

This PR takes the easy approach of just setting the number of per-Dandiset workers to one. There will still be multiple tasks doing I/O on a given Dandiset at once (one task iterating over directories, the other running one test on one asset at a time). This also makes it easy to revert to processing multiple assets at once in the future by just setting the variable to a larger number again.
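For illustration only, the idea can be sketched roughly as below. This is not the project's actual code: `WORKERS_PER_DANDISET`, `process_dandiset`, and `demo` are hypothetical names, and plain `asyncio` is assumed as the concurrency layer. One producer task stands in for the directory iteration, and a configurable number of workers (here one) run the tests.

```python
import asyncio

# Hypothetical name; the PR only says "the number of per-Dandiset workers".
WORKERS_PER_DANDISET = 1


async def process_dandiset(assets, test_asset, workers=WORKERS_PER_DANDISET):
    """Run test_asset on every asset using at most `workers` concurrent tasks."""
    queue = asyncio.Queue(maxsize=workers)

    async def producer():
        # Stands in for the task that iterates over directories.
        for asset in assets:
            await queue.put(asset)
        # One sentinel per worker signals that no more assets are coming.
        for _ in range(workers):
            await queue.put(None)

    async def worker():
        while True:
            asset = await queue.get()
            if asset is None:
                break
            await test_asset(asset)

    await asyncio.gather(producer(), *(worker() for _ in range(workers)))


async def demo(workers=WORKERS_PER_DANDISET):
    """Return the peak number of assets under test at the same moment."""
    active = 0
    peak = 0

    async def fake_test(asset):
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # pretend to do I/O
        active -= 1

    await process_dandiset([f"asset-{i}" for i in range(5)], fake_test, workers)
    return peak
```

With the default of one worker, `asyncio.run(demo())` should never observe more than one asset under test at a time; raising the number restores parallel processing without any other change.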

yarikoptic commented 1 year ago

Looked at https://github.com/dandi/dandisets-healthstatus/pull/28/commits/e156f7b072412df9bb2409d3c46776ff02e4b8f9 1-liner.

I will merge it, but could you please check, e.g. on a choice of 2 dandisets with stock fsspec/datalad-fuse, that we are doing OK and not trying to open multiple assets/files simultaneously within each of those dandisets?

jwodder commented 1 year ago

@yarikoptic How exactly should I check that?

yarikoptic commented 1 year ago

I would have looked at the logs (first ensuring that I do log) to confirm that we do not start testing/opening a new asset in the dandiset before we are done with the previous one. Sure thing, we could also strace the process on open/close, but it would be too noisy I guess.
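A rough sketch of such a log check, under a loudly stated assumption: the log format here (`START <dandiset> <asset>` and `END <dandiset> <asset>` lines) and the function name `find_overlaps` are invented for illustration and are not what the tool actually emits.

```python
def find_overlaps(lines):
    """Report assets that started while another asset in the same dandiset was still running.

    Assumes a hypothetical log format of "START <dandiset> <asset>" and
    "END <dandiset> <asset>" lines; anything else is ignored.
    """
    in_progress = {}  # dandiset -> asset currently being tested
    overlaps = []
    for line in lines:
        parts = line.split(maxsplit=2)
        if len(parts) != 3 or parts[0] not in ("START", "END"):
            continue
        event, dandiset, asset = parts
        if event == "START":
            if dandiset in in_progress:
                # A new asset began before the previous one finished.
                overlaps.append((dandiset, in_progress[dandiset], asset))
            in_progress[dandiset] = asset
        elif in_progress.get(dandiset) == asset:
            del in_progress[dandiset]
    return overlaps
```

An empty result would mean no two assets of the same dandiset were ever in progress at once; assets from different dandisets may still interleave freely.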

jwodder commented 1 year ago

@yarikoptic Asset testing appears to be properly non-parallel now.