dandi / dandi-cli

DANDI command line client to facilitate common operations
https://dandi.readthedocs.io/
Apache License 2.0

hidden resource fork files break organize command on mac with external drive #61

Closed bendichter closed 4 years ago

bendichter commented 4 years ago

Mac users with an external drive are likely to have hidden "resource fork" files with the prefix "._". These files are invisible and can normally be safely ignored, but they break the organize command. They are difficult for Mac users to identify, as they are hidden in Finder and only findable with ls -a on the command line. I think it would be better to ignore them on organize. If not, then maybe we could throw an error right away, instead of waiting for all of the metadata to be gathered (18 minutes in this case) before throwing the error.

(base) Bens-MacBook-Pro-2:dandi_staging bendichter$ dandi organize -d 000008 /Volumes/easystore5T/data/Tolias/nwb -f symlink
2020-03-17 14:43:47,494 [    INFO] Loading metadata from 1319 files
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done   2 tasks      | elapsed:    9.1s
[Parallel(n_jobs=-1)]: Done   9 tasks      | elapsed:   10.4s
...
[Parallel(n_jobs=-1)]: Done 1285 tasks      | elapsed: 18.1min
[Parallel(n_jobs=-1)]: Done 1319 out of 1319 | elapsed: 18.6min finished
2020-03-17 15:02:25,252 [ WARNING] Completely empty record for /Volumes/easystore5T/data/Tolias/nwb/._20171204_sample_2.nwb
Error: 1 out of 1319 files were found not containing all necessary metadata: /Volumes/easystore5T/data/Tolias/nwb/._20171204_sample_2.nwb
bendichter commented 4 years ago

It looks like the caching saved me from waiting another 18 min :-). Well done, @yarikoptic

yarikoptic commented 4 years ago

> It looks like the caching saved me from waiting another 18 min :-). Well done, @yarikoptic

As long as you do not upgrade anything, you will be the "fast ben" now ;) Yeah, caching helps. Also, joblib is doing the metadata reading in parallel.

Re the issue: I had not realized that. We should exclude all "dot files" by default. I will send a PR shortly. It might still interfere with e.g. cleanup operations (I remove directories which are empty after the "move" command) but that should be minor. Thanks for the report!
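
For illustration, excluding dot files while collecting input paths could look roughly like the sketch below. This is not the dandi-cli implementation: the function name `find_nwb_files` and its `exclude_dotfiles` parameter are hypothetical, and the sketch assumes that only files ending in `.nwb` are of interest. Filtering at collection time means files such as `._20171204_sample_2.nwb` never reach the metadata-loading step, so nothing fails after a long wait.

```python
import os
from pathlib import Path
from typing import Iterator


def find_nwb_files(root: str, exclude_dotfiles: bool = True) -> Iterator[Path]:
    """Yield *.nwb files under root (hypothetical helper, not dandi's API).

    With exclude_dotfiles=True, any file or directory whose name starts
    with "." is skipped. That covers the macOS "._*" files as well as
    entries such as .DS_Store.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        if exclude_dotfiles:
            # Prune hidden directories in place so os.walk does not descend into them.
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in sorted(filenames):
            if exclude_dotfiles and name.startswith("."):
                continue
            if name.endswith(".nwb"):
                yield Path(dirpath) / name
```

Because the directory list is pruned in place, hidden directories are never traversed at all, which also keeps the scan cheap on large external drives.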