nipy / heudiconv

Flexible DICOM conversion into structured directory layouts
https://heudiconv.readthedocs.io

datalad sensitive marking fixes #739

Closed bpinsard closed 6 months ago

bpinsard commented 6 months ago

Mark as sensitive only the files added in the commit, not every file matched by the globs.

Currently, all globbed files in the current tree get marked as sensitive, which could overwrite metadata that was previously removed (e.g. after defacing). This change filters the globbed files down to those in the commit where the converted files are added.
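To illustrate the filtering described above, here is a minimal sketch, not the actual heudiconv code. The function name `select_sensitive`, the example glob, and the file names are hypothetical, and the list of committed files is assumed to be supplied by the caller (for instance from the commit's file listing).

```python
from fnmatch import fnmatchcase


def select_sensitive(tree_files, patterns, committed_files):
    """Return files that match a sensitive glob AND belong to the commit.

    tree_files: every file in the working tree (relative paths).
    patterns: glob patterns describing sensitive files (hypothetical here).
    committed_files: paths touched by the conversion commit.
    """
    committed = set(committed_files)
    return [
        path
        for path in tree_files
        if path in committed
        and any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


# Hypothetical tree: sub-02 was converted and defaced earlier, so its
# metadata was removed on purpose and must not be touched again.
tree = [
    "sub-01/anat/sub-01_T1w.nii.gz",
    "sub-01/func/sub-01_task-rest_bold.json",
    "sub-02/anat/sub-02_T1w.nii.gz",
]
commit = [
    "sub-01/anat/sub-01_T1w.nii.gz",
    "sub-01/func/sub-01_task-rest_bold.json",
]
to_mark = select_sensitive(tree, ["sub-*/anat/*.nii.gz"], commit)
```

With glob matching alone, `sub-02/anat/sub-02_T1w.nii.gz` would be marked again. Intersecting with the commit leaves it alone.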

Add the metadata rather than init it, to fix an issue with reruns.

If one runs heudiconv on a branch, removes the metadata (e.g. after defacing), deletes that branch, and then reruns the same conversion on the same dataset, the new files with the same names seem to inherit the absence of metadata from the previous files (via the git-annex branch). This is unexpected: the git-annex hash should differ, so the .met file should not be picked up, but there is some git-annex black magic going on here. The result is that sensitive files do not get marked as sensitive.
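For readers less familiar with the two modes, the sketch below shows how they differ at the git-annex command line. It assumes that "init" corresponds to git-annex's `field?=value` (set only if the field has no value yet) and "add" to `field+=value` (always add the value). The field name `distribution-restrictions` and the helper `metadata_cmd` are assumptions for illustration. The sketch only builds the argument list and does not run git-annex.

```python
FIELD = "distribution-restrictions"  # assumed field name, for illustration

# Assumed mapping between the two modes and git-annex --set operators.
OPERATORS = {
    "init": "?=",  # set a default only if the field is not already set
    "add": "+=",   # add the value regardless of the field's current state
}


def metadata_cmd(paths, value="sensitive", mode="add"):
    """Build (but do not execute) a `git annex metadata` command line."""
    operator = OPERATORS[mode]
    return ["git", "annex", "metadata", "--set", f"{FIELD}{operator}{value}", *paths]


init_cmd = metadata_cmd(["sub-01/anat/sub-01_T1w.nii.gz"], mode="init")
add_cmd = metadata_cmd(["sub-01/anat/sub-01_T1w.nii.gz"], mode="add")
```

The `add` form does not depend on what git-annex believes the field's prior state to be, which is why it sidesteps the rerun behaviour described above.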

codecov[bot] commented 6 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (d3385d7) 81.97% compared to head (78db99e) 82.01%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #739      +/-   ##
==========================================
+ Coverage   81.97%   82.01%   +0.04%
==========================================
  Files          41       41
  Lines        4149     4159      +10
==========================================
+ Hits         3401     3411      +10
  Misses        748      748
```


bpinsard commented 6 months ago

I don't think it should be limited to the one added, e.g. if heudiconv is run with overwrite (and the overwritten annexed files are unlocked), we want to re-mark the updated annexed files as sensitive, in case the metadata was removed on the overwritten files.

yarikoptic commented 6 months ago

I don't think it should be limited to the one added,

Here "add" is more like git add: a file that was modified also gets added and committed, so that case should be covered too.

❯ echo 123 >> 123; datalad save 123
add(ok): sub-YutaMouse20/123 (file)                                                                 
save(ok): . (dataset)                                                                               
action summary:                                                                                     
  add (ok: 1)
  save (ok: 1)
❯ echo 123 >> 123; datalad save 123
add(ok): sub-YutaMouse20/123 (file)                                                                 
save(ok): . (dataset)   

well, let's hope that with filtering on globs nothing unexpected would happen -- let's try it out!
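As a rough sketch of the point that a commit covers modified files as well as new ones, the function below parses the output of `git diff-tree --no-commit-id --name-status -r <commit>` and keeps added (A), modified (M), and type-changed (T) paths. The name `changed_paths` and the sample listing are hypothetical, and this is not necessarily how heudiconv collects the commit's files.

```python
def changed_paths(name_status_output):
    """Return paths that a commit added, modified, or type-changed.

    Expects tab-separated `--name-status` lines such as "A\tpath".
    Deleted (D) entries are skipped since there is nothing left to mark.
    """
    paths = []
    for line in name_status_output.splitlines():
        if not line.strip():
            continue
        status, *rest = line.split("\t")
        if rest and status[:1] in ("A", "M", "T"):
            paths.append(rest[-1])
    return paths


# Hypothetical listing for a rerun with overwrite.
sample = (
    "A\tsub-01/anat/sub-01_T1w.nii.gz\n"
    "M\tsub-01/func/sub-01_task-rest_bold.nii.gz\n"
    "D\tsub-01/anat/sub-01_old.json\n"
)
touched = changed_paths(sample)
```

Files overwritten during a rerun show up as modified, so they remain candidates for re-marking even though they are not new.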

github-actions[bot] commented 6 months ago

:rocket: PR was released in v1.0.2 :rocket: