Closed by nilseling 1 year ago
Works for me:
❯ ls -ah
. .. .MyPatient123.zip MyPatient123.zip
>>> [p for p in Path(".").rglob("[!.]*Patient*.zip")]
[PosixPath('MyPatient123.zip')]
Also, note that these are unix-style wildcards, not regular expressions.
That also works for me. But what about [p for p in Path(".").rglob("[!.]*MyPatient*.zip")]? That's what we have in the pre-processing script.
Oh I see, so the [!.] pattern does not match the empty sequence: it always consumes exactly one non-dot character, here the leading M of MyPatient. I'm afraid I don't know how to work around this, other than replacing the unix-style wildcards (used by pathlib's glob functions) with regular expressions as implemented in re. Is this only about the specific example data? In that case, I'd propose to change the default pattern to something like [!.]*.zip.
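The behaviour described above can be reproduced without touching the file system. The following is a minimal sketch that uses fnmatch.fnmatchcase from the standard library as a stand-in for pathlib's matching, on the assumption that both follow the same unix-style wildcard rules for these patterns; the file names and the matching helper are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical file names: a regular archive, a hidden one, and one with
# an extra character in front of "MyPatient".
names = ["MyPatient123.zip", ".MyPatient123.zip", "xMyPatient123.zip"]


def matching(pattern, candidates):
    """Return the candidates that match a unix-style wildcard pattern."""
    return [name for name in candidates if fnmatchcase(name, pattern)]


# [!.] consumes the "M", and "Patient" is still found further along.
print(matching("[!.]*Patient*.zip", names))
# ['MyPatient123.zip', 'xMyPatient123.zip']

# [!.] consumes the "M", so "MyPatient" can no longer be found.
print(matching("[!.]*MyPatient*.zip", names))
# ['xMyPatient123.zip']

# The proposed default: any .zip whose first character is not a dot.
print(matching("[!.]*.zip", names))
# ['MyPatient123.zip', 'xMyPatient123.zip']
```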
Personally, I find regular expressions more intuitive than these wildcards. In the long run it might be worth switching.
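For comparison, here is a rough sketch of what a regular-expression version could look like. It is not what the pipeline does; the names ARCHIVE_RE, is_wanted and find_archives are hypothetical. The negative lookahead (?!\.) is zero-width, so unlike [!.] it rejects a leading dot without consuming a character.

```python
import re
from pathlib import Path

# Hypothetical pattern: a name that does not start with a dot, contains
# "MyPatient", and ends in ".zip".
ARCHIVE_RE = re.compile(r"(?!\.).*MyPatient.*\.zip")


def is_wanted(name):
    """True if the whole file name matches ARCHIVE_RE."""
    return ARCHIVE_RE.fullmatch(name) is not None


def find_archives(root):
    """Recursively collect files under root whose names pass is_wanted."""
    return sorted(
        p for p in Path(root).rglob("*") if p.is_file() and is_wanted(p.name)
    )


print(is_wanted("MyPatient123.zip"))   # True
print(is_wanted(".MyPatient123.zip"))  # False
```

Note that this only looks at the file name, so a matching archive inside a hidden directory would still be returned.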
Is this already addressed? [p for p in Path(".").rglob("[!.]*MyPatient*.zip")] is no longer in the script.
Not really, we would need to play around with hidden files in the pipeline again and see which exclusion pattern really works.
Is this too crude a solution?
mcd_files = list(raw_dir.rglob("*patient*.mcd"))
mcd_files = [p for p in mcd_files if not p.name.startswith(".")]
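A slightly more defensive variant of the same glob-then-filter idea is sketched below. It also skips files that sit inside hidden directories, which the snippet above would still pick up. The helper names is_hidden and visible_matches are hypothetical and not part of the pipeline.

```python
from pathlib import Path, PurePath


def is_hidden(relative_path):
    """True if any component of the relative path starts with a dot."""
    return any(part.startswith(".") for part in PurePath(relative_path).parts)


def visible_matches(raw_dir, pattern="*patient*.mcd"):
    """Glob recursively, then drop hidden files and files in hidden folders."""
    raw_dir = Path(raw_dir)
    return sorted(
        p for p in raw_dir.rglob(pattern)
        if not is_hidden(p.relative_to(raw_dir))
    )


print(is_hidden(".patient1.mcd"))         # True
print(is_hidden(".cache/patient1.mcd"))   # True
print(is_hidden("run1/patient1.mcd"))     # False
```

Checking the path relative to raw_dir means a dot in the parent directories above raw_dir does not cause files to be excluded.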
@Milad4849 sorry I missed this. Has this been resolved now?
The commit fixes it, though it has only been tested on my end.
Ok, but you committed it to main, right? So it should be part of the pipeline. Should we close this issue now?
Yes; in hindsight it should have been committed to develop. We can close the issue now.
Closing now as I have lost track of this issue. We might need to reopen it in the future.
We need to figure out how to not process hidden files such as .Patient1.zip. The regex [!.]*Patient*.zip did not work. @jwindhager could you propose an alternative?