BodenmillerGroup / ImcSegmentationPipeline

A pixel classification based multiplexed image segmentation pipeline
https://bodenmillergroup.github.io/ImcSegmentationPipeline/
MIT License

Avoid processing of hidden files #111

Closed nilseling closed 1 year ago

nilseling commented 1 year ago

We need to figure out how to not process hidden files such as .Patient1.zip. The regex [!.]*Patient*.zip did not work. @jwindhager could you propose an alternative?

jwindhager commented 1 year ago

Works for me:

❯ ls -ah
.  ..  .MyPatient123.zip  MyPatient123.zip
>>> from pathlib import Path
>>> [p for p in Path(".").rglob("[!.]*Patient*.zip")]
[PosixPath('MyPatient123.zip')]

Also, note that these are unix-style wildcards, not regular expressions.

nilseling commented 1 year ago

That also works for me. But what about [p for p in Path(".").rglob("[!.]*MyPatient*.zip")]? That's what we have in the pre-processing script.

jwindhager commented 1 year ago

Oh I see, so the [!.] pattern does not match the empty sequence: it has to consume exactly one non-dot character, so MyPatient123.zip is rejected because nothing precedes MyPatient. I'm afraid I don't know how to work around this, other than replacing the unix-style wildcards (used by pathlib's glob functions) with regular expressions as implemented in re. Is this only about the specific example data? In that case, I'd propose to change the default pattern to something like [!.]*.zip.
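To make the mismatch concrete, here is a minimal sketch that checks the file names as plain strings. It assumes that pathlib's glob applies the same wildcard rules as fnmatch to a single path component; the third file name and the regular expression are illustrative and not taken from the pipeline.

```python
import re
from fnmatch import fnmatchcase

# Illustrative file names; only the first two appear in the listing above.
names = [".MyPatient123.zip", "MyPatient123.zip", "xMyPatient123.zip"]

# "[!.]" consumes exactly one non-dot character, so "MyPatient" can only
# match from the second character of the name onwards.
glob_pattern = "[!.]*MyPatient*.zip"
glob_hits = [n for n in names if fnmatchcase(n, glob_pattern)]

# One possible regex for the intent: the name does not start with a dot,
# contains "MyPatient", and ends in ".zip".
regex = re.compile(r"(?!\.).*MyPatient.*\.zip")
regex_hits = [n for n in names if regex.fullmatch(n)]

print(glob_hits)   # ['xMyPatient123.zip']
print(regex_hits)  # ['MyPatient123.zip', 'xMyPatient123.zip']
```

The negative lookahead (?!\.) consumes no characters, which is why the regex accepts a name that begins directly with MyPatient while the wildcard pattern does not.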

nilseling commented 1 year ago

Personally, I find regular expressions more intuitive than these wildcards. In the long run it might be worth switching.

Milad4849 commented 1 year ago

Is this already addressed? [p for p in Path(".").rglob("[!.]*MyPatient*.zip")] is no longer in the script.

nilseling commented 1 year ago

Not really, we would need to play around with hidden files in the pipeline again and see which exclusion pattern really works.

Milad4849 commented 1 year ago

Is this too crude of a solution?

mcd_files = list(raw_dir.rglob("*patient*.mcd"))
mcd_files = [p for p in mcd_files if not p.stem.startswith(".")]
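A slightly more general variant of the same idea, as a sketch only: the helper name find_visible_files is hypothetical and not part of the pipeline. It additionally skips files that sit inside hidden directories, which the stem check alone would still return.

```python
from pathlib import Path


def find_visible_files(root: Path, pattern: str) -> list[Path]:
    """Recursively glob under root, skipping hidden files and hidden directories.

    Hypothetical helper for illustration; "hidden" here means any path
    component below root that starts with a dot.
    """
    return sorted(
        p
        for p in root.rglob(pattern)
        if not any(part.startswith(".") for part in p.relative_to(root).parts)
    )
```

Usage would mirror the snippet above, for example mcd_files = find_visible_files(raw_dir, "*patient*.mcd").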

nilseling commented 1 year ago

@Milad4849 sorry I missed this. Has this been resolved now?

Milad4849 commented 1 year ago

The commit fixes it, only tested on my end.

nilseling commented 1 year ago

Ok, but you committed it to main, right? So it should be part of the pipeline. Should we close this issue now?

Milad4849 commented 1 year ago

Yes, in hindsight it should have been committed to develop. Can close the issue now.

nilseling commented 1 year ago

Closing now as I have lost track of this issue. We might need to reopen it in the future.