allow for incomplete data ingestion, fix ypix bug, add mean image to caiman seg class

RichieHakim commented 1 year ago

This PR contains focuses on some changes to the Suite2pSegmentationExtractor class. The following changes are made:

Now allows for incomplete data ingestion. New argument (allow_incomplete_import). If any files are missing (ie F.npy is not there), it can simply ignore it an move on. Also, a new argument (warn_missing_files) toggles printing warnings for missing files.
New arguments for searching within a subdirectory specified by the plane number. This is often not how directories are structured and this toggle removes the user burden of restructuring directories if they choose not to.
Potential bug fix in the orientation of roi_masks. It seems that the code was writen to flip the roi_masks in the y direction. I'm not sure why this is the case. I removed code that ingested this way and code that undid this during saving.
Typo fix in readme.
Add mean image extraction from caiman segmentation class.

RichieHakim commented 1 year ago

Transposing the NWB summary image may also be necessary. I can add that to this PR if you agree @CodyCBakerPhD Screenshot from 2023-06-18 17-29-13

CodyCBakerPhD commented 1 year ago

@RichieHakim Sorry I haven't had the bandwidth to get to this until now - I'm personally spread across a number of projects and this corner of the ecosystem has often not gotten as much attention (unless we need to add a new format, such as the recent additions of Miniscope and Bruker)

To alleviate this, we'll be bringing in @pauladkisson who will become the key point person for reviewing, dev ops, and other miscellaneous admin responsibilities; I'll be bringing them up to speed on this repository in the coming weeks

In the meantime, I have a couple concerns about this PR being more complicated than it needs to be

We certainly want the SegmentationExtractor for Suite2P to not fail on some target output, and given that you indicate it is possible to have an absence of certain files in your output, I think the general request to enable safe loading of individual streams (potential trace files) is perfectly reasonable

We generally try to avoid try/except patterns as a style whenever possible however, and I'd say it is perfectly acceptable for the time being to simply skip the loading of the numpy file conditional on whether or not the file exists on the system; this reduces the code changes here by quite a bit while maintaining consistency of trace dictionary behavior in the current paradigm (though the very notion of the trace dictionary is something we'd eventually like to refactor out)

Potential bug fix in the orientation of roi_masks. It seems that the code was writen to flip the roi_masks in the y direction.

What I can say in general is that ROIExtractors follows image (height x width) shape convention similar to OpenCV and other image parsing libraries

Unfortunately, this often conflicts with the NWB equivalent, whose schema mandates usage of the (width x height) data arrays instead; hence, why the NWB extractor for any of these fields will also transpose the data from the source

However, given how long ago this class was implemented, and how the individual who originally implemented it is no longer active, I can't actually judge if it is correct or not

Now, if you're saying the flipping being done at https://github.com/catalystneuro/roiextractors/pull/227/files#diff-cdefaf6b12d83c6942b1dcdc3245b73ef75445fac2293e7ca046be26a77d720fL118 and https://github.com/catalystneuro/roiextractors/pull/227/files#diff-cdefaf6b12d83c6942b1dcdc3245b73ef75445fac2293e7ca046be26a77d720fL176-R207 is a bug as far as you can tell, can you do me a favor and raise a separate issue for that and remove the changes from this PR (so this PR can focus purely on the partial loading of streams +/- the Caiman summary improvement?); then we can do a deep dive using some source images as reference to confirm if it's intentional or not

Typo fix in readme.

Thanks, if you're open to making the changes as I've suggested, can you also update the CHANGELOG.md file describing the changes of this PR?

Add mean image extraction from caiman segmentation class.

Back to the question of testing; do you know if the example files on the GIN repo have these new fields that you're adding a functionality for? And if not can you share a version of your files (or a stubbed one that is < 10 MB) that you developed it for?

RichieHakim commented 1 year ago

@CodyCBakerPhD

Thanks Cody! I appreciate that you all are maintaining a pretty large amount of code and projects so I'm thankful that you were able to address this PR. Your comments were good.

There are a few changes in here so I'll go one by one

Suite2p class:

supporting incomplete ingestion: This is critical and it looks like your changes support this
Non-strict directory structure for plane numbers: This is not critical and I'm cool with leaving it as is.

CaImAn class:

Fortran ordering of reshaping: I'm 95% sure this is correct but I'd suggest some further testing with new datasets.

NWB class:

transposition of image: I can't really conclude on the code links you sent. I'm not sure. This is not data I handle myself. I just plotted the results and it clearly showed that the image was transposed.

Changelog: Will do later today.

Data files:

Back to the question of testing; do you know if the example files on the GIN repo have these new fields that you're adding a functionality for? And if not can you share a version of your files (or a stubbed one that is < 10 MB) that you developed it for?

I don't have access to personally extracted caiman files. The example caiman files on the GIN repo do not have a mean image. I think a recurring theme here is that we are unable to test changes effectively because we don't trust the GIN repo data. It might be a good idea to invest a little bit of time and rerun some example data through the different extraction pipelines (and maybe make a simple pipeline to run new tests as updates to those repos occur).

I think it might be useful to provide a little more context to how I (and others) have been using this repo. Here is the ingestion class I'm wrapping roiextractors classes with: https://github.com/RichieHakim/ROICaT/blob/dev/roicat/data_importing.py#L1429 and here is a demo on how it is being used: https://github.com/RichieHakim/ROICaT/blob/main/notebooks/jupyter/other/demo_data_importing.ipynb

Let me know if there is anything else I can help with. Thanks again!

CodyCBakerPhD commented 1 year ago

transposition of image: I can't really conclude on the code links you sent. I'm not sure. This is not data I handle myself. I just plotted the results and it clearly showed that the image was transposed.

I see; thanks for the report - given this PR already has a couple other things going for it, I say let's do a follow-up that specifically checks it for consistency across our repo

As you say, generating some up-to-date examples from each pipeline using some standard base data could be very helpful in this endeavor so I'll task @pauladkisson (who will also be improving the documentation for our project) with handling this in the coming weeks

RichieHakim commented 1 year ago

What is the status of this PR? Are we waiting on anything else before merging? Happy to help if you point me in the right direction.

CodyCBakerPhD commented 1 year ago

What is the status of this PR?

Thanks for checking in - just waiting on the cleanup of the unneeded lines pointed out in my suggestions above

pauladkisson commented 1 year ago

@RichieHakim, circling back around to this. Do you need any clarification on the suggestions Cody gave?

CodyCBakerPhD commented 1 year ago

@weiglszonja Can you confirm that most of the approved features from this request were merged in with #234? If so, this can likely be closed

weiglszonja commented 1 year ago

These are addressed in #242

Not added in #242:

readme fix (CNNM-E)
Add mean image extraction from caiman segmentation class.

CodyCBakerPhD commented 1 year ago

@weiglszonja Thanks for summarizing

Not added in https://github.com/catalystneuro/roiextractors/pull/242:

readme fix (CNNM-E) Add mean image extraction from caiman segmentation class.

Looks like the README thing is no longer a problem (fixed at some point on main)

@alessandratrapani Can you take a crack at implementing isolated fixes to the Caiman images as seen from this PR?

codecov[bot] commented 9 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 79.37%. Comparing base (0f46b2c) to head (1e2ce74).

Additional details and impacted files

[![Impacted file tree graph](https://app.codecov.io/gh/catalystneuro/roiextractors/pull/227/graphs/tree.svg?width=650&height=150&src=pr&token=UA958XVSZP&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=catalystneuro)](https://app.codecov.io/gh/catalystneuro/roiextractors/pull/227?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=catalystneuro) ```diff @@ Coverage Diff @@ ## main #227 +/- ## ========================================== + Coverage 79.33% 79.37% +0.04% ========================================== Files 39 39 Lines 3063 3069 +6 ========================================== + Hits 2430 2436 +6 Misses 633 633 ``` | [Flag](https://app.codecov.io/gh/catalystneuro/roiextractors/pull/227/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=catalystneuro) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/catalystneuro/roiextractors/pull/227/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=catalystneuro) | `79.37% <100.00%> (+0.04%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=catalystneuro#carryforward-flags-in-the-pull-request-comment) to find out more. | [Files](https://app.codecov.io/gh/catalystneuro/roiextractors/pull/227?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=catalystneuro) | Coverage Δ | | |---|---|---| | [...s/extractors/caiman/caimansegmentationextractor.py](https://app.codecov.io/gh/catalystneuro/roiextractors/pull/227?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=catalystneuro#diff-c3JjL3JvaWV4dHJhY3RvcnMvZXh0cmFjdG9ycy9jYWltYW4vY2FpbWFuc2VnbWVudGF0aW9uZXh0cmFjdG9yLnB5) | `85.18% <100.00%> (+0.87%)` | :arrow_up: |

catalystneuro / roiextractors

allow for incomplete data ingestion, fix ypix bug, add mean image to caiman seg class #227

Codecov Report