scverse / squidpy

Spatial Single Cell Analysis in Python
https://squidpy.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
400 stars 70 forks source link

fixed reading 10x formatted visium mtx files #803

Closed LinearParadox closed 4 months ago

LinearParadox commented 4 months ago

Description

Replaced Anndata's read_mtx function with scanpys native read_10x_mtx function for reading mtx formatted visium experiments. This solves issues with matching the spot positions to barcodes, as well as missing feature names and barcodes, and proper matrix orientation.

How has this been tested?

all tests ran locally

Closes

closes #802

codecov-commenter commented 4 months ago

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (4ea8e61) 69.83% compared to head (ce638da) 69.83%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #803 +/- ## ======================================= Coverage 69.83% 69.83% ======================================= Files 39 39 Lines 5497 5497 Branches 1034 1034 ======================================= Hits 3839 3839 Misses 1363 1363 Partials 295 295 ``` | [Files](https://app.codecov.io/gh/scverse/squidpy/pull/803?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=scverse) | Coverage Δ | | |---|---|---| | [src/squidpy/read/\_utils.py](https://app.codecov.io/gh/scverse/squidpy/pull/803?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=scverse#diff-c3JjL3NxdWlkcHkvcmVhZC9fdXRpbHMucHk=) | `75.00% <50.00%> (ø)` | |
giovp commented 4 months ago

thank you @LinearParadox ! I wonder if it's related to #797 , @Lem-P could you try out this solution and see if it fixes the issue?

LinearParadox commented 4 months ago

I think it's likely a different issue. When squidpy would read the mtx file, the object would have no gene / barcode names, and the failure would likely be noticed at qc (as you would have only a single peak in spot counts, since they'd be all NA). Do you know if the current test datasets in squidpy have the mtx files floating around somewhere? It would be nice to add some tests to make 100% sure the mtx reading is equivalent with reading h5 files. I'm more than happy to handle writing those!

giovp commented 4 months ago

hi @LinearParadox , you are right, re-reading the issue I think it's due to the header being set wrong.

regarding tests, unfortunately we don;'t have the mtx file in thedatasets, and we don't really have a test suite for readers in squidpy. For spatialdata, we have a robust and rather compute expensive pipeline that ensures that datasets can always be read by the readers and supported by spatialdata objects (set up by @LucaMarconato , cc). I think it's outside the scope of squidpy to set something up like that as we are hoping that users will transition to spatialdata eventually. You can find the (tested) datasets here: https://github.com/scverse/spatialdata-notebooks/tree/main/datasets . I will merge this once CI passes

giovp commented 4 months ago

maybe @LinearParadox can you add a release note? you can add it in the dev release for now (in docs/release_notes) and I will move it to whatever version we do next. Thank you!

LinearParadox commented 4 months ago

Should be done! Let me know if you need me to add more detail or if there's anything else I should modify/add!