jkobject / scDataLoader

a dataloader to work with large single cell datasets from lamindb
https://www.jkobject.com/scDataLoader/
GNU General Public License v3.0

Fix _index and HGNC symbol indexing #7

Closed mossjacob closed 4 days ago

mossjacob commented 1 week ago

Summary :memo:

Fixes two one-line bugs: one in the index key extraction of an AnnData store and one in the biomart processor.

Details

jkobject commented 6 days ago

Hello @mossjacob,

Thanks for the update! Just one question about the mapped.py change: can you explain what it does? I took this file from lamindb's mapped function.

Also, there have been quite a few updates to lamindb recently. If you want to update this file from their version and make a PR, that would be super useful!
https://github.com/laminlabs/lamindb/blame/2e3ebeac61608f1a83f50583890db23bfec595d7/lamindb/core/_mapped_collection.py#L55

Best,

mossjacob commented 5 days ago

Hello @jkobject,

I think there was a bug in that line: the index of var couldn't be accessed as written. I did not realise that this file was copied from lamin. How come you don't just import MappedCollection from lamindb? I believe lamindb is a dependency of this project?
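
For context, here is a minimal sketch of the kind of lookup involved. It is not the code from mapped.py. It assumes only the AnnData on-disk convention that the `var` group carries an `_index` attribute naming the dataset that holds the index. `FakeGroup` and `read_var_index` are hypothetical names, and the group is a plain-Python stand-in for an h5py or zarr group so the example runs without any storage backend.

```python
# Hypothetical stand-in for an h5py/zarr group: child datasets are dict
# entries, and metadata lives in a separate `attrs` mapping.
class FakeGroup(dict):
    def __init__(self, children, attrs):
        super().__init__(children)
        self.attrs = attrs


def read_var_index(storage):
    """Return the var index by following the `_index` attribute.

    Assumption: `storage["var"]` is a group whose `_index` attribute holds
    the name of the child dataset that stores the index values.
    """
    var = storage["var"]
    index_key = var.attrs["_index"]  # the key is stored as metadata, not hard-coded
    return list(var[index_key])


# Toy store with made-up gene identifiers.
storage = {
    "var": FakeGroup(
        {"gene_ids": ["ENSG01", "ENSG02"], "symbol": ["A", "B"]},
        {"_index": "gene_ids"},
    )
}
var_index = read_var_index(storage)
```

The point of the sketch is that the index column name is read from the group's attributes first, then used to fetch the values, rather than assuming a fixed column name.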

Best, Jacob

jkobject commented 5 days ago

Initially I had updates that were specific to scdataloader. They have all since been added to lamin, but I plan to update this part again soon, so I was keeping it like this.

mossjacob commented 5 days ago

Hi,

I've updated the PR to use the latest Lamin mapped collection and separately added the check_aligned_vars function to the scDataLoader Dataset. How does it look to you?
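
As a rough illustration of what an alignment check of this kind might do, here is a hypothetical standalone sketch. It is not the implementation in the PR: the signature (a list of per-dataset variable name lists) and the strict same-order comparison are assumptions made for the example.

```python
def check_aligned_vars(var_indices):
    """Return True if every dataset lists the same variables in the same order.

    Assumption: `var_indices` is a list with one sequence of variable
    (gene) names per dataset. An empty list counts as aligned.
    """
    if not var_indices:
        return True
    reference = list(var_indices[0])
    # Compare every other dataset against the first one.
    return all(list(names) == reference for names in var_indices[1:])


# Made-up gene identifiers for illustration only.
aligned = check_aligned_vars([["g1", "g2", "g3"], ["g1", "g2", "g3"]])
misordered = check_aligned_vars([["g1", "g2", "g3"], ["g2", "g1", "g3"]])
```

A check like this is order-sensitive on purpose: two datasets with the same genes in a different order would still produce misaligned columns when stacked into one batch.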
