Open LustigePerson opened 5 years ago
From an api design standpoint, we try to keep AnnData non-specific to single cell. From a process standpoint, the 10x reader was implemented there and never moved. If the function was to move here, we'd probably want to rewrite it first so we wouldn't be adding the tables dependency to AnnData.
Thank you for your response. I was just wondering because all the other readers are located in AnnData and just loaded to scanpy. But I understand that this is a design decision.
I mean we can discuss this – is there a reason we don’t want the reader in here?
For me it would make sense, as I might want to read data into the anndata format without the need to load the whole scanpy package. But as I understood from @ivirshup this was a design decision.
I doubt that it was. An argument can be made that 10x is more single-cell-transcriptomics specific than anndata itself, but I’m not aware of e.g. loom being used in a different way, so …
Hey! Yes, it was a design decision: the idea was that anndata is not limited to biological omics data just as loom. scanpy, by contrast, is.
These days, I'm not opposed to making it available from anndata, though. Even if we have 20 or 30 readers, I wouldn't say we have a cluttered API.
I’d say that the only reason for a read function to be scanpy-specific is if it would create scanpy-specific conventions in the AnnData object (such as obsm['X_pca'] or so), but they don’t.
I think it would be reasonable to be doing more with 10x files (where CITE-seq gets placed). I'd also want to see if we're going to be doing stuff with the visium data, and what those files look like.
One other issue is that the current 10x readers use tables, not h5py, and I'd prefer not to add tables as a dependency here. We could rewrite them, but I don't think this is a super high priority – especially for the legacy readers.
I just opened a similar issue at Scanpy. It would be really great to have all the readers in one place -- even if it's in a standalone scio package, which would have functions other methods developers could export into their own packages.
Let’s track this in https://github.com/scverse/scverse-io/issues/5
I'd rather keep this on track as there's an open PR which fixes this (@gtca, please take a look), and the referenced issue doesn't really track a decision on where this function goes.
I was just wondering if there is a specific reason why the 10x h5 reader function is not implemented in anndata. It would be great if this format could be loaded without the need to load the whole scanpy package first. Most other readers in scanpy are just loaded from anndata.