Closed pcm32 closed 5 years ago
Thanks! Is “else if” valid in python 2? I don’t care too much but I still try to at least have the file parse in py2.
On Thu 11 Jul 2019 at 09:11, Pablo Moreno notifications@github.com wrote:
This PR adds a further case presented by Scanpy, where gene symbols are present in a dictionary inside ad.var['gene_symbols'], so that gene symbols get rescued when transforming to CellBrowser objects.
You can view, comment on, or merge this pull request online at:
https://github.com/maximilianh/cellBrowser/pull/118
Commit Summary
- Case for gene_symbols in var from anndata
File Changes
- M src/cbPyLib/cellbrowser/cellbrowser.py https://github.com/maximilianh/cellBrowser/pull/118/files#diff-0 (4)
Patch Links:
- https://github.com/maximilianh/cellBrowser/pull/118.patch
- https://github.com/maximilianh/cellBrowser/pull/118.diff
ooops... sorry! fixed now
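For reference on the question above: `else if` on one line is a syntax error in both Python 2 and Python 3, and the chained form is `elif`. The snippet below is only a small check of that, using made-up source strings (`BAD`, `GOOD`) and a hypothetical helper `parses`; it is not the code from this PR.

```python
# Hypothetical snippets for illustration; not the actual patch from this PR.
BAD = "if a:\n    x = 1\nelse if b:\n    x = 2\n"
GOOD = "if a:\n    x = 1\nelif b:\n    x = 2\n"


def parses(source):
    """Return True if `source` compiles as Python, False on a SyntaxError."""
    try:
        # compile() only parses the text; nothing is executed,
        # so the undefined names a and b do not matter here.
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


print("else if parses:", parses(BAD))
print("elif parses:", parses(GOOD))
```

The same check behaves the same way under a Python 2 interpreter, which is what matters for keeping the file parseable there.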
@ivirshup so it turns out that the reason we haven't come across this before is that it's not really a scanpy standard, but only the read_10X function produces it (https://github.com/theislab/scanpy/issues/385). But that's the main exchange format, so many thanks for this! I'll try to test it a little now.
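To make the case discussed here concrete, this is a rough sketch of how symbols could be picked up when a `gene_symbols` column is present. The function name `gene_id_symbol_pairs`, the fallback behaviour, and the placeholder IDs are assumptions for illustration; this is not the code in cellbrowser.py. It only relies on `"gene_symbols" in var` and `var["gene_symbols"]`, which work on a plain dict of lists as well as on a pandas DataFrame such as ad.var.

```python
def gene_id_symbol_pairs(var_names, var):
    """Pair each gene ID with its symbol (hypothetical helper).

    var_names: iterable of gene IDs (like ad.var_names).
    var: column container supporting `in` and `[]`, e.g. ad.var or a dict.
    """
    ids = list(var_names)
    if "gene_symbols" in var:
        # The case from this PR: symbols are stored next to the IDs.
        symbols = list(var["gene_symbols"])
    else:
        # Assumed fallback: no symbols available, so reuse the IDs.
        symbols = ids
    return list(zip(ids, symbols))


# Placeholder values, not real data.
example_var = {"gene_symbols": ["SymA", "SymB"]}
for gene_id, symbol in gene_id_symbol_pairs(["ID1", "ID2"], example_var):
    print(gene_id, symbol)
```

Using `elif`-style branching on the column name, rather than on how the file was read, keeps the sketch independent of which scanpy reader produced the object.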
This brings up a problem in my testing. I wasn't using a 10X file, but a simple .tsv that I prepared. I'll switch to Scanpy's pbmc 10X sample file for testing.
Hmm... scanpy's test matrix is also not a 10X file, rather an h5ad. Given how many breaking changes scanpy is going through, I wonder if this is a good idea. Don't we have a small-ish expression matrix somewhere for testing?
After a scanpy/anndata/pandas update, and the scanpy pbmc small test, my minimal test is now broken... sigh...
In the test suite for scanpy, there are a few small 10x datasets (scanpy/tests/_data/10x_data). I think these were cut down a bit, so I'm not sure they're 100% compliant with what cellranger puts out. The best option would probably be to test against something from the 10x example datasets.
I'm a little confused about what's happening here, are you saying scanpy is reading a 10x file and generating a dataframe with dictionaries in the columns?
Maybe it's a good time to add some travis or circle-ci testing?
Thanks Isaac! I’ll try that. I tried the h5ad file in the scanpy test directory and got the new NaN,NaN,NaN error when finding most variable genes.
Then I tried a 1k 10x sample from the 10x homepage and got a “cannot find GrCh38” error, as there is only a group called “matrix” in cellranger 3 h5 files. Does scanpy support cellranger 3?
I guess I should upgrade scanpy and anndata to the current master before trying again.
With v3, you shouldn't have to specify genome (you shouldn't have to specify genome for v2 either if only one genome is there). I'm able to read in a v3 h5 file with currently released versions of scanpy and anndata using sc.read_10x_h5.
Giving an example of what worked for me:
import scanpy as sc
!wget http://cf.10xgenomics.com/samples/cell-exp/3.0.2/1k_hgmm_v3/1k_hgmm_v3_filtered_feature_bc_matrix.h5
adata = sc.read_10x_h5("./1k_hgmm_v3_filtered_feature_bc_matrix.h5")
The returned adata is a view, which is weird, but otherwise this seems to work.
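The genome behaviour described in this comment (v3 files have a single top-level "matrix" group, v2 files have one group per genome) can be sketched in plain Python. The function `choose_group` and its error messages are hypothetical and are not scanpy's implementation; it takes a list of top-level group names, as one might get from listing the keys of an opened h5 file.

```python
def choose_group(top_level_groups, genome=None):
    """Pick which top-level HDF5 group to read (hypothetical sketch)."""
    groups = list(top_level_groups)
    if "matrix" in groups:
        # cellranger 3 layout: one group called "matrix".
        # This sketch ignores `genome` in that case.
        return "matrix"
    if genome is None:
        # cellranger 2 layout: one group per genome.
        if len(groups) == 1:
            return groups[0]
        raise ValueError("several genomes present, please pick one of: %s"
                         % ", ".join(groups))
    if genome not in groups:
        raise ValueError("cannot find %s, available groups: %s"
                         % (genome, ", ".join(groups)))
    return genome


print(choose_group(["matrix"]))
print(choose_group(["hg19", "mm10"], genome="mm10"))
```

Asking for a genome name that is absent from a v2-style file raises an error in this sketch, which is similar in spirit to the "cannot find" message reported earlier in the thread.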
Many thanks, I must have an older version of anndata or must have done something else wrong. Thanks for your help Isaac!