Closed willey2020 closed 2 years ago
Hi, sorry for the late response. I have been trying to improve scBasset for the last few months, and I updated the scBasset package with a new tutorial. Yes, scBasset can be applied to non-10x input. As long as you convert your peak-by-cell matrix into anndata format, where .var contains "chr", "start", and "end" columns, scBasset can do the preprocessing.
Thank you so much!! Will try!
Hello! Thank you again for your answer! Sorry to bother you again.
Could you give a quick suggestion about how to input non-10X data such as Buenrostro's raw data?
I followed your advice:
instead of using sc.read_10x_h5, I used sc.read_csv to convert a presaved peak matrix from Signac to anndata format. Everything looked fine until I ran into trouble in the preprocessing script, which showed:
Traceback (most recent call last):
  File "/home/gougou/scBasset/bin/scbasset_preprocess.py", line 60, in <module>
    main()
  File "/home/gougou/scBasset/bin/scbasset_preprocess.py", line 54, in main
    make_h5_sparse(ad, '%s/all_seqs.h5'%output_path, input_fasta)
  File "/home/gougou/scBasset/scbasset/utils.py", line 136, in make_h5_sparse
    m = m.tocoo().transpose().tocsr()
AttributeError: 'numpy.ndarray' object has no attribute 'tocoo'
I found that the counts in the anndata from 10X input are a sparse csr_matrix, but my anndata shows an ndarray type.
I think the example Buenrostro raw data is a non-10X version, so I guess it didn't go through sc.read_10x_h5. Could you provide any guidance on how to import that type of data into an anndata that can be accepted by your downstream pipeline? Thank you so much!
I just fixed the issue. The problem was that I needed to convert the ndarray inside the anndata into sparse matrix format. I used the function sparse.csr_matrix to convert it to a sparse matrix, and then the preprocessing works. Thank you very much again, and sorry for bothering you with this small issue.
Hi,
Glad the problem is solved. Yes, scBasset assumes anndata.X is in sparse format. I'll add a note to the README file. Thanks!
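To illustrate the sparse-format requirement, the sketch below reproduces the failure mode on a small dense array and shows one way to convert it. It uses only NumPy and SciPy. The helper name to_csr and the toy matrix are hypothetical, not part of scBasset.

```python
import numpy as np
from scipy import sparse


def to_csr(x):
    """Return x as a SciPy CSR matrix, converting dense input if needed."""
    if sparse.issparse(x):
        return x.tocsr()
    return sparse.csr_matrix(np.asarray(x))


# Toy cells-by-peaks counts, dense as they would be after loading from a CSV.
dense = np.array([[0, 1, 0], [2, 0, 0]])

# A dense ndarray has no .tocoo() method, which is what raises the AttributeError.
print(hasattr(dense, "tocoo"))  # False

x = to_csr(dense)
# The same chain of calls that appears in the traceback now runs.
m = x.tocoo().transpose().tocsr()
print(m.shape)  # (3, 2)
```

With an AnnData object named ad, the equivalent step before running the preprocessing would presumably be ad.X = sparse.csr_matrix(ad.X).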
Thank you again!
Hello! Could I ask a question about running a dataset that is not in 10X format, such as BAM files or only a peak count matrix? Is there any way to import them into scBasset? Thank you very much!