Starlitnightly / omicverse

A Python library for multi-omics analysis, including bulk, single-cell, and spatial RNA-seq.
https://starlitnightly.github.io/omicverse/
GNU General Public License v3.0

ValueError: cannot reshape array of size 87692 into shape (1993,) in ov.bulk2single.Bulk2Single #23

Closed Jessica1525809 closed 5 months ago

Jessica1525809 commented 1 year ago

When I tried to run ov.bulk2single.Bulk2Single, I got the error "ValueError: cannot reshape array of size 87692 into shape (1993,)".

Here are my code and the error:

import pandas as pd
import scanpy as sc
import omicverse as ov

bulk_data=pd.read_csv('./merged_gene_count_rpkm_by_gene.csv',index_col=0)
single_data=sc.read_h5ad('./adata.h5ad')

model=ov.bulk2single.Bulk2Single(
    bulk_data=bulk_data, single_data=single_data, celltype_key='Celltype'
)


ValueError                                Traceback (most recent call last)
Cell In[57], line 1
----> 1 model=ov.bulk2single.Bulk2Single(
      2     bulk_data=bulk_data, single_data=single_data, celltype_key='Celltype'
      3 )

File ~/.local/lib/python3.9/site-packages/omicverse/bulk2single/_bulk2single.py:39, in Bulk2Single.__init__(self, bulk_data, single_data, celltype_key, top_marker_num, ratio_num, gpu)
     37 self.celltype_key=celltype_key
     38 self.input_data=bulk2single_data_prepare(bulk_data,single_data,celltype_key)
---> 39 self.cell_target_num = data_process(self.input_data, top_marker_num, ratio_num)
     40 if gpu=='mps' and torch.backends.mps.is_available():
     41     print('Note that mps may loss will be nan, used it when torch is supported')

File ~/.local/lib/python3.9/site-packages/omicverse/bulk2single/_utils.py:237, in data_process(data, top_marker_num, ratio_num)
    234 single_cell_matrix = np.transpose(single_cell_matrix)  # (gene_num, label_num)
    236 bulk_marker = bulk_marker.values  # (gene_num, 1)
--> 237 bulk_rep = bulk_marker.reshape(bulk_marker.shape[0], )
    239 # calculate celltype ratio in each spot by NNLS
    240 ratio = nnls(single_cell_matrix, bulk_rep)[0]

ValueError: cannot reshape array of size 87692 into shape (1993,)

I would appreciate any help you can give!

Starlitnightly commented 1 year ago

Hi,

Could you show the format of your bulk data? You can print it with data.head().

Jessica1525809 commented 1 year ago

My bulk data is a data frame with genes as rows, samples as columns, and RPKM values.

[Screenshot 2023-09-12 23 22 55]

Thank you!

Starlitnightly commented 1 year ago

Hi, @Jessica1525809

You can download the reference data for comparison. Typically, raw counts are used: they are normalized with DESeq2 and log-transformed, and then the mean across samples is calculated. Have you compared your data with the reference data?
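A minimal sketch of the last two steps described above (log transform, then mean across samples), using only pandas and NumPy. The DESeq2 normalization step is not shown. The toy values, the column names, and the helper name collapse_bulk_to_mean are hypothetical and not part of omicverse; whether a single-column frame is exactly the layout Bulk2Single expects should be checked against the reference data.

```python
import numpy as np
import pandas as pd


def collapse_bulk_to_mean(bulk: pd.DataFrame) -> pd.DataFrame:
    """Log-transform a genes x samples matrix and average across samples.

    Hypothetical helper for illustration; assumes the input is already
    normalized (the DESeq2 step is out of scope here).
    """
    logged = np.log1p(bulk)                       # log(1 + x), elementwise
    mean_expr = logged.mean(axis=1)               # one value per gene
    return mean_expr.to_frame(name="bulk_mean")   # genes x 1 DataFrame


# Toy genes x samples matrix (made-up values).
toy_bulk = pd.DataFrame(
    {"sample_1": [0.0, 10.0, 100.0], "sample_2": [0.0, 30.0, 300.0]},
    index=["GeneA", "GeneB", "GeneC"],
)

single_column_bulk = collapse_bulk_to_mean(toy_bulk)
print(single_column_bulk.shape)  # two sample columns collapsed into one
```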

Jessica1525809 commented 1 year ago

I can run the reference data without any problems. I extracted the expression matrix of the single-cell data and recreated the adata object according to the tutorials, but the same error occurred. I could not find the difference between my data and the reference data. Here is my code.

[Screenshot 2023-09-13 18 51 33] [Screenshot 2023-09-13 18 52 03] [Screenshot 2023-09-13 18 52 20]

Starlitnightly commented 1 year ago

Hi, @Jessica1525809

The reference bulk data contains only one sample, whereas your bulk data has multiple sample columns.
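To illustrate why a multi-sample matrix triggers this error, here is a small NumPy sketch. It assumes the bulk matrix had 1993 marker genes and 44 sample columns, which is inferred from 87692 = 1993 × 44 and is not stated in the thread; the array contents are placeholders.

```python
import numpy as np

gene_num, n_samples = 1993, 44                 # assumed: 1993 * 44 == 87692
bulk_marker = np.zeros((gene_num, n_samples))  # placeholder values

# Same call as in the traceback: it fails when there is more than one column,
# because gene_num * n_samples elements cannot fit into gene_num slots.
try:
    bulk_marker.reshape(bulk_marker.shape[0], )
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)

# With a single column, shape (gene_num, 1), the same reshape succeeds.
one_sample = bulk_marker[:, :1]
bulk_rep = one_sample.reshape(one_sample.shape[0], )
print(bulk_rep.shape)
```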

Starlitnightly commented 1 year ago

Hi, @Jessica1525809

We have updated omicverse to version 1.5.1, in which we have optimized the parameters of Bulk2Single. You are welcome to try it again.

https://omicverse.readthedocs.io/en/latest/Tutorials-bulk2single/t_bulk2single/