BayraktarLab / cell2location

Comprehensive mapping of tissue cell architecture via integrated single cell and spatial transcriptomics (cell2location model)
https://cell2location.readthedocs.io/en/latest/
Apache License 2.0
319 stars 58 forks

RegressionModel use_gpu parameters not working #392

Open Angel030331 opened 1 day ago

Angel030331 commented 1 day ago

Please use the template below to post a question to https://discourse.scverse.org/c/ecosytem/cell2location/.

Problem

The use_gpu parameter is not accepted when training the NB regression model (RegressionModel.train) ...

Description of the data input and hyperparameters

...

DATA_PATH = '/Users/onkiwong/Desktop/Year_4/sem1/BIOF3001/Group_project/datasets/seqFISH+'
cellcount = os.path.join(DATA_PATH, 'raw_somatosensory_sc_exp.txt')
sp_data = os.path.join(DATA_PATH, 'Out_rect_locations.csv')
celltype = os.path.join(DATA_PATH, 'somatosensory_sc_labels.txt')

...

df_celltype = pd.read_csv(celltype, header=None, sep='\t')
df_celltype.columns = ['celltype']
df_celltype.index = adata_ref.obs.index
adata_ref.obs['Subset'] = df_celltype['celltype']
adata_ref.obs['Method'] = '3GEX'
adata_ref.obs['Sample'] = adata_ref.obs_names
adata_ref.obs['Sample'] = adata_ref.obs['Sample'].apply(lambda x: x[0:4])

from cell2location.utils.filtering import filter_genes
selected = filter_genes(adata_ref, cell_count_cutoff=5, cell_percentage_cutoff2=0.03, nonz_mean_cutoff=1.12)

# 5, 0.03, 1.12

# In our case, a few genes are cut

adata_ref = adata_ref[:, selected].copy()

RegressionModel.setup_anndata(adata=adata_ref, batch_key='Sample', labels_key='Subset')

os.environ["THEANO_FLAGS"] = 'device=cuda,floatX=' + 'float32' + ',force_device=True' + ',dnn.enabled=False'
from cell2location.models import RegressionModel
mod = RegressionModel(adata_ref)

# Use all data for training (validation not implemented yet, train_size=1)

mod.train(max_epochs=400, batch_size=None, train_size=1, lr=0.002, use_gpu=True)

# plot ELBO loss history during training, removing first 20 epochs from the plot

mod.plot_history(20)

Screenshot 2024-11-04 at 16 51 17

Single cell reference data: number of cells, number of cell types, number of genes

...

Single cell reference data: technology type (e.g. mix of 10X 3' and 5')

...

Spatial data: number of locations numbers, technology type (e.g. Visium, ISS, Nanostring WTA)

...

danielgchen commented 19 hours ago

It looks like the "use_gpu" parameter is deprecated in recent package updates; you can use the GPU via the "accelerator" parameter instead, e.g. mod.train(accelerator='gpu', batch_size=None)! I found that I needed to set the "batch_size" parameter to "None" in the new update in order to fully copy the data to the GPU. Otherwise it is not able to find the data and throws an error like IndexError: Dimension specified as 0 but tensor has no dimensions, but that might just be on my system.
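For anyone who has to support both older and newer environments, the suggestion above could be wrapped in a small helper along these lines. This is only a sketch: gpu_train_kwargs is a hypothetical name, and the assumption that the switch from use_gpu to accelerator lines up with scvi-tools 1.0 should be checked against the release notes for your installed version.

```python
def gpu_train_kwargs(scvi_version: str, batch_size=None) -> dict:
    """Build keyword arguments for mod.train() that request GPU training.

    ASSUMPTION (not confirmed in this thread): scvi-tools releases with
    major version >= 1 take accelerator="gpu", older ones take use_gpu=True.
    """
    major = int(scvi_version.split(".")[0])
    if major >= 1:
        kwargs = {"accelerator": "gpu"}
    else:
        kwargs = {"use_gpu": True}
    # batch_size=None asks for full-data training, as suggested above.
    kwargs["batch_size"] = batch_size
    return kwargs


# The version strings below are illustrative only.
print(gpu_train_kwargs("1.1.2"))
print(gpu_train_kwargs("0.20.3"))

# Intended use with the model from the original post (not run here):
#   from importlib.metadata import version
#   mod.train(max_epochs=400, train_size=1, lr=0.002,
#             **gpu_train_kwargs(version("scvi-tools")))
```

Keying off the installed version rather than inspecting the signature of train is deliberate, since wrapper methods that forward **kwargs would hide which argument is actually accepted further down.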

Angel030331 commented 11 hours ago

However, when I try this approach, I get a ValueError instead. Please refer to the attached screenshot.

Screenshot 2024-11-05 at 10 34 41

image
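The text of the ValueError is only in the screenshot, so its cause is not known from this thread. One quick check that may help narrow it down is whether PyTorch can see any GPU at all on the machine before passing accelerator='gpu'. The sketch below uses a hypothetical helper, detect_accelerator, and falls back to "cpu" when nothing else is found. Whether the returned string is a suitable value for mod.train(accelerator=...) in your installed version is an assumption to verify.

```python
def detect_accelerator() -> str:
    """Report which device type PyTorch can see: "gpu", "mps" or "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to train on; report CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "gpu"  # an NVIDIA CUDA device is visible
    # Apple-silicon backend; guarded because older torch builds lack it.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print("PyTorch sees:", detect_accelerator())
```

If this prints "cpu", requesting a GPU is unlikely to work regardless of which parameter name is used, and the environment (drivers, PyTorch build) would be the next thing to look at.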