Angel030331 opened 1 day ago
It looks like the "use_gpu" parameter is deprecated in recent package updates; you can use the GPU via the "accelerator" parameter, e.g.
mod.train(accelerator='gpu', batch_size=None)
I found that I needed to set the "batch_size" parameter to "None" in the new update in order to fully copy the data to the GPU, or else it is not able to find the data and throws an error like IndexError: Dimension specified as 0 but tensor has no dimensions
but this might just be on my system.
However, when I try this approach, a ValueError is raised instead. Please refer to the attached screenshot.
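For reference, the call I tried looks roughly like this (same hyperparameters as in the training code further down, only swapping use_gpu for the suggested accelerator argument; whether accelerator is accepted depends on the installed scvi-tools version):

mod.train(max_epochs=400, batch_size=None, train_size=1, lr=0.002, accelerator='gpu')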
Please use the template below to post a question to https://discourse.scverse.org/c/ecosytem/cell2location/.
Problem
The use_gpu parameter cannot be used in NB regression model training ...
N_cells_per_location and detection_alpha, batch_key for reference NB regression.

Description of the data input and hyperparameters
...
import os

DATA_PATH = '/Users/onkiwong/Desktop/Year_4/sem1/BIOF3001/Group_project/datasets/seqFISH+'
cellcount = os.path.join(DATA_PATH, 'raw_somatosensory_sc_exp.txt')
sp_data = os.path.join(DATA_PATH, 'Out_rect_locations.csv')
celltype = os.path.join(DATA_PATH, 'somatosensory_sc_labels.txt')
...
import pandas as pd

# attach cell type labels and sample metadata to the reference AnnData
df_celltype = pd.read_csv(celltype, header=None, sep='\t')
df_celltype.columns = ['celltype']
df_celltype.index = adata_ref.obs.index
adata_ref.obs['Subset'] = df_celltype['celltype']
adata_ref.obs['Method'] = '3GEX'
adata_ref.obs['Sample'] = adata_ref.obs_names
adata_ref.obs['Sample'] = adata_ref.obs['Sample'].apply(lambda x: x[0:4])
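Just to rule out a metadata issue, a quick sanity check on the two columns that setup_anndata uses below (standard pandas calls on adata_ref.obs; not part of the original script):

# optional: confirm the label and batch columns are filled in as expected
print(adata_ref.obs['Subset'].value_counts())
print(adata_ref.obs['Sample'].value_counts())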
from cell2location.utils.filtering import filter_genes
selected = filter_genes(adata_ref, cell_count_cutoff=5, cell_percentage_cutoff2=0.03, nonz_mean_cutoff=1.12)
In our case, a few genes are cut.
adata_ref = adata_ref[:, selected].copy()
from cell2location.models import RegressionModel

# prepare the AnnData object for the regression model
RegressionModel.setup_anndata(adata=adata_ref, batch_key='Sample', labels_key='Subset')

os.environ["THEANO_FLAGS"] = 'device=cuda,floatX=' + 'float32' + ',force_device=True' + ',dnn.enabled=False'

# create the regression model
mod = RegressionModel(adata_ref)
# Use all data for training (validation not implemented yet, train_size=1)
mod.train(max_epochs=400, batch_size=None, train_size=1, lr=0.002, use_gpu=True)

# plot ELBO loss history during training, removing first 20 epochs from the plot
mod.plot_history(20)
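In case it helps with reproducing: a minimal check of whether PyTorch can actually see an accelerator on this machine (plain torch calls, assuming a reasonably recent PyTorch version; not cell2location-specific):

import torch

# True if a CUDA device is visible to PyTorch
print(torch.cuda.is_available())
# True if the Apple-silicon MPS backend is available (relevant on macOS)
print(torch.backends.mps.is_available())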
Single cell reference data: number of cells, number of cell types, number of genes
...
Single cell reference data: technology type (e.g. mix of 10X 3' and 5')
...
Spatial data: number of locations, technology type (e.g. Visium, ISS, Nanostring WTA)
...