I'm trying to train topic models on gene expression and ATAC data.
Even with a GPU, I'm finding this very slow, particularly for the ATAC data. For this I've slimmed the data down to ~10K peaks from 50K cells, but ideally I would like to use closer to 100K peaks.
The tutorial suggests caching data to disk, but
model.write_ondisk_dataset(train, dirname = './....') is taking several hours.
Equally, model.get_learning_rate_bounds is taking ~5 hours.
That's before we even get to .fit().
The output of import torch; torch.cuda.is_available()
is True.
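For reference, a slightly fuller version of that check could look like the sketch below. It assumes PyTorch is installed, and describe_torch_device is a made-up helper name for illustration, not part of any library. It only confirms that PyTorch can see a GPU; it does not show that the topic model is actually running on it.

```python
def describe_torch_device():
    """Hypothetical helper: report what PyTorch can see on this machine."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment
        return {"torch_installed": False, "cuda_available": False, "device_name": None}

    available = torch.cuda.is_available()
    # Only query the device name when a CUDA device is actually visible
    name = torch.cuda.get_device_name(0) if available else None
    return {"torch_installed": True, "cuda_available": available, "device_name": name}


print(describe_torch_device())
```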
Is this expected behaviour?
Do you have suggestions for speedup?