Open pandaqiuqiu opened 3 weeks ago
Hi,
Please run adata = adata.raw.to_adata() before all other steps to get raw counts in adata.X, as both sc.pp.highly_variable_genes (when using flavor='seurat_v3') and scTour under its default mode expect raw UMI counts. Please let me know if you have any further questions.
@LiQian-XC
Thanks for your fast response. After running adata = adata.raw.to_adata() at the beginning, I encounter a new error:
tnode = sct.train.Trainer(adata, loss_mode='nb', alpha_recon_lec=0.5, alpha_recon_lode=0.5)
tnode.train()

AttributeError                            Traceback (most recent call last)
Cell In[16], line 2
      1 tnode = sct.train.Trainer(adata, loss_mode='nb', alpha_recon_lec=0.5, alpha_recon_lode=0.5)
----> 2 tnode.train()

File ~/.../sctour/lib/python3.10/site-packages/sctour/train.py:258, in Trainer.train(self)
    254 def train(self):
    255     """
    256     Model training.
    257     """
--> 258     self._get_data_loaders()
    260     params = filter(lambda p: p.requires_grad, self.model.parameters())
    261     self.optimizer = torch.optim.Adam(params, lr = self.lr, weight_decay = self.wt_decay, eps = self.eps)

File ~/.../sctour/lib/python3.10/site-packages/sctour/train.py:245, in Trainer._get_data_loaders(self)
    240 """
    241 Generate Data Loaders for training and validation datasets.
    242 """
    244 train_data, val_data = split_data(self.adata, self.percent, self.val_frac)
--> 245 self.train_dataset = MakeDataset(train_data, self.loss_mode)
    246 self.val_dataset = MakeDataset(val_data, self.loss_mode)
    248 # sampler = BatchSampler(train_data.n_obs, self.batch_size, self.drop_last)
    249 # self.train_dl = DataLoader(self.train_dataset, batch_sampler = sampler)

File ~/miniconda3/envs/sctour/lib/python3.10/site-packages/sctour/data.py:99, in MakeDataset.__init__(self, adata, loss_mode)
     97     X = np.log1p(X)
     98 if sparse.issparse(X):
---> 99     X = X.A
    100 self.data = torch.tensor(X)
    101 self.library_size = self.data.sum(-1)

AttributeError: 'SparseCSRView' object has no attribute 'A'
Hi,
Can you try the following steps to see whether it works?
from scipy.sparse import csr_matrix
adata.X = csr_matrix(adata.X)
Please let me know if you have any other questions.
@LiQian-XC After running from scipy.sparse import csr_matrix and adata.X = csr_matrix(adata.X), the same issue persisted. However, I later tried adata.X = adata.X.toarray(), and that solved the problem. Thank you for your prompt response.
@LiQian-XC: we are having what I assume is an analogous issue. I was seeing the
AttributeError: 'SparseCSRView' object has no attribute 'X' error, just like above, when calling train(). The csr_matrix() solution did not work. FWIW, our code subsets the adata object right before training, and I suspect this converts it into this View class:
adataObj = adataObj[:, list(set(adataObj.var_names) - set(exclusionList))]
tnode = sct.train.Trainer(adataObj)
My guess is that something about subsetting converts the AnnData object into a view of the data, and that isn't interacting well with scTour. Do you have any debugging suggestions or tests on the AnnData object to verify that theory?
Hi,
Can you try to copy the data when subsetting (please see your example below)?
adataObj = adataObj[:, list(set(adataObj.var_names) - set(exclusionList))].copy()
I think this may address this issue and please let me know if it does not work.
Thanks for the idea. Yes, after posting I came to the same conclusion. Tests are running on the code here: https://github.com/bimberlabinternal/CellMembrane/blob/407adf4f1d998af41c1de79f257e94bfe256d0ee/inst/scripts/run_sctour.py#L31
If this is a solution, would you consider adding this kind of test directly to scTour?
Thanks for sharing your code. I will consider adding this in a new version of scTour.
Hi, Qian,
My input adata comes from an h5ad file converted from Seurat: the data of the RNA assay was added as X, the counts of the RNA assay were added as raw, and meta.data was transferred to obs.
During model training, I encountered the following error at the step involving sct.train.Trainer(). Even after adding the step adata.X <- adata.raw.X, the issue persists. Can you help me solve this problem? Thanks so much!
The related code is as follows:
adata.X
<Compressed Sparse Row sparse matrix of dtype 'float64' with 486880 stored elements and shape (5000, 1000)>
adata.raw.X
<Compressed Sparse Row sparse matrix of dtype 'float64' with 17799636 stored elements and shape (5000, 33694)>
sc.pp.calculate_qc_metrics(adata, percent_top=None, log1p=False, inplace=True)
sc.pp.highly_variable_genes(adata, flavor='seurat_v3', n_top_genes=1000, subset=True)
/.../python3.10/site-packages/scanpy/preprocessing/_highly_variable_genes.py:75: UserWarning: flavor='seurat_v3' expects raw count data, but non-integers were found.
  warnings.warn(

tnode = sct.train.Trainer(adata, loss_mode='nb', alpha_recon_lec=0.5, alpha_recon_lode=0.5)
tnode.train()
ValueError                                Traceback (most recent call last)
Cell In[36], line 1
----> 1 tnode = sct.train.Trainer(adata, loss_mode='nb', alpha_recon_lec=0.5, alpha_recon_lode=0.5)
      2 tnode.train()

File /.../python3.10/site-packages/sctour/train.py:168, in Trainer.__init__(self, adata, percent, n_latent, n_ode_hidden, n_vae_hidden, batch_norm, ode_method, step_size, alpha_recon_lec, alpha_recon_lode, alpha_kl, loss_mode, nepoch, batch_size, drop_last, lr, wt_decay, eps, random_state, val_frac, use_gpu)
    166 X = self.adata.X.data if sparse.issparse(self.adata.X) else self.adata.X
    167 if (X.min() < 0) or np.any(~np.equal(np.mod(X, 1), 0)):
--> 168     raise ValueError(
    169         f"Invalid expression matrix in .X. {self.loss_mode} mode expects raw UMI counts in .X of the AnnData."
    170     )
    172 self.n_cells = adata.n_obs
    173 self.batch_size = batch_size

ValueError: Invalid expression matrix in .X. nb mode expects raw UMI counts in .X of the AnnData.
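To see which matrix actually holds integer counts, the condition visible in the traceback above (lines 166-167 of train.py) can be applied by hand to adata.X and adata.raw.X. The sketch below mirrors that condition; the helper name looks_like_raw_counts and the toy matrices are hypothetical and not part of scTour:

```python
import numpy as np
from scipy import sparse

def looks_like_raw_counts(X):
    """Return True if X contains only non-negative, integer-valued entries."""
    # For sparse input only the stored (non-zero) values need checking.
    values = X.data if sparse.issparse(X) else np.asarray(X)
    if values.size == 0:
        return True
    return bool(values.min() >= 0 and np.all(np.mod(values, 1) == 0))

# Hypothetical toy matrices: integer counts stored as float64, and a
# log-transformed version of the same data.
raw = sparse.csr_matrix(np.array([[0, 3, 1], [2, 0, 5]], dtype=np.float64))
normalized = sparse.csr_matrix(np.log1p(raw.toarray()))

print(looks_like_raw_counts(raw))
print(looks_like_raw_counts(normalized))
```

The dtype alone is not decisive here: a float64 matrix passes as long as every stored value is a whole number, so checking the values themselves is more informative than checking the dtype shown in the matrix summary.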