theislab / scarches

Reference mapping for single-cell genomics
https://docs.scarches.org/en/latest/
BSD 3-Clause "New" or "Revised" License

Runtime error #40

Closed farnoush-shh closed 3 years ago

farnoush-shh commented 3 years ago

Hi,

I am using scArches to project and integrate query datasets on top of a reference. It works well up to training on the reference dataset, but training on the query dataset gives me a runtime error...

model = sca.models.SCANVI.load_query_data(
    query_adata,
    ref_path,
    freeze_dropout = True,
)
model._unlabeled_indices = np.arange(query_adata.n_obs)
model._labeled_indices = []
print("Labelled Indices: ", len(model._labeled_indices))
print("Unlabelled Indices: ", len(model._unlabeled_indices))

INFO Using data from adata.X
INFO Computing library size prior per batch
INFO Registered keys:['X', 'batch_indices', 'local_l_mean', 'local_l_var', 'labels']
INFO Successfully registered anndata object containing 1099 cells, 4102 vars, 28 batches, 88 labels, and 0 proteins. Also registered 0 extra categorical covariates and 0 extra continuous covariates.

RuntimeError Traceback (most recent call last)

in
      2     query_adata,
      3     ref_path,
----> 4     freeze_dropout = True,
      5 )
      6 model._unlabeled_indices = np.arange(query_adata.n_obs)

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scvi/core/models/archesmixin.py in load_query_data(cls, adata, reference_model, inplace_subset_query_vars, use_cuda, unfrozen, freeze_dropout, freeze_expression, freeze_decoder_first_layer, freeze_batchnorm_encoder, freeze_batchnorm_decoder, freeze_classifier)
    116             else:
    117                 dim_diff = new_ten.size()[-1] - load_ten.size()[-1]
--> 118                 fixed_ten = torch.cat([load_ten, new_ten[..., -dim_diff:]], dim=-1)
    119                 load_state_dict[key] = fixed_ten
    120

RuntimeError: Sizes of tensors must match except in dimension 1. Got 77 and 88 in dimension 0 (The offending index is 1)

while the query_adata has 73 labels (even if I select only 77 labels, I still get the same error, with 99 labels)... I'll be grateful for any help!

Best,

Cottoneyejoe95 commented 3 years ago

Hi, it seems that you are using scANVI with a query dataset that contains new cell types that the reference dataset doesn't have. You therefore have to preprocess the query dataset in the following way before you call 'load_query_data()':

query_adata.obs['orig_cell_types'] = query_adata.obs[cell_type_key].copy()
query_adata.obs[cell_type_key] = old_scanvi.unlabeled_category_

model = sca.models.SCANVI.load_query_data(
query_adata,
ref_path,
freeze_dropout = True,
)
print("Labelled Indices: ", len(model._labeled_indices))
print("Unlabelled Indices: ", len(model._unlabeled_indices))

as mentioned in this notebook: https://scarches.readthedocs.io/en/latest/scanvi_surgery_pipeline.html
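
Before relabeling, it can help to confirm which query cell types are actually missing from the reference. The following is a minimal sketch in plain Python; `unseen_labels` is a hypothetical helper, and the toy lists stand in for the `adata.obs[cell_type_key]` columns of the reference and query objects.

```python
def unseen_labels(reference_labels, query_labels):
    """Return the query labels that never occur in the reference, sorted."""
    return sorted(set(query_labels) - set(reference_labels))


# Hypothetical toy labels; in practice these would come from
# reference_adata.obs[cell_type_key] and query_adata.obs[cell_type_key].
reference_labels = ["B cell", "T cell", "NK cell", "T cell"]
query_labels = ["T cell", "Monocyte", "B cell", "Plasma cell"]

new_types = unseen_labels(reference_labels, query_labels)
print("Cell types only present in the query:", new_types)
```

If this list is non-empty, the query carries labels the reference classifier was never trained on, which is the situation described above.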

farnoush-shh commented 3 years ago

Thanks Marco. I did that before, but the results were not satisfying (the error was gone anyway)... I will try again.

farnoush-shh commented 3 years ago

Back again; this time I am using TRVAE:

trvae = sca.models.TRVAE(
    adata=reference_adata,
    condition_key=condition_key,
    conditions=reference_batch_labels,
    hidden_layer_sizes=[128,128],
)

INITIALIZING NEW NETWORK..............
Encoder Architecture:
    Input Layer in, out and cond: 4102 128 18
    Hidden Layer 1 in/out: 128 128
    Mean/Var Layer in/out: 128 10
Decoder Architecture:
    First Layer in, out and cond: 10 128 18
    Hidden Layer 1 in/out: 128 128
    Output Layer in/out: 128 4102

and this is the error that arises during training:

trvae.train(
    n_epochs=trvae_epochs,
    alpha_epoch_anneal=200,
    early_stopping_kwargs=early_stopping_kwargs
)

Trying to set attribute .obs of view, copying.
Trying to set attribute .obs of view, copying.

ValueError Traceback (most recent call last)

in
      2     n_epochs=trvae_epochs,
      3     alpha_epoch_anneal=200,
----> 4     early_stopping_kwargs=early_stopping_kwargs
      5 )

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scarches/models/trvae/trvae_model.py in train(self, n_epochs, lr, eps, **kwargs)
    281             condition_key=self.condition_key_,
    282             **kwargs)
--> 283         self.trainer.train(n_epochs, lr, eps)
    284         self.is_trained_ = True
    285

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scarches/trainers/trvae/trainer.py in train(self, n_epochs, lr, eps)
    166
    167                 # Loss Calculation
--> 168                 self.on_iteration(batch_data)
    169
    170                 # Validation of Model, Monitoring, Early Stopping

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scarches/trainers/trvae/trainer.py in on_iteration(self, batch_data)
    243
    244         # Calculate Loss depending on Trainer/Model
--> 245         self.current_loss = loss = self.loss(**batch_data)
    246         self.optimizer.zero_grad()
    247         loss.backward()

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scarches/trainers/trvae/unsupervised.py in loss(self, total_batch)
     61
     62     def loss(self, total_batch=None):
---> 63         recon_loss, kl_loss, mmd_loss = self.model(**total_batch)
     64         loss = recon_loss + self.calc_alpha_coeff()*kl_loss + mmd_loss
     65         self.iter_logs["loss"].append(loss)

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scarches/models/trvae/trvae.py in forward(self, x, batch, sizefactor)
    185         z1_mean, z1_log_var = self.encoder(x_log, batch)
    186         z1 = self.sampling(z1_mean, z1_log_var)
--> 187         outputs = self.decoder(z1, batch)
    188
    189         if self.recon_loss == "mse":

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scarches/models/trvae/modules.py in forward(self, z, batch)
    186             batch = one_hot_encoder(batch, n_cls=self.n_classes)
    187             z_cat = torch.cat((z, batch), dim=-1)
--> 188             dec_latent = self.FirstL(z_cat)
    189         else:
    190             dec_latent = self.FirstL(z)

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scarches/models/trvae/modules.py in forward(self, x)
     23             out = self.expr_L(x)
     24         else:
---> 25             expr, cond = torch.split(x, x.shape[1] - self.n_cond, dim=1)
     26             out = self.expr_L(expr) + self.cond_L(cond)
     27         return out

ValueError: too many values to unpack (expected 2)

thanks for your help..

Best,

Cottoneyejoe95 commented 3 years ago

Hi, that one was a bit trickier. It was a bug in the torch.split() call that shows up when the number of batches is larger than the latent dim. Good that you detected it! I have hopefully fixed the bug and released the new version 0.3.3. So please update your package installation and tell me if it works now.
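
To illustrate why this only fails with many batches, here is a minimal plain-Python sketch that emulates the chunking rule documented for `torch.split`: an integer argument is a chunk size (the last chunk may be smaller), whereas a list gives the exact section sizes. The helpers `split_by_size` and `split_by_sections` are hypothetical stand-ins, not scArches or PyTorch functions, and the actual change made in 0.3.3 is not shown in this thread.

```python
def split_by_size(width, split_size):
    """Chunk widths when `width` columns are cut into pieces of `split_size`.

    Mirrors the integer form of a split: equal chunks, last one may be smaller.
    """
    return [min(split_size, width - start) for start in range(0, width, split_size)]


def split_by_sections(width, sections):
    """Chunk widths when the exact section sizes are passed explicitly."""
    if sum(sections) != width:
        raise ValueError("section sizes must sum to the total width")
    return list(sections)


# Values from the network printout above: latent dim 10, 18 conditions.
latent_dim, n_cond = 10, 18
width = latent_dim + n_cond  # decoder input = latent code + one-hot condition

# Integer form, as in the traceback: split size = width - n_cond = 10.
chunks_int = split_by_size(width, width - n_cond)
# List form: exactly two sections, expression part and condition part.
chunks_list = split_by_sections(width, [width - n_cond, n_cond])

print(chunks_int)   # three chunks, so `expr, cond = ...` cannot unpack
print(chunks_list)  # two chunks
```

With 18 conditions and a latent dim of 10, the integer form yields chunks of width 10, 10 and 8, which matches the "too many values to unpack (expected 2)" error. When the number of conditions does not exceed the latent dim, the integer form happens to produce exactly two chunks, which is why the bug stays hidden for small batch counts.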

Best,

farnoush-shh commented 3 years ago

Hi Marco,

Great, now it is working! Another issue that I have had since a previous version (older than 0.3.0): it works very well as long as the number of labels (classes) is less than 20, but when I use more classes (almost 200 cell types), it fails... Could it be the dimension of the latent space?

farnoush-shh commented 3 years ago

training worked and now:

adata_latent = sc.AnnData(trvae.get_latent())
adata_latent.obs['cell_type'] = reference_adata.obs[cell_type_key].tolist()
adata_latent.obs['Patient'] = reference_adata.obs[condition_key].tolist()

TypeError Traceback (most recent call last)

in
----> 1 adata_latent = sc.AnnData(trvae.get_latent())
      2 adata_latent.obs['cell_type'] = reference_adata.obs[cell_type_key].tolist()
      3 adata_latent.obs['Patient'] = reference_adata.obs[condition_key].tolist()

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scarches/models/trvae/trvae_model.py in get_latent(self, x, c, mean)
    322             c = torch.tensor(labels, device=device)
    323
--> 324         x = torch.tensor(x, device=device)
    325
    326         latents = []

~/anaconda3/envs/epigenf2/lib/python3.7/site-packages/scipy/sparse/base.py in __len__(self)
    289     # non-zeros is more important. For now, raise an exception!
    290     def __len__(self):
--> 291         raise TypeError("sparse matrix length is ambiguous; use getnnz()"
    292                         " or shape[0]")
    293

TypeError: sparse matrix length is ambiguous; use getnnz() or shape[0]

any idea?

Best,

Cottoneyejoe95 commented 3 years ago

Yeah, possible options would be to increase the latent dim or the general architecture size, or even to increase the number of highly variable genes if possible.

Concerning your second error: did you call the remove_sparsity() function on adata before using it for trvae?

from scarches.dataset.trvae.data_handling import remove_sparsity
adata = remove_sparsity(adata)
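
The TypeError in the traceback comes from handing a SciPy sparse matrix to code that expects a dense array. As a rough sketch of the idea only, and assuming that the effect of remove_sparsity() is to replace a sparse adata.X with a dense array, the conversion can be written with duck typing. Both `to_dense` and the `FakeSparse` stand-in class are hypothetical and exist purely for illustration.

```python
class FakeSparse:
    """Hypothetical stand-in for a SciPy sparse matrix (illustration only)."""

    def __init__(self, rows):
        self._rows = rows

    def toarray(self):
        # SciPy sparse matrices expose toarray() to get a dense copy.
        return [list(row) for row in self._rows]

    def __len__(self):
        # Same message as in the traceback above.
        raise TypeError("sparse matrix length is ambiguous; use getnnz() or shape[0]")


def to_dense(matrix):
    """Return a dense version of `matrix` if it offers toarray(), else as-is."""
    if hasattr(matrix, "toarray"):
        return matrix.toarray()
    return matrix


sparse_like = FakeSparse([[0, 1], [2, 0]])
dense = to_dense(sparse_like)
print(len(dense))  # len() works once the data is dense
```

In a real session the same check would be applied to adata.X before passing the object to the model.
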
farnoush-shh commented 3 years ago

Yes, that solved the problem. It seems I forgot to run that line... many thanks.

M0hammadL commented 3 years ago

OK then, give it a star :)

farnoush-shh commented 3 years ago

OK then, give it a star :)

:) I just came back from the dentist; even getting this far was an honorable performance :)

farnoush-shh commented 3 years ago

Hi again,

Maybe there is something that I am missing! No error, but strange results. Back to the SCANVI model: I am trying to predict 77 labels, but the predictions contain only 7 labels.

[Image: screenshot, 2021-01-19 15:41:29]

I checked the whole process and restricted my labels to 8, and it worked very well, as I expected. I thought there must be some fixed parameters that lead to this result.

I will be so grateful for your help!

Update: Increasing the latent space dimension also did not work!

Cottoneyejoe95 commented 3 years ago

Hi, if you still have problems here, could you provide a print of your model architecture via print(scanvi.model)? Additionally, how many genes are you using for this experiment?

farnoush-shh commented 3 years ago

Yes, I still have this problem and will update you.

farnoush-shh commented 3 years ago

Hi,

Anndata setup with scvi-tools version 0.8.1.

Data Summary
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Data                         ┃ Count ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ Cells                        │ 95279 │
│ Vars                         │ 28713 │
│ Labels                       │   105 │
│ Batches                      │    18 │
│ Proteins                     │     0 │
│ Extra Categorical Covariates │     0 │
│ Extra Continuous Covariates  │     0 │
└──────────────────────────────┴───────┘

SCVI Data Registry
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Data          ┃ scvi-tools Location             ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ X             │ adata.X                         │
│ batch_indices │ adata.obs['_scvi_batch']        │
│ local_l_mean  │ adata.obs['_scvi_local_l_mean'] │
│ local_l_var   │ adata.obs['_scvi_local_l_var']  │
│ labels        │ adata.obs['_scvi_labels']       │
└───────────────┴─────────────────────────────────┘

and the SCANVI.model:

SCANVAE(
  (z_encoder): Encoder(
    (encoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=28731, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
        (Layer 1): Sequential(
          (0): Linear(in_features=128, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
        (Layer 2): Sequential(
          (0): Linear(in_features=128, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
        (Layer 3): Sequential(
          (0): Linear(in_features=128, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (mean_encoder): Linear(in_features=128, out_features=15, bias=True)
    (var_encoder): Linear(in_features=128, out_features=15, bias=True)
  )
  (l_encoder): Encoder(
    (encoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=28731, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (mean_encoder): Linear(in_features=128, out_features=1, bias=True)
    (var_encoder): Linear(in_features=128, out_features=1, bias=True)
  )
  (decoder): DecoderSCVI(
    (px_decoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=33, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): None
        )
        (Layer 1): Sequential(
          (0): Linear(in_features=128, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): None
        )
        (Layer 2): Sequential(
          (0): Linear(in_features=128, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): None
        )
        (Layer 3): Sequential(
          (0): Linear(in_features=128, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): None
        )
      )
    )
    (px_scale_decoder): Sequential(
      (0): Linear(in_features=128, out_features=28713, bias=True)
      (1): Softmax(dim=-1)
    )
    (px_r_decoder): Linear(in_features=128, out_features=28713, bias=True)
    (px_dropout_decoder): Linear(in_features=128, out_features=28713, bias=True)
  )
  (classifier): Classifier(
    (classifier): Sequential(
      (0): FCLayers(
        (fc_layers): Sequential(
          (Layer 0): Sequential(
            (0): Linear(in_features=15, out_features=128, bias=True)
            (1): None
            (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
            (3): ReLU()
            (4): Dropout(p=0.1, inplace=False)
          )
          (Layer 1): Sequential(
            (0): Linear(in_features=128, out_features=128, bias=True)
            (1): None
            (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
            (3): ReLU()
            (4): Dropout(p=0.1, inplace=False)
          )
          (Layer 2): Sequential(
            (0): Linear(in_features=128, out_features=128, bias=True)
            (1): None
            (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
            (3): ReLU()
            (4): Dropout(p=0.1, inplace=False)
          )
          (Layer 3): Sequential(
            (0): Linear(in_features=128, out_features=128, bias=True)
            (1): None
            (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
            (3): ReLU()
            (4): Dropout(p=0.1, inplace=False)
          )
        )
      )
      (1): Linear(in_features=128, out_features=105, bias=True)
      (2): Softmax(dim=-1)
    )
  )
  (encoder_z2_z1): Encoder(
    (encoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=120, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
        (Layer 1): Sequential(
          (0): Linear(in_features=233, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
        (Layer 2): Sequential(
          (0): Linear(in_features=233, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
        (Layer 3): Sequential(
          (0): Linear(in_features=233, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (mean_encoder): Linear(in_features=128, out_features=15, bias=True)
    (var_encoder): Linear(in_features=128, out_features=15, bias=True)
  )
  (decoder_z1_z2): Decoder(
    (decoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=120, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): None
        )
        (Layer 1): Sequential(
          (0): Linear(in_features=233, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): None
        )
        (Layer 2): Sequential(
          (0): Linear(in_features=233, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): None
        )
        (Layer 3): Sequential(
          (0): Linear(in_features=233, out_features=128, bias=True)
          (1): None
          (2): LayerNorm((128,), eps=1e-05, elementwise_affine=False)
          (3): ReLU()
          (4): None
        )
      )
    )
    (mean_decoder): Linear(in_features=128, out_features=15, bias=True)
    (var_decoder): Linear(in_features=128, out_features=15, bias=True)
  )
)

It could predict only 2 of the 105 labels:

reference_latent.obs.predictions.unique()
array(['PC1', 'vCM1.0'], dtype=object)

PS: What is your strategy for imbalanced datasets?
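
A collapse onto a few labels like this is easier to see with per-class recall than with overall accuracy. The following is a minimal plain-Python sketch; `per_class_recall` and `balanced_accuracy` are hypothetical helpers, the labels are toy data, and nothing here is taken from the scArches or scvi-tools API.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Fraction of cells of each true class that were predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy, imbalanced example: the classifier always predicts the majority class.
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
print("per-class recall:", per_class_recall(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

Here plain accuracy looks acceptable even though class "B" is never predicted, while the per-class recall and the balanced accuracy expose the problem.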

Many thanks.

Cottoneyejoe95 commented 3 years ago

Okay, first of all I would strongly suggest that you preprocess your data by filtering highly variable genes, as described in this notebook: https://scarches.readthedocs.io/en/latest/reference_building_from_scratch.html

You can test with 2000 and with 4000 genes.

It seems that you added another layer to the network. I would suggest first using the 2 hidden layers that we propose as the default, and just making the latent representation dimension higher. So for now, maybe first try it with the preprocessed dataset and the standard architecture. If that doesn't work, expand the latent dim to 20 or 30. If that still doesn't work, additionally add a 3rd hidden layer.

farnoush-shh commented 3 years ago

Thanks for your suggestions. I used highly variable genes before, with 4107 genes and 45000 cells, and tried all your suggestions, but instead of 2 labels I could only produce 10 or 12 labels. It works well when I do subclustering with up to 12 labels, but this is not what we want. Anyway, I will try again and, in case I have problems, will come back for discussion. One question still remains, about imbalanced datasets: do you have an option like "Focal loss"? With an imbalanced dataset, accuracy is not a good way to evaluate performance, I guess. And many thanks for your quick response.
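
For readers unfamiliar with the term, focal loss (Lin et al., 2017) scales the cross-entropy of each sample by (1 - p_t)^gamma, where p_t is the predicted probability of the true class, so that well-classified samples contribute less. The following is a minimal single-sample sketch in plain Python; the thread does not confirm that scArches or scvi-tools offer such an option, and the function names and example probabilities are illustrative assumptions.

```python
import math


def cross_entropy(prob_true_class):
    """Standard cross-entropy for one sample: -log(p_t)."""
    return -math.log(prob_true_class)


def focal_loss(prob_true_class, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t)."""
    if not 0.0 < prob_true_class <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -alpha * (1.0 - prob_true_class) ** gamma * math.log(prob_true_class)


# Hypothetical probabilities assigned to the true class.
easy, hard = 0.95, 0.10

for name, p in (("easy", easy), ("hard", hard)):
    print(name, "CE:", cross_entropy(p), "focal:", focal_loss(p))
```

With gamma set to 0 the focal loss reduces to ordinary cross-entropy; larger gamma values down-weight easy samples more strongly, which is why it is often suggested for imbalanced classification.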

Cottoneyejoe95 commented 3 years ago

Since this question really goes into the details of scANVI's base functionality and behavior and does not necessarily have to do with architecture surgery, I would refer you to the creators of scANVI. Maybe you should post your question in their issues section: https://github.com/YosefLab/scvi-tools

farnoush-shh commented 3 years ago

Thanks Marco. I will write to them.