CaibinSh / scAR-reproducibility

Scripts to reproduce the results of scAR manuscript 'Probabilistic modeling of ambient noise in single-cell omics data'
https://doi.org/10.1101/2022.01.14.476312

ValueError on training #6

Closed. mdmanurung closed this issue 2 years ago.

mdmanurung commented 2 years ago

Dear Caibin,

I encountered an error when calling the train() method. Below is the abridged output:

scar_obj.train(epochs=400, batch_size=256,)

..Running VAE using the following param set:
......scAR mode:  scRNAseq
......count model:  binomial
......num_input_feature:  15604
......NN_layer1:  150
......NN_layer2:  100
......latent_space:  15
......dropout_prob:  0
......kld_weight:  1e-05
......lr:  0.001
......lr_step_size:  5
......lr_gamma:  0.97
===========================================
  Training.....
  0%|          | 0/400 [00:00<?, ?it/s]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [12], in <module>
----> 1 scar_obj.train(epochs=400, batch_size=256,)

File /scAR/lib/python3.8/site-packages/torch/distributions/distribution.py:55, in Distribution.__init__(self, batch_shape, event_shape, validate_args)
     53 valid = constraint.check(value)
     54 if not valid.all():
---> 55     raise ValueError(
     56         f"Expected parameter {param} "
     57         f"({type(value).__name__} of shape {tuple(value.shape)}) "
     58         f"of distribution {repr(self)} "
     59         f"to satisfy the constraint {repr(constraint)}, "
     60         f"but found invalid values:\n{value}"
     61     )
     62 if not constraint.check(getattr(self, param)).all():
     63     raise ValueError("The parameter {} has invalid values".format(param))

ValueError: Expected parameter loc (Tensor of shape (256, 15)) of distribution Normal(loc: torch.Size([256, 15]), scale: torch.Size([256, 15])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        ...,
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan]], device='cuda:0',
       grad_fn=<AddmmBackward0>)

I have also tried running scAR on my CITEseq data, and so far the model is training. Perhaps there is something wrong with my RNA count matrix, though I think that is quite unlikely.

I am happy to provide further information should you need it. Thanks in advance.

Regards, Mikhael

CaibinSh commented 2 years ago

Hi Mikhael,

Could you quickly check whether a different count model ('poisson' rather than 'binomial') runs through with your data? If it does, I will probably know what I need to modify:

scar_obj = scAR.model(raw_count=raw_counts, empty_profile=ambient_profile, scRNAseq_tech='scRNAseq', model='poisson')

If this does not solve your problem, please try a lower learning rate, for example:

scar_obj.train(epochs=400, batch_size=256, lr=1e-4)
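As a toy illustration of why a lower learning rate can turn NaNs into a stable run, here is a minimal sketch of plain gradient descent on a one-dimensional quadratic. It is not scAR's training loop: the function name `gradient_descent`, the loss f(x) = x**2, and the two learning rates are assumptions chosen only to show the effect.

```python
import math


def gradient_descent(lr, steps=2000, x0=1.0):
    """Minimise f(x) = x**2 with plain gradient descent; return the final x.

    Hypothetical helper for illustration only, unrelated to scAR's code.
    """
    x = x0
    for _ in range(steps):
        grad = 2.0 * x      # derivative of x**2
        x = x - lr * grad   # gradient descent update
    return x


# With lr=1.5 every step multiplies x by -2, so |x| grows until it overflows
# to infinity, and the next update computes inf - inf, which is NaN.
diverged = gradient_descent(lr=1.5)

# With lr=0.1 every step multiplies x by 0.8, so x shrinks towards the minimum.
converged = gradient_descent(lr=0.1)

print("lr=1.5 ends in NaN:", math.isnan(diverged))
print("lr=0.1 ends near 0:", abs(converged) < 1e-6)
```

Once a parameter is NaN, every value computed from it is NaN as well, which would be consistent with an all-NaN loc tensor like the one in the traceback. Whether that is what happened here is not confirmed in this thread.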

Best, Caibin

CaibinSh commented 2 years ago

Hi @mdmanurung,

I am wondering whether the issue persists. I have added a line to the code that may address it. If none of the suggestions fixes the issue, would you mind sharing the RNA matrix with me by email (caibin.sheng@novartis.com)?

Many thanks, Caibin

mdmanurung commented 2 years ago

Hi Caibin,

I haven't had time to try it yet. I will update the installation and try this out. Thanks!

Regards, Mikhael

mdmanurung commented 2 years ago

Hi Caibin,

The function works now. So, for scRNA-Seq data, should I use the binomial or the poisson model? And for the inference() call, should I use the same model as for training?

Just out of curiosity, if you don't mind explaining: what caused the previous error?

Regards, Mikhael

CaibinSh commented 2 years ago

Hi Mikhael,

Great that it works for you now. In my experience, 'binomial' and 'poisson' do not make a big difference. I would recommend using the default one.

During inference, the model setting does not matter in scRNAseq or CITEseq mode. It only affects the Bayes factor output, which is designed for sgRNA/identity-barcode assignment. In other words, it only matters when you analyse single-cell CRISPR screening data, where this model is used to test whether native signal is present. We recommend the default setting, the 'poisson' model.

As for the previous error, my guess is that either your input contains NaNs or NaNs are generated during training. I have added lines to deal with potential NaNs.
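To rule out the first possibility before training, the count matrix can be checked directly. The following is a minimal sketch using NumPy; the helper name `summarize_count_matrix` and the example matrix are hypothetical and not part of scAR. All-zero cells and genes are counted as well, on the assumption that they could cause a division by zero if counts are normalised somewhere in the pipeline.

```python
import numpy as np


def summarize_count_matrix(counts):
    """Count entries that commonly lead to NaNs during training.

    Hypothetical helper for illustration; `counts` is a cells x genes
    array-like object (for example a NumPy array or a pandas DataFrame).
    """
    x = np.asarray(counts, dtype=float)
    return {
        "n_nan": int(np.isnan(x).sum()),
        "n_inf": int(np.isinf(x).sum()),
        "n_negative": int((x < 0).sum()),
        # Rows or columns whose counts sum to zero (NaNs are ignored here).
        "n_empty_cells": int((np.nansum(x, axis=1) == 0).sum()),
        "n_empty_genes": int((np.nansum(x, axis=0) == 0).sum()),
    }


# Made-up 3 cells x 3 genes example with one NaN and one all-zero cell.
example_counts = np.array(
    [
        [1.0, 0.0, 3.0],
        [0.0, 0.0, 0.0],
        [2.0, np.nan, 1.0],
    ]
)
report = summarize_count_matrix(example_counts)
print(report)
```

If every entry of the report is zero, the input itself is unlikely to be the source of the NaNs, which leaves NaNs generated during training as the remaining explanation.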

Thanks very much for your valuable feedback. Best, Caibin