bm2-lab / scMVP


ValueError in full.sequential().get_latent() run #8

Open PaulineMoulle opened 2 years ago

PaulineMoulle commented 2 years ago

Hi, first of all, thank you for your work. I'm trying to run the model on another scRNA+scATAC-seq dataset, following the steps in the 10x_pbmc_demo.

The first steps run correctly:

trainer.model.eval()

Multi_VAE_Attention(
  (RNA_encoder): Encoder_nb_attention(
    (encoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=32285, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): None
          (3): BatchNorm1d(128, eps=0.0001, momentum=0.01, affine=True, track_running_stats=True)
          (4): LeakyReLU(negative_slope=0.01)
          (5): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (px_decoder_aux): Sequential(
      (0): Linear(in_features=32285, out_features=128, bias=True)
      (1): Linear(in_features=128, out_features=128, bias=True)
      (2): Sigmoid()
    )
    (mean_encoder): Linear(in_features=128, out_features=20, bias=True)
    (var_encoder): Linear(in_features=128, out_features=20, bias=True)
  )
  (ATAC_encoder): Encoder_nb_selfattention(
    (encoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=141480, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): None
          (3): BatchNorm1d(128, eps=0.0001, momentum=0.01, affine=True, track_running_stats=True)
          (4): LeakyReLU(negative_slope=0.01)
          (5): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (px_encoder_aux): Sequential(
      (0): Linear(in_features=141480, out_features=128, bias=True)
      (1): Linear(in_features=128, out_features=128, bias=True)
      (2): Sigmoid()
    )
    (w_q): Linear(in_features=128, out_features=128, bias=True)
    (w_k): Linear(in_features=128, out_features=128, bias=True)
    (w_v): Linear(in_features=128, out_features=128, bias=True)
    (do): Dropout(p=0.1, inplace=False)
    (layernorm): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
    (mean_encoder): Linear(in_features=128, out_features=20, bias=True)
    (var_encoder): Linear(in_features=128, out_features=20, bias=True)
  )
  (concatenter): Linear(in_features=40, out_features=20, bias=True)
  (l_encoder): Encoder_l(
    (encoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=32285, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): None
          (3): None
          (4): ReLU()
          (5): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (mean_encoder): Linear(in_features=128, out_features=1, bias=True)
    (var_encoder): Linear(in_features=128, out_features=1, bias=True)
  )
  (RNA_ATAC_encoder): Multi_Encoder_nb_SelfAttention(
    (scRNA_encoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=32285, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): None
          (3): BatchNorm1d(128, eps=0.0001, momentum=0.01, affine=True, track_running_stats=True)
          (4): LeakyReLU(negative_slope=0.01)
          (5): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (scATAC_encoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=141480, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): None
          (3): BatchNorm1d(128, eps=0.0001, momentum=0.01, affine=True, track_running_stats=True)
          (4): LeakyReLU(negative_slope=0.01)
          (5): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (RNA_encoder_aux): Sequential(
      (0): Linear(in_features=32285, out_features=128, bias=True)
      (1): Linear(in_features=128, out_features=128, bias=True)
      (2): Sigmoid()
    )
    (w_q): Linear(in_features=128, out_features=128, bias=True)
    (w_k): Linear(in_features=128, out_features=128, bias=True)
    (w_v): Linear(in_features=128, out_features=128, bias=True)
    (do): Dropout(p=0.1, inplace=False)
    (layernorm): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
    (concat): Linear(in_features=256, out_features=128, bias=True)
    (mean_encoder): Linear(in_features=128, out_features=20, bias=True)
    (var_encoder): Linear(in_features=128, out_features=20, bias=True)
  )
  (RNA_ATAC_decoder): Multi_Decoder_nb_SelfAttention(
    (scRNA_decoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=20, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): None
          (3): BatchNorm1d(128, eps=0.0001, momentum=0.01, affine=True, track_running_stats=True)
          (4): LeakyReLU(negative_slope=0.01)
          (5): None
        )
      )
    )
    (rna_scale_decoder): Sequential(
      (0): Linear(in_features=128, out_features=256, bias=True)
      (1): Linear(in_features=256, out_features=32285, bias=True)
      (2): Softmax(dim=-1)
    )
    (rna_r_decoder): Linear(in_features=128, out_features=32285, bias=True)
    (rna_dropout_decoder): Linear(in_features=128, out_features=32285, bias=True)
    (px_rna_decoder_aux): Sequential(
      (0): Linear(in_features=20, out_features=128, bias=True)
      (1): Linear(in_features=128, out_features=32285, bias=True)
      (2): Sigmoid()
    )
    (scATAC_decoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=20, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): None
          (3): BatchNorm1d(128, eps=0.0001, momentum=0.01, affine=True, track_running_stats=True)
          (4): LeakyReLU(negative_slope=0.01)
          (5): None
        )
      )
    )
    (cluster_decoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=8, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): LeakyReLU(negative_slope=0.01)
          (3): BatchNorm1d(128, eps=0.0001, momentum=0.01, affine=True, track_running_stats=True)
          (4): LeakyReLU(negative_slope=0.01)
          (5): None
        )
      )
    )
    (atac_scale_decoder): Sequential(
      (0): Linear(in_features=128, out_features=512, bias=True)
      (1): Linear(in_features=512, out_features=141480, bias=True)
      (2): Sigmoid()
    )
    (w_q): Linear(in_features=128, out_features=128, bias=True)
    (w_k): Linear(in_features=128, out_features=128, bias=True)
    (w_v): Linear(in_features=128, out_features=128, bias=True)
    (do): Dropout(p=0.01, inplace=False)
    (px_atac_decoder_aux): Sequential(
      (0): Linear(in_features=20, out_features=128, bias=True)
      (1): Linear(in_features=128, out_features=141480, bias=True)
      (2): Softmax(dim=-1)
    )
    (atac_r_decoder): Linear(in_features=128, out_features=141480, bias=True)
    (atac_dropout_decoder): Linear(in_features=128, out_features=141480, bias=True)
    (libaray_decoder): FCLayers(
      (fc_layers): Sequential(
        (Layer 0): Sequential(
          (0): Linear(in_features=20, out_features=128, bias=True)
          (1): LayerNorm((128,), eps=0.0001, elementwise_affine=True)
          (2): LeakyReLU(negative_slope=0.01)
          (3): BatchNorm1d(128, eps=0.0001, momentum=0.01, affine=True, track_running_stats=True)
          (4): LeakyReLU(negative_slope=0.01)
          (5): None
        )
      )
    )
    (libaray_rna_scale_decoder): Sequential(
      (0): Linear(in_features=128, out_features=1, bias=True)
    )
    (libaray_atac_scale_decoder): Sequential(
      (0): Linear(in_features=128, out_features=1, bias=True)
    )
  )
)

But I get the following error when running this command:

latent, latent_rna, latent_atac, cluster_gamma, cluster_index, batch_indices, labels = full.sequential().get_latent()

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_26015/3007406925.py in <module>
----> 1 latent, latent_rna, latent_atac, cluster_gamma, cluster_index, batch_indices, labels = full.sequential().get_latent()

~/prog_bio/anaconda3/envs/MYVENV/lib/python3.7/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     25         def decorate_context(*args, **kwargs):
     26             with self.clone():
---> 27                 return func(*args, **kwargs)
     28         return cast(F, decorate_context)
     29 

~/prog_bio/scMVP/scMVP/inference/multi_inference.py in get_latent(self, sample)
    279             give_mean = not sample
    280             latent_temp = self.model.sample_from_posterior_z(
--> 281                 [sample_batch_rna, sample_batch_atac], y=label, give_mean=give_mean
    282             )
    283             latent += [

~/prog_bio/scMVP/scMVP/models/multi_vae_attention.py in sample_from_posterior_z(self, x, y, give_mean)
    226         qz_rna_m, qz_rna_v, rna_z = self.RNA_encoder(x[0], None)
    227         qz_atac_m, qz_atac_v, atac_z = self.ATAC_encoder(x[1], None)
--> 228         qz_m, qz_v, z = self.RNA_ATAC_encoder(x, None)
    229         if give_mean:
    230             z = qz_m,

~/prog_bio/anaconda3/envs/MYVENV/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

~/prog_bio/scMVP/scMVP/models/modules.py in forward(self, x, *cat_list)
    904         q_m = self.mean_encoder(q)
    905         q_v = torch.exp(self.var_encoder(q)) + 1e-4
--> 906         latent = reparameterize_gaussian(q_m, q_v)
    907         return q_m, q_v, latent
    908 

~/prog_bio/scMVP/scMVP/models/modules.py in reparameterize_gaussian(mu, var)
     11 
     12 def reparameterize_gaussian(mu, var):
---> 13     return Normal(mu, var.sqrt()).rsample()
     14 
     15 

~/prog_bio/anaconda3/envs/MYVENV/lib/python3.7/site-packages/torch/distributions/normal.py in __init__(self, loc, scale, validate_args)
     48         else:
     49             batch_shape = self.loc.size()
---> 50         super(Normal, self).__init__(batch_shape, validate_args=validate_args)
     51 
     52     def expand(self, batch_shape, _instance=None):

~/prog_bio/anaconda3/envs/MYVENV/lib/python3.7/site-packages/torch/distributions/distribution.py in __init__(self, batch_shape, event_shape, validate_args)
     54                 if not valid.all():
     55                     raise ValueError(
---> 56                         f"Expected parameter {param} "
     57                         f"({type(value).__name__} of shape {tuple(value.shape)}) "
     58                         f"of distribution {repr(self)} "

ValueError: Expected parameter loc (Tensor of shape (64, 20)) of distribution Normal(loc: torch.Size([64, 20]), scale: torch.Size([64, 20])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        ...,
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan]])

I would greatly appreciate any advice on how to handle this - many thanks in advance!

adamtongji commented 2 years ago

Hi, the step full.sequential().get_latent() retrieves the latent layer after training on the joint-seq dataset.

I am not sure whether the RNA+ATAC dataset loading step and the training step ran correctly. The final ValueError might indicate that one modality (RNA or ATAC) is missing from the AnnData-format dataset parameter.

Can you check the input step?
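
For example, a minimal check along these lines (a sketch; the dataset object, dataset.X for RNA, and dataset.atac_expression for ATAC follow the attribute names used later in this thread) would confirm that both modalities were actually loaded and describe the same cells:

# Sketch: verify both modalities are present and have matching cell counts.
rna = dataset.X                 # RNA count matrix (cells x genes), assumed attribute
atac = dataset.atac_expression  # ATAC count matrix (cells x peaks), assumed attribute

for name, mat in [("RNA", rna), ("ATAC", atac)]:
    assert mat is not None, f"{name} matrix is missing"
    assert mat.shape[0] > 0 and mat.shape[1] > 0, f"{name} matrix is empty: {mat.shape}"

# Both modalities must cover the same number of cells.
assert rna.shape[0] == atac.shape[0], "RNA and ATAC cell counts differ"
print("RNA:", rna.shape, "ATAC:", atac.shape)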

LiuJJ0327 commented 2 years ago

Hi, I got a similar ValueError when following the scripts in 10x_pbmc_demo.ipynb. The error happens in trainer.train(n_epochs=15, lr=lr). Do you have any idea how to fix it?

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/jliu25/scMVP/scMVP/inference/trainer.py", line 159, in train
    loss = self.loss(*tensors_list)
  File "/data/jliu25/scMVP/scMVP/inference/multi_inference.py", line 515, in loss
    sample_batch_X, sample_batch_Y, local_l_mean, local_l_var, batch_index, batch_index
  File "/data/jliu25/anaconda3/envs/py37/lib/python3.7/site-packages/torch-1.12.0-py3.7-linux-x86_64.egg/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/jliu25/scMVP/scMVP/models/multi_vae_attention.py", line 508, in forward
    outputs = self.inference(x, batch_index, y, local_l_mean, local_l_var, update=False)
  File "/data/jliu25/scMVP/scMVP/models/multi_vae_attention.py", line 389, in inference
    qz_m, qz_v, z = self.RNA_ATAC_encoder([x_rna, x_atac], batch_index)
  File "/data/jliu25/anaconda3/envs/py37/lib/python3.7/site-packages/torch-1.12.0-py3.7-linux-x86_64.egg/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/jliu25/scMVP/scMVP/models/modules.py", line 906, in forward
    latent = reparameterize_gaussian(q_m, q_v)
  File "/data/jliu25/scMVP/scMVP/models/modules.py", line 13, in reparameterize_gaussian
    return Normal(mu, var.sqrt()).rsample()
  File "/data/jliu25/anaconda3/envs/py37/lib/python3.7/site-packages/torch-1.12.0-py3.7-linux-x86_64.egg/torch/distributions/normal.py", line 54, in __init__
    super(Normal, self).__init__(batch_shape, validate_args=validate_args)
  File "/data/jliu25/anaconda3/envs/py37/lib/python3.7/site-packages/torch-1.12.0-py3.7-linux-x86_64.egg/torch/distributions/distribution.py", line 56, in __init__
    f"Expected parameter {param} "
ValueError: Expected parameter loc (Tensor of shape (64, 20)) of distribution Normal(loc: torch.Size([64, 20]), scale: torch.Size([64, 20])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        ...,
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan]], device='cuda:0',
       grad_fn=<AddmmBackward0>)
adamtongji commented 2 years ago

@LiuJJ0327 Hi, thank you for your report! I reran the 10x pbmc demo on CUDA10 and CUDA11 servers, but could not reproduce the error.

Could you first check the input data, using dataset.atac_expression for ATAC and dataset.X for RNA?
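
For example, a value-level check like this (a sketch that only assumes the dataset.X and dataset.atac_expression attributes mentioned above) can reveal NaN/Inf entries or all-zero cells, either of which could plausibly lead to the NaN latent values shown in the traceback:

import numpy as np
import scipy.sparse as sp

def to_dense(mat):
    # Convert sparse matrices to dense arrays; leave dense arrays untouched.
    # Note: densifying the full ATAC matrix can use a lot of memory.
    return mat.toarray() if sp.issparse(mat) else np.asarray(mat)

for name, mat in [("RNA", dataset.X), ("ATAC", dataset.atac_expression)]:
    arr = to_dense(mat)
    print(name,
          "NaN:", bool(np.isnan(arr).any()),
          "Inf:", bool(np.isinf(arr).any()),
          "negative:", bool((arr < 0).any()),
          "all-zero cells:", int((arr.sum(axis=1) == 0).sum()))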

Also, the 10x pbmc demo requires ~30 GB of memory and 2 GB of GPU memory on our server. Does the task exceed the resource limits on your computer?
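
For reference, a quick way to check the available resources (a sketch using standard torch.cuda and psutil calls; psutil is a third-party package assumed to be installed):

import torch
import psutil  # third-party; assumed installed

# Report total system RAM and, if a GPU is visible, its total and currently allocated memory.
print("System RAM (GB): %.1f" % (psutil.virtual_memory().total / 1e9))
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("GPU: %s, total memory (GB): %.1f" % (props.name, props.total_memory / 1e9))
    print("Allocated (GB): %.2f" % (torch.cuda.memory_allocated(0) / 1e9))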