mat1 and mat2 shapes cannot be multiplied

Karobben commented 1 year ago

Dear author, I ran into the error below. Is there any parameter that could bypass it?

Thanks

Data Loaded with the Following Configurations:
Source data: rna    Preprocess: Standard    Shape [115941, 2000]
Target data: atac   Preprocess: TFIDF   Shape [36145, 32376]
======= Training Start =======
Traceback (most recent call last):
  File "/mnt/Data/PopOS/Data_Ana/XinLi/../../NGS/scBridge/main.py", line 137, in <module>
    main(args)
  File "/mnt/Data/PopOS/Data_Ana/XinLi/../../NGS/scBridge/main.py", line 42, in main
    preds, prob_feat, prob_logit = net.run(
  File "/mnt/Data/PopOS/NGS/scBridge/model_utils.py", line 63, in run
    target_h = self.encoder(target_x)
  File "/mnt/Data/PopOS/miniconda/envs/scBridge/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/Data/PopOS/miniconda/envs/scBridge/lib/python3.10/site-packages/torch/nn/modules/container.py", line 204, in forward
    input = module(input)
  File "/mnt/Data/PopOS/miniconda/envs/scBridge/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/Data/PopOS/miniconda/envs/scBridge/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x32376 and 2000x256)

Karobben commented 1 year ago

So, I figured that one out, but got a new error:

======= Training Start =======
Traceback (most recent call last):
  File "/mnt/Data/PopOS/Data_Ana/XinLi/../../NGS/scBridge/main.py", line 137, in <module>
    main(args)
  File "/mnt/Data/PopOS/Data_Ana/XinLi/../../NGS/scBridge/main.py", line 42, in main
    preds, prob_feat, prob_logit = net.run(
  File "/mnt/Data/PopOS/NGS/scBridge/model_utils.py", line 112, in run
    similarity, preds = feature_prototype_similarity(
  File "/mnt/Data/PopOS/NGS/scBridge/model_utils.py", line 197, in feature_prototype_similarity
    similarity = cosine_similarity(target_feature, source_prototypes)
  File "/mnt/Data/PopOS/miniconda/envs/scBridge/lib/python3.10/site-packages/sklearn/metrics/pairwise.py", line 1377, in cosine_similarity
    X, Y = check_pairwise_arrays(X, Y)
  File "/mnt/Data/PopOS/miniconda/envs/scBridge/lib/python3.10/site-packages/sklearn/metrics/pairwise.py", line 155, in check_pairwise_arrays
    X = check_array(
  File "/mnt/Data/PopOS/miniconda/envs/scBridge/lib/python3.10/site-packages/sklearn/utils/validation.py", line 899, in check_array
    _assert_all_finite(
  File "/mnt/Data/PopOS/miniconda/envs/scBridge/lib/python3.10/site-packages/sklearn/utils/validation.py", line 146, in _assert_all_finite
    raise ValueError(msg_err)
ValueError: Input contains NaN.

YH-Zheng commented 1 year ago

@Yunfan-Li I've encountered the same issue. Is there a solution yet?

RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x606219 and 36326x256)

Yunfan-Li commented 1 year ago

@Karobben @YH-Zheng Sorry for the late reply. scBridge accepts the gene expression matrix of scRNA-seq data and the gene activity matrix of scATAC-seq data as the inputs. Common genes need to be selected before feeding into the model.
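
For context, the message `(512x32376 and 2000x256)` says that a batch of 512 target cells with 32376 features reached a linear layer whose weight expects 2000 input features, which is the column count of the source data in the log above. A minimal sketch of a pre-flight check that catches this before training starts follows; `check_feature_dims` is a hypothetical helper and not part of scBridge, and the arrays are toy stand-ins for the real matrices.

```python
import numpy as np


def check_feature_dims(source_x, target_x):
    """Raise early if the two matrices differ in feature (column) count.

    Hypothetical helper for illustration; not part of scBridge.
    """
    n_source = source_x.shape[1]
    n_target = target_x.shape[1]
    if n_source != n_target:
        raise ValueError(
            f"feature mismatch: source has {n_source} columns, "
            f"target has {n_target}"
        )
    return n_source


# Toy stand-ins with the column counts from the traceback above.
source_x = np.zeros((4, 2000), dtype=np.float32)
target_x = np.zeros((3, 32376), dtype=np.float32)

try:
    check_feature_dims(source_x, target_x)
    message = ""
except ValueError as err:
    message = str(err)

print(message)
```

Running a check like this right after loading the data turns the failure into a readable message instead of a matmul error deep inside the encoder.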

Yunfan-Li commented 1 year ago

> similarity = cosine_similarity(target_feature, source_prototypes)

@Karobben Hi, could you check whether it is target_feature or source_prototypes that contains NaN?
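
One way to run this check is sketched below. It assumes both inputs are 2-D NumPy arrays (a PyTorch tensor would first need `.detach().cpu().numpy()`); `nan_report` is a hypothetical helper, and the two arrays are toy values rather than real model outputs.

```python
import numpy as np


def nan_report(name, array):
    """Count NaN entries in a 2-D array and list the rows that contain them.

    Hypothetical helper for illustration; not part of scBridge.
    """
    array = np.asarray(array, dtype=float)
    nan_mask = np.isnan(array)
    # Indices of rows with at least one NaN entry.
    rows = np.flatnonzero(nan_mask.any(axis=1))
    return {"name": name, "n_nan": int(nan_mask.sum()), "rows": rows.tolist()}


# Toy stand-ins for the two arguments passed to cosine_similarity.
target_feature = np.array([[0.1, 0.2], [np.nan, 0.3]])
source_prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])

reports = [
    nan_report("target_feature", target_feature),
    nan_report("source_prototypes", source_prototypes),
]
for report in reports:
    print(report)
```

Placing such a call just before the `cosine_similarity` line in `model_utils.py` would show which of the two arrays is affected and which rows (cells or prototypes) are involved.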

YH-Zheng commented 1 year ago

> @Karobben @YH-Zheng Sorry for the late reply. scBridge accepts the gene expression matrix of scRNA-seq data and the gene activity matrix of scATAC-seq data as the inputs. Common genes need to be selected before feeding into the model.

@Yunfan-Li What does "common genes" mean? I didn't find any tutorial that mentions this step. How should I perform common gene selection?

Yunfan-Li commented 1 year ago

@YH-Zheng If you currently have the peak matrix, you first need to transform it into a gene activity matrix using a package such as Signac. After that, common gene selection can be done by subsetting the scRNA-seq gene count matrix and the scATAC-seq gene activity matrix to the same set of genes.
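
The subsetting step could look roughly like the sketch below. It is not taken from the scBridge code: `select_common_genes` is a hypothetical helper, it assumes each matrix is a dense cells-by-genes NumPy array with a matching array of gene names, and the gene names and counts are toy values.

```python
import numpy as np


def select_common_genes(rna_x, rna_genes, atac_x, atac_genes):
    """Subset both cells-by-genes matrices to their shared genes.

    Hypothetical helper for illustration; not part of scBridge.
    Columns come back in the same (sorted) gene order for both matrices.
    If a gene name is duplicated, only its first occurrence is used.
    """
    common, rna_idx, atac_idx = np.intersect1d(
        rna_genes, atac_genes, return_indices=True
    )
    return rna_x[:, rna_idx], atac_x[:, atac_idx], common


# Toy data: 2 RNA cells x 4 genes, 3 ATAC cells x 3 genes.
rna_genes = np.array(["GATA1", "CD3E", "MS4A1", "LYZ"])
atac_genes = np.array(["LYZ", "CD3E", "NKG7"])
rna_x = np.arange(8).reshape(2, 4)
atac_x = np.arange(9).reshape(3, 3)

rna_sub, atac_sub, genes = select_common_genes(
    rna_x, rna_genes, atac_x, atac_genes
)
print(genes)                          # shared gene names, sorted
print(rna_sub.shape, atac_sub.shape)  # same number of columns
```

After this step both matrices have one column per shared gene in the same order, so the encoder sees the same input dimension for source and target.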

YH-Zheng commented 1 year ago

@Yunfan-Li I roughly understand. Could you add this step to the tutorial? The current tutorial is vague about it, and it does not explain that the scRNA data needs to be a count matrix (it seems to be re-normalized).