KChen-lab / Monopogen

SNV calling from single cell sequencing
GNU General Public License v3.0

Error during LDrefinement #61

Open Thapeachydude opened 2 months ago

Thapeachydude commented 2 months ago

Hi,

I'm encountering an error during the final step of the SNV calling tutorial for scRNA-Seq data. Specifically, I get:

[2024-05-21 17:09:53,243] INFO     Monopogen.py Run LD refinement ...
Rscript /cluster/work/moor/marcel_prj/Monopogen/apps/../src/LDrefinement.R  /cluster/work/moor/loop_prj_analysis/monopogen/bam/somatic/chr11.gl.filter.hc.cell.mat.gz /cluster/work/moor/loop_prj_analysis/monopogen/bam/somatic/ chr11
Error in train_x_neg[, colnames(train_x_neg) == "QS"] :
  incorrect number of dimensions
Calls: SVM_train
Execution halted
[2024-05-21 17:10:23,988] ERROR    Monopogen.py In LDrefinement step chr11 failed!
[2024-05-21 17:10:23,989] ERROR    Monopogen.py Failed! See instructions above.

Any insights into what is causing this would be much appreciated.

Many thanks and best regards : )

ValerieMarot commented 2 months ago

Hi,

I'm encountering a very similar error during the last step on my own data:

[2024-05-22 09:35:55,483] INFO     Monopogen.py Run LD refinement ...
Rscript apps/../src/LDrefinement.R  run_data/A10//somatic/chr21.gl.filter.hc.cell.mat.gz run_data/A10//somatic/ chr21
Error in test_x[, colnames(test_x) == "QS"] : 
  incorrect number of dimensions
Calls: SVM_train
Execution halted
[2024-05-22 09:35:57,114] ERROR    Monopogen.py In LDrefinement step chr21 failed!
[2024-05-22 09:35:57,114] ERROR    Monopogen.py Failed! See instructions above.

Any help would also be very much appreciated! And thank you for developing and maintaining such a nice method :)

jinzhuangdou commented 1 month ago

Hi, could you please share the file named chr*.gl.vcf.filter.DP4 with me for troubleshooting purposes? Thank you!

Thapeachydude commented 1 month ago

Hi,

the first few lines look like this:

chr11   207185  G       A       1       0.0     0.0     1.0     0.0     NA      None    0.0     None    None    None    None    -0.38   0.0
chr11   208001  C       T       1       0.0     0.0     0.0     1.0     NA      None    0.0     None    None    None    None    -0.38   0.0
chr11   288042  C       A       1       0.0     0.0     0.0     1.0     NA      None    0.0     None    None    None    None    -0.38   0.0
chr11   288115  T       C       1       0.0     0.0     0.0     1.0     0|1     None    0.0     None    None    None    None    -0.38   0.0
chr11   308290  T       C       2       0.0     0.0     0.0     2.0     1|0     0.1     0.0     None    None    None    None    -0.45   0.0

Would you have an e-mail address I can send the full file to?

jinzhuangdou commented 1 month ago

Could you try the updated version? I fixed this issue yesterday; it arose because some of the features used for SVM training are not available in your files. Let me know if you still have problems. Thanks.
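To see which fields are unavailable in a file like the one excerpted above, one option is to count placeholder values per column. The following is a minimal sketch, not part of Monopogen. The thread does not document the column names or say which columns feed the SVM, so the sketch reports 0-based column indices only, and treating both `None` and `NA` as placeholders is an assumption. The rows are copied from the chr11 excerpt posted earlier in this thread.

```python
# Rows copied from the chr11 excerpt posted earlier in this thread.
SAMPLE = """\
chr11 207185 G A 1 0.0 0.0 1.0 0.0 NA None 0.0 None None None None -0.38 0.0
chr11 208001 C T 1 0.0 0.0 0.0 1.0 NA None 0.0 None None None None -0.38 0.0
chr11 288042 C A 1 0.0 0.0 0.0 1.0 NA None 0.0 None None None None -0.38 0.0
chr11 288115 T C 1 0.0 0.0 0.0 1.0 0|1 None 0.0 None None None None -0.38 0.0
chr11 308290 T C 2 0.0 0.0 0.0 2.0 1|0 0.1 0.0 None None None None -0.45 0.0
"""

# Assumption: these two strings mark an unavailable value.
MISSING = {"None", "NA"}


def missing_per_column(text):
    """Return {column_index: number of rows holding a placeholder value}."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()  # the file is whitespace-separated
        if not fields:
            continue  # skip blank lines
        for idx, value in enumerate(fields):
            if value in MISSING:
                counts[idx] = counts.get(idx, 0) + 1
    return counts


if __name__ == "__main__":
    for idx, n in sorted(missing_per_column(SAMPLE).items()):
        print(f"column {idx}: {n} placeholder value(s)")
```

To check a whole file, read it from disk and pass its contents to `missing_per_column` in place of `SAMPLE`. A column that is a placeholder in every row is a likely candidate for a feature that cannot be used.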

Thapeachydude commented 1 month ago

I pulled the latest version and re-ran everything. LDrefinement still fails. chr*.gl.vcf.filter.DP4 now looks like this.

chr11   207185  G   A   1   0.0 0.0 1.0 0.0 NA  None    0.0 None    None    None    None    -0.38   0.0
chr11   208001  C   T   1   0.0 0.0 0.0 1.0 NA  None    0.0 None    None    None    None    -0.38   0.0
chr11   288042  C   A   1   0.0 0.0 0.0 1.0 NA  None    0.0 None    None    None    None    -0.38   0.0
chr11   288115  T   C   1   0.0 0.0 0.0 1.0 1|0 None    0.0 None    None    None    None    -0.38   0.0
chr11   308290  T   C   2   0.0 0.0 0.0 2.0 0|1 0.1 0.0 None    None    None    None    -0.45   0.0
chr11   308363  G   C   2   0.0 1.0 0.0 1.0 0|1 None    0.5 1.0 1.0 1.0 None    -0.38   0.0
chr11   309069  C   T   4   0.0 3.0 0.0 1.0 NA  None    0.75    1.0 1.0 1.0 None    -0.38   0.0
chr11   309127  A   G   3   0.0 1.0 0.0 2.0 0|1 0.8 0.33    1.0 1.0 1.0 None    -0.45   0.0
chr11   314969  T   G   5   0.0 4.0 0.0 1.0 NA  None    0.8 1.0 1.0 1.0 None    -0.38   0.0
chr11   320649  G   A   4   1.0 0.0 3.0 0.0 NA  0.02    0.25    1.0 1.0 1.0 None    -0.51   0.0

jinzhuangdou commented 1 month ago

Could you share one complete chr*.gl.vcf.filter.DP4 file with me so that I can troubleshoot? Thanks.

Thapeachydude commented 1 month ago

Hi, you can get the file for chr11 here.

aidanshoham12 commented 1 month ago

Hello, I've been having the same issue in the LD refinement module. Here is the command I ran and the error message I'm getting:

python ${path}/src/Monopogen.py somatic -a ${path}/apps -r ${path}/lab_samples/region_lst/region.lst -t 1 -i ${path}/lab_samples/outputs/sample_name -l ${path}/lab_samples/csv_files/csv_somatic.csv -s LDrefinement -g ${path}/example/hg38.fa
[2024-06-13 16:08:42,935] INFO     Monopogen.py Run LD refinement ...
Rscript /path/to/src/LDrefinement.R /path/to/somatic/chr20.gl.filter.hc.cell.mat.gz /path/to/somatic/ chr20
sh: Rscript: command not found
[2024-06-13 16:08:43,051] ERROR    Monopogen.py In LDrefinement step chr20 failed!
[2024-06-13 16:08:43,051] ERROR    Monopogen.py Failed! See instructions above.

For reference, I pulled the code from GitHub on June 6. Here is the output of head on chr*.gl.vcf.filter.DP4:

chr20 251727 C G 123 0.0 0.0 0.0 4.0 1|0 0.63 0.0 None None None None -0.56 0.0
chr20 258018 C T 77 0.0 0.0 5.0 70.0 1|1 0.07 0.0 None None None 1.0 -0.69 0.0

Let me know if you have any suggestions to get it to work. Thank you so much!
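The log in this comment contains "sh: Rscript: command not found". This means the shell could not find the Rscript executable on PATH, which is a different failure from the "incorrect number of dimensions" error reported earlier. A minimal sketch for checking this from Python before launching the step is shown below. It is not part of Monopogen, and the helper name `find_tool` is made up for illustration.

```python
import shutil


def find_tool(name):
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("Rscript")
    if path is None:
        print("Rscript not found on PATH; install R or load it into the "
              "environment before running the LDrefinement step")
    else:
        print(f"Rscript found at {path}")
```

On a cluster, R often has to be made available in the job's environment first (for example through a module system or a conda environment). How to do that depends on the site.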

Vann6 commented 3 weeks ago

I encountered the same issue with LDrefinement.R. After running mutation_block <- SNV_block(summary=meta) and svm_in <- SVM_prepare(mutation_block), either svm_in$test or svm_in$train is empty. Consequently, svm_out <- SVM_train(label = svm_in, dir = outdir, region = region) throws an error:

Error in test_x[, colnames(test_x) == "QS"] : incorrect number of dimensions

I've verified that the input files are correct. Could you refine SVM_train and LDrefinement.R to handle the case where svm_in$test or svm_in$train is empty? Below are the files I tested. Thank you!

chr4.gl.filter.hc.cell.mat.gz
chr3.gl.filter.hc.cell.mat.gz
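The guard requested here can be sketched as follows. This is a language-neutral illustration written in Python, whereas the actual script is R, and it is not the maintainers' fix. The names `svm_inputs_ready`, `run_refinement` and `train_fn`, and the dict layout with "train" and "test" keys, are assumptions chosen to mirror svm_in$train and svm_in$test.

```python
def svm_inputs_ready(svm_in):
    """Return (ok, reason); ok is False when a partition is missing or empty."""
    for part in ("train", "test"):
        rows = svm_in.get(part)
        if not rows:
            return False, f"'{part}' set is empty"
    return True, "ok"


def run_refinement(svm_in, train_fn):
    """Train only when both partitions have rows; otherwise report a skip."""
    ok, reason = svm_inputs_ready(svm_in)
    if not ok:
        # Skip cleanly instead of failing inside the training step.
        return {"status": "skipped", "reason": reason}
    return {"status": "done", "model": train_fn(svm_in)}


if __name__ == "__main__":
    # Hypothetical input: training rows exist but the test set is empty.
    print(run_refinement({"train": [[0.1, 0.2]], "test": []}, train_fn=len))
```

Returning a status instead of raising lets the caller log which region was skipped and continue with the remaining chromosomes.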