HKU-BAL / ClairS-TO

ClairS-TO - a deep-learning method for tumor-only somatic variant calling
BSD 3-Clause "New" or "Revised" License

Value Error in Step 2-3 CALL_VARIANTS #8

Closed: oscargamarra closed this issue 2 months ago

oscargamarra commented 2 months ago

Hi, I am using ClairS-TO to call SNVs in Illumina (ilmn) tumor data. In step 2-3 (Pileup Model Calling Variants), roughly 6 hours into the job, I run into this problem:

[INFO] Pileup Model Calling Variants
[INFO] RUN THE FOLLOWING COMMAND:
( parallel --joblog /hpc/pmc_holstege/rstudio/oscar_nanopore/snv_clairsto/SNV_called/logs/parallel_2-3_call_variants.log -j 24 python3 /opt/bin/clairs_to.py call_variants --predict_fn /hpc/pmc_holstege/rstudio/oscar_nanopore/snv_clairsto/SNV_called/tmp/predict/{1/} --call_fn /hpc/pmc_holstege/rstudio/oscar_nanopore/snv_clairsto/SNV_called/tmp/vcfoutput/p{1/}.vcf --platform ilmn --likelihood_matrix_data /opt/micromamba/envs/clairs-to/bin/clairs-to_models/ilmn/likelihood_matrix.txt :::: /hpc/pmc_holstege/rstudio/oscar_nanopore/snv_clairsto/SNV_called/tmp/candidates/CANDIDATES_FILES ) 2>&1 | tee /hpc/pmc_holstege/rstudio/oscar_nanopore/snv_clairsto/SNV_called/logs/2-3_CALL_VARIANTS.log
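The command above asks GNU parallel to write a job log via --joblog, so one way to see how many of the per-candidate jobs failed is to scan that file for non-zero exit codes. The following is only a sketch: it assumes the standard tab-separated joblog layout (header row with Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command), and the helper name failed_jobs and the sample rows are made up for illustration.

```python
import csv
import io


def failed_jobs(joblog_text):
    """Return the Command field of every job that exited non-zero or was killed by a signal.

    Assumes the standard GNU parallel --joblog layout: a tab-separated table
    whose header row includes Exitval, Signal and Command columns.
    """
    reader = csv.DictReader(
        io.StringIO(joblog_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [
        row["Command"]
        for row in reader
        if row["Exitval"] != "0" or row["Signal"] != "0"
    ]


# Made-up joblog content, for illustration only (not taken from the issue).
SAMPLE_JOBLOG = (
    "Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n"
    "1\t:\t1700000000.000\t0.040\t0\t120\t0\t0\tpython3 job_a\n"
    "2\t:\t1700000000.100\t0.052\t0\t950\t1\t0\tpython3 job_b\n"
)

if __name__ == "__main__":
    for command in failed_jobs(SAMPLE_JOBLOG):
        print("FAILED:", command)
```

For a real run, the text would come from reading the joblog path given in the command. An uncaught Python exception makes the interpreter exit with status 1, so jobs that hit the traceback should show a non-zero Exitval.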

[INFO] Calling tumor-only somatic variants ...
[INFO] Total time elapsed: 0.04 s
[INFO] Calling tumor-only somatic variants ...
Traceback (most recent call last):
  File "/opt/bin/clairs_to.py", line 107, in <module>
    main()
  File "/opt/bin/clairs_to.py", line 101, in main
    submodule.main()
  File "/opt/bin/clairs/call_variants.py", line 646, in main
    call_variants_from_probability(args)
  File "/opt/bin/clairs/call_variants.py", line 567, in call_variants_from_probability
    output_vcf_from_probability(
  File "/opt/bin/clairs/call_variants.py", line 234, in output_vcf_from_probability
    best_match_alt_list, tumor_supported_reads_count_list = rank_variant_alt(
ValueError: too many values to unpack (expected 2)
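For readers unfamiliar with this message: Python raises "too many values to unpack (expected 2)" when the right-hand side of a two-name assignment yields more than two items. The sketch below reproduces only that mechanism. The two functions are hypothetical stand-ins and do not reflect what rank_variant_alt in ClairS-TO actually returns or why it returned more values here.

```python
def returns_two():
    # Hypothetical stand-in: yields exactly the two values the caller expects.
    return ["A"], [12]


def returns_three():
    # Hypothetical stand-in: yields one value more than the caller expects.
    return ["A"], [12], [0.9]


def try_unpack(func):
    """Unpack func() into two names and report what happened."""
    try:
        best_alts, read_counts = func()
    except ValueError as exc:
        return str(exc)
    return "ok"


if __name__ == "__main__":
    print(try_unpack(returns_two))
    print(try_unpack(returns_three))
```

The second call fails with the same ValueError text as in the traceback, which is why a mismatch between what the caller expects and what the callee produces for a given input is the usual thing to look for.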

The traceback is repeated hundreds of times. Step 3 seems to run fine despite this, but the final VCF file doesn't contain any SNVs.

Do you have any suggestions on what might be the cause? Thanks
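A quick way to confirm that an output VCF really has no SNV records is to count its non-header lines. This is a minimal sketch, assuming an uncompressed, tab-separated VCF with REF in column 4 and ALT in column 5 (as the VCF format defines); the function name count_snv_records and the sample lines are made up for illustration.

```python
def count_snv_records(vcf_lines):
    """Return (total_records, snv_records) for an iterable of VCF text lines.

    A record counts as an SNV here when REF and every ALT allele are a
    single A/C/G/T base. Header lines (starting with '#') are skipped.
    """
    total = snvs = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed line, ignore in this sketch
        ref, alts = fields[3], fields[4].split(",")
        total += 1
        if len(ref) == 1 and all(len(a) == 1 and a in "ACGT" for a in alts):
            snvs += 1
    return total, snvs


# Made-up example lines, not taken from the issue.
SAMPLE_VCF = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t30\tPASS\t.\n",
    "chr1\t200\t.\tAT\tA\t25\tPASS\t.\n",
]

if __name__ == "__main__":
    print(count_snv_records(SAMPLE_VCF))
```

For a real file, passing an open text file handle as vcf_lines works the same way; a gzip-compressed VCF would first need to be opened in text mode with gzip.open.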

JasonCLEI commented 2 months ago

Hi, @oscargamarra,

Thanks a lot for your interest. It seems that the error was caused by a wrong input pileup tensor file for calling. We have already released a new version (v0.1.0) of ClairS-TO. Could you rerun the program on your Illumina data with the latest version and check whether the issue still exists? If it does, please send me the complete run_clairs_to.log file (if possible, to my email address lchen@cs.hku.hk), and I will look into the issue for you.

Lei

oscargamarra commented 2 months ago

Hi @JasonCLEI, thank you so much for your fast reply! I am currently rerunning the program with the latest version, but I think it's also important to mention that I am using Illumina RNA-seq data. I read in one of the other issues that ClairS/ClairS-TO does not work with RNA. I'll give you an update on how everything works anyway.

Oscar

JasonCLEI commented 2 months ago

Hi, @oscargamarra,

That is most likely the cause of the error. Currently, ClairS-TO is trained on DNA data and can process DNA data from different sequencing platforms. Even if we adapted ClairS-TO to accept RNA data as input, we could not guarantee its effectiveness. By the way, our lab has been working on a variant caller for Nanopore RNA-seq, and you are welcome to follow our upcoming research.

Lei