HKU-BAL / ClairS-TO

ClairS-TO - a deep-learning method for tumor-only somatic variant calling
BSD 3-Clause "New" or "Revised" License

Indexing error at phasing step #12

Open areebapatel opened 1 month ago

areebapatel commented 1 month ago

Hi,

Firstly, many thanks for this incredibly useful tool.

I am running ClairS-TO with the Singularity container on an ONT WGS run. The library was basecalled and aligned using dorado 0.6.2.

Command used to run:

singularity exec -B ${input_dir},${ref_dir},${output_dir} --bind=/local:/local:rw /b06x-isilon/b06x-m/mnp_nanopore/software/clairs-to_latest.sif \
          /opt/bin/run_clairs_to \
          --tumor_bam_fn ${input_dir}/${id}.hg38.bam \
          --sample_name ${id} \
          --ref_fn ${ref_dir}/hg38.fa \
          --threads ${params.threads} \
          --platform ${params.platform} \
          --output_dir ${output_dir} \
          --conda_prefix /opt/micromamba/envs/clairs-to

All the previous steps run fine and then it runs into this error at the phasing step:

[INFO] Phase the Tumor BAM
[INFO] RUN THE FOLLOWING COMMAND:
( parallel --joblog /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/logs/phasing_log/parallel_2_phase_tumor.log -j 8 /opt/micromamba/envs/clairs-to/bin/longphase phase  -s /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/vcf/{1}.vcf -b /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/bam/392718.calls.mods.sorted.hg38.bam -r /b06x-isilon/b06x-m/mnp_nanopore/software/hg38/hg38.fa -t 8 -o /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/phased_vcf_output/tumor_phased_{1} --ont :::: /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/CONTIGS && parallel -j 8 bgzip -f /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/phased_vcf_output/tumor_phased_{1}.vcf :::: /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/CONTIGS ) 2>&1 | tee /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/logs/phasing_log/2_phase_tumor.log && parallel -j 8 tabix -f -p vcf /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/phased_vcf_output/tumor_phased_{1}.vcf.gz :::: /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/CONTIGS

parsing VCF ... [W::vcf_parse] Contig '##contig=<ID=chr1,length=248387328>' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::bcf_hrec_check] Invalid contig name: "##contig=<ID=chr1,length=248387328>"
pos 0 missing GT value
parsing VCF ... [W::vcf_parse] Contig '##contig=<ID=chr1,length=248387328>' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::bcf_hrec_check] Invalid contig name: "##contig=<ID=chr1,length=248387328>"
pos 0 missing GT value
parsing VCF ... [W::vcf_parse] Contig '##contig=<ID=chr1,length=248387328>' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::bcf_hrec_check] Invalid contig name: "##contig=<ID=chr1,length=248387328>"
pos 0 missing GT value

The full log file is here: run_clairs_to.log

Could you please help me with this?

Many thanks, Areeba.

JasonCLEI commented 1 month ago

Hi, @areebapatel,

We found another error message in your running log: [ERROR] Failed to load reference sequence from file (/b06x-isilon/b06x-m/mnp_nanopore/software/hg38/hg38.fa). Together with the message pos 0 missing GT value, this suggests that the program is not running normally, possibly because of problems reading or writing files. We suggest that you rerun ClairS-TO separately to see whether the same issue persists. If you have done so and still get the error, could you kindly provide your latest run_clairs_to.log and some of the VCF files stored in /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/vcf/, such as chr1.vcf? We will then look into the issue further.
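The two symptoms mentioned above can also be checked directly on one of the per-contig VCF files. The following is a minimal sketch and is not part of ClairS-TO or longphase: `check_vcf` is a hypothetical helper name, and it assumes an uncompressed VCF such as the `{1}.vcf` files passed to `longphase phase -s` in the logged command. It counts `##contig` header lines, header-like lines that appear after `#CHROM`, and records whose FORMAT column does not start with `GT`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ClairS-TO or longphase): summarise a
# plain-text VCF so that header and genotype problems are easy to spot.
check_vcf() {
    local vcf="$1"
    awk -F'\t' '
        /^##contig=/ { contigs++ }                     # contig definitions in the header
        /^#CHROM/    { seen_chrom = 1; next }          # column header line
        /^#/         { if (seen_chrom) late++; next }  # header-like line after #CHROM
        NF == 0      { next }                          # skip empty lines
        {
            records++
            # FORMAT is column 9; the GT key, when present, must come first.
            if (NF < 10 || ($9 != "GT" && $9 !~ /^GT:/)) no_gt++
        }
        END {
            printf "contig_header_lines=%d\n", contigs
            printf "header_lines_after_CHROM=%d\n", late
            printf "records=%d\n", records
            printf "records_without_GT=%d\n", no_gt
        }
    ' "$vcf"
}
```

Calling `check_vcf chr1.vcf` prints four counters. A nonzero `records_without_GT` or `header_lines_after_CHROM` would be consistent with the messages in the log, although it does not by itself identify the cause.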

Lei

areebapatel commented 3 weeks ago

Hi Lei,

Thanks for looking into this. I have checked that the reference file exists and is used successfully with other tools. Here are the VCF files for the run: chr19.txt, chrY.txt
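For reference, a check of this kind can be scripted. The sketch below is an illustration only, not the check that was actually run: `check_reference` is a hypothetical helper name, and it assumes the tools expect a `samtools faidx` index (`.fai`) next to the FASTA file. It reports whether the FASTA and its index are readable and non-empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a reference FASTA and its .fai index
# (assumed to sit next to it) are readable and non-empty.
check_reference() {
    local ref="$1"
    local f
    local status=0
    for f in "$ref" "$ref.fai"; do
        if [ -r "$f" ] && [ -s "$f" ]; then
            echo "OK       $f"
        else
            echo "MISSING  $f"
            status=1
        fi
    done
    # Non-zero exit status if either file is missing, unreadable, or empty.
    return "$status"
}
```

Running the same function through `singularity exec` with the same `-B` binds as the original command would show whether the path resolves inside the container as well as on the host.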

I am not sure I understand what you mean by running ClairS-TO separately. How should I do that?

Areeba.

JasonCLEI commented 2 weeks ago

Hi, @areebapatel,

We suggest that you run ClairS-TO again with the same command to see whether the issue persists. If you have done so and still get the error, we will look into it further.

Lei