ClairS-TO - a deep-learning method for tumor-only somatic variant calling
Indexing error at phasing step #12

Firstly, many thanks for this incredibly useful tool.

I am running ClairS-TO with the Singularity container on an ONT WGS run. The library was basecalled and aligned using dorado 0.6.2.

Command used to run

singularity exec -B ${input_dir},${ref_dir},${output_dir} --bind=/local:/local:rw /b06x-isilon/b06x-m/mnp_nanopore/software/clairs-to_latest.sif \
          /opt/bin/run_clairs_to \
          --tumor_bam_fn ${input_dir}/${id}.hg38.bam \
          --sample_name ${id} \
          --ref_fn ${ref_dir}/hg38.fa \
          --threads ${params.threads} \
          --platform ${params.platform} \
          --output_dir ${output_dir} \
          --conda_prefix /opt/micromamba/envs/clairs-to

All the previous steps run fine and then it runs into this error at the phasing step:

[INFO] Phase the Tumor BAM
( parallel --joblog /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/logs/phasing_log/parallel_2_phase_tumor.log -j 8 /opt/micromamba/envs/clairs-to/bin/longphase phase  -s /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/vcf/{1}.vcf -b /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/bam/392718.calls.mods.sorted.hg38.bam -r /b06x-isilon/b06x-m/mnp_nanopore/software/hg38/hg38.fa -t 8 -o /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/phased_vcf_output/tumor_phased_{1} --ont :::: /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/CONTIGS && parallel -j 8 bgzip -f /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/phased_vcf_output/tumor_phased_{1}.vcf :::: /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/CONTIGS ) 2>&1 | tee /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/logs/phasing_log/2_phase_tumor.log && parallel -j 8 tabix -f -p vcf /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/phased_vcf_output/tumor_phased_{1}.vcf.gz :::: /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/CONTIGS

parsing VCF ... [W::vcf_parse] Contig '##contig=<ID=chr1,length=248387328>' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::bcf_hrec_check] Invalid contig name: "##contig=<ID=chr1,length=248387328>"
pos 0 missing GT value
parsing VCF ... [W::vcf_parse] Contig '##contig=<ID=chr1,length=248387328>' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::bcf_hrec_check] Invalid contig name: "##contig=<ID=chr1,length=248387328>"
pos 0 missing GT value
parsing VCF ... [W::vcf_parse] Contig '##contig=<ID=chr1,length=248387328>' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::bcf_hrec_check] Invalid contig name: "##contig=<ID=chr1,length=248387328>"
pos 0 missing GT value

The full log file is here: run_clairs_to.log

Could you please help me with this?

Many thanks, Areeba.

JasonCLEI commented 1 month ago

Hi, @areebapatel,

We found another error message in your running log: [ERROR] Failed to load reference sequence from file (/b06x-isilon/b06x-m/mnp_nanopore/software/hg38/hg38.fa). Together with the error message pos 0 missing GT value, this suggests that the program is not running normally, possibly because of problems reading or writing files. We suggest that you rerun ClairS-TO separately to see whether the same issue still occurs. If it does, could you kindly provide your latest run_clairs_to.log file and some of the VCF files stored in /b06x-isilon/b06x-m/mnp_nanopore/analysis/ONT_R00200/snv/ClairsTo/tmp/phasing_output/vcf/, such as chr1.vcf? We will then look into the issue further.
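
Since the log reports that the reference could not be loaded, one thing worth ruling out is whether the FASTA and its index are readable from the environment in which the phasing step runs. The following is a minimal sketch, not part of ClairS-TO: check_ref is a hypothetical helper name, and it assumes that the tools expect a .fai index next to the FASTA (as produced by samtools faidx).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ClairS-TO): report whether a reference
# FASTA and its .fai index are readable from the current environment.
# Assumption: the index sits next to the FASTA as <ref>.fai.
check_ref() {
    local ref="$1"
    local status=0

    if [ ! -r "$ref" ]; then
        echo "not readable: $ref"
        status=1
    elif [ ! -s "$ref" ]; then
        echo "empty file: $ref"
        status=1
    fi

    if [ ! -r "${ref}.fai" ]; then
        echo "missing or unreadable index: ${ref}.fai"
        status=1
    fi

    # Print a single confirmation line only when every check passed.
    [ "$status" -eq 0 ] && echo "ok: $ref"
    return "$status"
}
```

A file that other tools read fine on the host can still be invisible inside the container, so the check is most informative when the function is called on ${ref_dir}/hg38.fa from a shell started inside the container with the same -B binds as the original command.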


areebapatel commented 3 weeks ago

Hi Lei,

Thanks for looking into this. I have checked that the reference file exists and is used successfully with other tools. Here are the VCF files for the run: chr19.txt chrY.txt
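
For anyone inspecting per-contig VCFs like these, the warnings in the log hint at what to look for. The message that contig '##contig=<ID=chr1,...>' is not defined in the header suggests that a ## meta line was parsed as a data record, and pos 0 missing GT value suggests a record without a GT field. The sketch below is one way to test for both conditions under that reading. check_vcf is a hypothetical helper name, the interpretation of the warnings is an assumption, and the script is not a full VCF validator.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ClairS-TO or longphase): summarize
# structural problems in an uncompressed VCF that would be consistent
# with the warnings seen at the phasing step.
check_vcf() {
    awk -F'\t' '
        /^#CHROM/ { seen_chrom = 1; next }
        /^##/ {
            # A meta line after #CHROM would be read as a data record.
            if (seen_chrom) late++
            next
        }
        /^#/    { next }
        NF == 0 { next }
        {
            records++
            # FORMAT is column 9; GT, when present, is its first key.
            if (NF < 10 || ($9 != "GT" && $9 !~ /^GT:/)) no_gt++
        }
        END {
            printf "records=%d meta_after_header=%d records_without_gt=%d\n", records, late, no_gt
            if (!seen_chrom || late > 0 || no_gt > 0) exit 1
        }
    ' "$1"
}
```

Calling check_vcf on a file such as chr19.txt prints a single summary line and returns a non-zero status if the #CHROM line is missing, if a meta line follows it, or if any record lacks GT.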

I am not sure I understand what you mean by running ClairS-TO separately. How should I do that?


JasonCLEI commented 2 weeks ago

Hi, @areebapatel,

We suggest that you run ClairS-TO again with the same command to see whether the issue still occurs. If you still get the error, we will look into the issue further.
