HKU-BAL / ClairS

ClairS - a deep-learning method for long-read somatic small variant calling
BSD 3-Clause "New" or "Revised" License

select_hetero_snp_for_phasing breaks for contigs where no variants are found #5

Closed. xingyaoc closed this issue 1 year ago

xingyaoc commented 1 year ago

VCFs are not generated for contigs where no variants are found, which breaks select_hetero_snp_for_phasing. clairs_output/vcf contains these files:

chr1.vcf chr16.vcf chr21.vcf chr22.vcf chr4.vcf chr5.vcf

Excerpt of the errors from 1_select_hetero_snp_for_phasing.log:

[INFO] Total HET SNP calls selected: chr22: 119, not found:27, not match:0, low_qual_count:0. Total normal:57 Total tumor:146, pro: 0.815068493150685
Traceback (most recent call last):
  File "/opt/bin/clairs.py", line 120, in <module>
    main()
  File "/opt/bin/clairs.py", line 114, in main
    submodule.main()
  File "/opt/bin/src/select_hetero_snp_for_phasing.py", line 157, in main
    select_hetero_snp_for_phasing(args)
  File "/opt/bin/src/select_hetero_snp_for_phasing.py", line 118, in select_hetero_snp_for_phasing
    pro = len(pass_variant_dict) / len(tumor_qual_dict)
ZeroDivisionError: division by zero
Traceback (most recent call last):
  File "/opt/bin/clairs.py", line 120, in <module>
    main()
  File "/opt/bin/clairs.py", line 114, in main
    submodule.main()
  File "/opt/bin/src/select_hetero_snp_for_phasing.py", line 157, in main
    select_hetero_snp_for_phasing(args)
  File "/opt/bin/src/select_hetero_snp_for_phasing.py", line 118, in select_hetero_snp_for_phasing
    pro = len(pass_variant_dict) / len(tumor_qual_dict)
ZeroDivisionError: division by zero
[INFO] Total HET SNP calls selected: chr1: 220, not found:66, not match:0, low_qual_count:0. Total normal:73 Total tumor:286, pro: 0.7692307692307693
[INFO] Total HET SNP calls selected: chr21: 1, not found:1, not match:0, low_qual_count:0. Total normal:31 Total tumor:2, pro: 0.5
Traceback (most recent call last):
  File "/opt/bin/clairs.py", line 120, in <module>
    main()
  File "/opt/bin/clairs.py", line 114, in main
    submodule.main()
  File "/opt/bin/src/select_hetero_snp_for_phasing.py", line 157, in main
    select_hetero_snp_for_phasing(args)
  File "/opt/bin/src/select_hetero_snp_for_phasing.py", line 118, in select_hetero_snp_for_phasing
    pro = len(pass_variant_dict) / len(tumor_qual_dict)
ZeroDivisionError: division by zero
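The traceback points at the division `len(pass_variant_dict) / len(tumor_qual_dict)`, which fails whenever a contig has no tumor calls. One possible guard is sketched below. This is only an illustration, not the fix that was applied in ClairS: the helper name `pass_proportion` is hypothetical, and returning `None` for an empty contig is an assumption about what the caller would want.

```python
def pass_proportion(pass_variant_dict, tumor_qual_dict):
    """Return the fraction of tumor calls that were selected.

    Hypothetical helper (not part of ClairS): returns None instead of
    raising ZeroDivisionError when the contig has no tumor calls.
    """
    total = len(tumor_qual_dict)
    if total == 0:
        # No variants were called on this contig, so there is no proportion to report.
        return None
    return len(pass_variant_dict) / total


# Counts taken from the chr21 log line: 1 call selected out of 2 tumor calls.
print(pass_proportion({"pos1": "A"}, {"pos1": 30, "pos2": 12}))

# A contig with no calls yields None rather than a crash.
print(pass_proportion({}, {}))
```

With a guard like this, the caller can skip the log line or the whole contig when the result is `None`, rather than aborting the job.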

zhengzhenxian commented 1 year ago

Hi,

It seems that no germline variant output was found. Could you send ${OUTPUT_DIR}/run_clairs.log to my email address (zxzheng@cs.hku.hk) so that we can pinpoint the error?

xingyaoc commented 1 year ago

Hi,

Thanks for the response! To give you a bit of context, I am testing ClairS on a random subset of split BAMs (30 of 3000 from each of tumor and normal; I then merge the BAM subset and pass it into ClairS). This is probably why very few variants are being found, both germline and somatic. I have attached the log file as requested.

Best, Xingyao

zhengzhenxian commented 1 year ago

Hi, Xingyao,

Thanks for the logs. Some contigs indeed had no SNP output owing to the small BAM size. We have fixed the error; please try re-pulling the Docker image (you might need to remove your local image first using docker rmi hkubal/clairs:latest).

Zhenxian