iqbal-lab / cortex

reference free variant assembly

Bug in arg-parsing for Joint workflow with no reference (ref Absent) #32

Closed iqbal-lab closed 1 year ago

iqbal-lab commented 1 year ago

Jason Shi reported an error when using run_calls.pl with the joint workflow and --ref Absent.

The command line was:

perl /home/ubuntu/downloads/cortex/scripts/calling/run_calls.pl --first_kmer 31 --kmer_step 30 --fastaq_index INDEX --auto_cleaning yes --bc yes --pd no --outdir cortex_out_20_100_absent --outvcf s101346_var --ploidy 1 --genome_size 5000000 --max_read_len 500 --qthresh 5 --mem_height 20 --mem_width 100 --vcftools_dir /home/ubuntu/downloads/vcftoolss-0.1.16/ --do_union yes --ref Absent --workflow joint --logfile logfile_20_100_absent log.txt

Variant calling ran fine, but when process_calls.pl was called, it complained:

If you have not used a reference in the Cortex graph, then do not specify --refcol and do not specify --ref_fasta

* Command-line used was : *** perl /home/ubuntu/downloads/cortex//scripts/analyse_variants/process_calls.pl --callfile /raid1/short_reads/example/test_cortex/cortex_out_20_100_absent/calls/joint_callsets/bubble_calls_joint_exc_ref_from_callingkmer31_cleaning_level0 --callfile_log /raid1/short_reads/example/test_cortex/cortex_out_20_100_absent/calls/joint_callsets/joint_varcalling_exc_ref_from_calling_k31_cleanlevel0.log --kmer 31 --caller BC --outdir /raid1/short_reads/example/test_cortex/cortex_out_20_100_absent/vcfs/ --outvcf s101346_var_union_joint_BC_calls_k31_clean_level0 --samplename_list /raid1/short_reads/example/test_cortex/cortex_out_20_100_absent/vcfs//SAMPLES --num_cols 45 --ploidy 1 --vcftools_dir /home/ubuntu/downloads/vcftoolss-0.1.16/ --global_var_ctr 0 --prefix k31_cl0_BC


You can either use a reference, or not. If you use a reference, specify --refcol AND --reffasta.

This is, amazingly, a bug in argument parsing that must have been there for some time. I have records of myself running effectively the same command as Jason, and it worked fine. Either way, there is a clear bug here:

https://github.com/iqbal-lab/cortex/blob/c8147152cd4015c45057900e8fb600376d1d7fb3/scripts/analyse_variants/process_calls.pl#L300

Note that ref_fasta is initialised to "unspecified" but the test here is checking if it is "".

Jason tested locally and switching that over fixed the problem.
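
To make the mismatch concrete, here is a minimal sketch of the pattern in Python. It is not the actual code from process_calls.pl (which is Perl). The function names and the `NO_REFCOL` default are hypothetical, and only the "unspecified" default and the comparison against "" come from the description above.

```python
UNSPECIFIED = "unspecified"  # default value for ref_fasta, as described in this issue
NO_REFCOL = -1               # hypothetical default meaning "no reference colour given"


def ref_args_ok_buggy(refcol=NO_REFCOL, ref_fasta=UNSPECIFIED):
    """Buggy check: treats "" as 'not given', although the default is "unspecified"."""
    fasta_given = ref_fasta != ""       # always True when ref_fasta is left at its default
    refcol_given = refcol != NO_REFCOL
    # Valid only if both are given or neither is given.
    return fasta_given == refcol_given


def ref_args_ok_fixed(refcol=NO_REFCOL, ref_fasta=UNSPECIFIED):
    """Fixed check: compares against the same sentinel used for initialisation."""
    fasta_given = ref_fasta != UNSPECIFIED
    refcol_given = refcol != NO_REFCOL
    return fasta_given == refcol_given


# No reference supplied at all (the ref=Absent case):
print(ref_args_ok_buggy())  # False: wrongly rejected
print(ref_args_ok_fixed())  # True: accepted
```

With the buggy check, a run that passes neither option is rejected because the untouched default looks like a user-supplied FASTA. Comparing against the sentinel that was used for initialisation removes that false positive.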

I'm fixing this now, but I also want to reinstate my local test infrastructure for this kind of thing.

iqbal-lab commented 1 year ago

Fixed in 00a049ce8c2b4ffe29c67b3028d56cb29a9696c6