RolandFaure / Hairsplitter

Software that separates very close sequences that have been collapsed during assembly. Uses only long reads.
GNU General Public License v3.0

ERROR: call_variants failed #3

Closed alexvasilikop closed 1 year ago

alexvasilikop commented 1 year ago

Hi Roland,

I still haven't been able to run the pipeline successfully. I am getting an error saying that the variant calling step failed, and the warnings about the read headers appear again. Please have a look:

assembly=/mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.chrom.interleaved.fasta
reads=/mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq
outdir=/mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom

python3 /mnt/sda1/Alex/software/Hairsplitter/hairsplitter.py -i $assembly -f $reads -x ont -t 12 -o $outdir -F 
/mnt/sda1/Alex/software/Hairsplitter/hairsplitter.py -i /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.chrom.interleaved.fasta -f /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq -x ont -t 12 -o /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom -F
HairSplitter v1.3.2 (github.com/RolandFaure/HairSplitter). Last update: 2023-08-11

    ******************
    *                *
    *  Hairsplitter  *
    *    Welcome!    *
    *                *
    ******************

===== STAGE 1: Cleaning graph of small contigs that are unconnected parts of haplotypes   [ 2023-08-11 17:11:11.723509 ]

 When the assemblers manage to locally phase the haplotypes, they sometimes assemble the alternative haplotype as a separate contig, unconnected in the gfa graph. This affects negatively the performance of Hairsplitter. Let's delete these contigs

 - Mapping the assembly against itself
 Running:  /mnt/sda1/Alex/software/Hairsplitter/src/build/clean_graph /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/hairsplitter.log 12 minimap2
 - Eliminated small unconnected contigs that align on other contigs

===== STAGE 2: Aligning reads on the reference   [ 2023-08-11 17:11:36.097181 ]

 - Converting the assembly in fasta format
 - Aligning the reads on the assembly
 - Running minimap with command line:
      minimap2 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.fasta /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq -x map-ont -a --secondary=no -t 12 > /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 2> /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/logminimap.txt 
   The log of minimap2 can be found at /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/logminimap.txt

===== STAGE 3: Calling variants   [ 2023-08-11 17:24:35.077202 ]

 Running:  /mnt/sda1/Alex/software/Hairsplitter/src/build/call_variants /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 12 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/error_rate.txt 0 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.col /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.vcf
 - Loading all reads from /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq in memory
 - Loading all contigs from /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa in memory
 - Loading alignments of the reads on the contigs from /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam
 - Calling variants on each contig using basic pileup
double free or corruption (out)
Aborted (core dumped)
ERROR: call_variants failed. Was trying to run: /mnt/sda1/Alex/software/Hairsplitter/src/build/call_variants /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 12 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/error_rate.txt 0 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.col /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.vcf
RolandFaure commented 1 year ago

I will look into this and try to understand it before Monday.

FrostFlow13 commented 1 year ago

I think I had that issue (or at least a similar one) earlier today! I don't remember the exact steps I took to fix it, but I do remember deleting the output directory (i.e. where you told Hairsplitter to send the output files). I also went to the directory containing my input .fasta assembly and deleted a few files that had appeared there while Hairsplitter was running: I think there was a .fasta.fai file and a file named "core.####", and I deleted both of them (the .fai file might have come from something else, though; I'm unsure).

I ran it again after that, and it seemed to progress past the "call_variants" step just fine. Sorry this potential "fix" isn't very specific. It was something I did in the middle of troubleshooting something else, but it did seem to work (at least for me, or maybe it was a coincidence).

Just so it's here, this is what I saw that led me to try the fix I proposed above:

*** Error in `/users/PAS1802/woodruff207/Hairsplitter/src/build/call_variants': double free or corruption (!prev): 0x0000000070de7800 ***
 - Loading all reads from ../1_demul_adtrim/BC15.fastq in memory
 - Loading all contigs from ../8_hairsplitter/tmp/cleaned_assembly.gfa in memory
 - Loading alignments of the reads on the contigs from ../8_hairsplitter/tmp/reads_on_asm.sam
 - Calling variants on each contig using basic pileup
/users/PAS1802/woodruff207/Hairsplitter/hairsplitter.py -f ../1_demul_adtrim/BC15.fastq -i 1376-haploid.fasta -x ont -o ../8_hairsplitter -t 28
HairSplitter v1.3.2 (github.com/RolandFaure/HairSplitter). Last update: 2023-08-11

    ******************
    *                *
    *  Hairsplitter  *
    *    Welcome!    *
    *                *
    ******************

===== STAGE 1: Cleaning graph of small contigs that are unconnected parts of haplotypes   [ 2023-08-11 11:59:47.451158 ]

 When the assemblers manage to locally phase the haplotypes, they sometimes assemble the alternative haplotype as a separate contig, unconnected in the gfa graph. This affects negatively the performance of Hairsplitter. Let's delete these contigs

 - Mapping the assembly against itself
 Running:  /users/PAS1802/woodruff207/Hairsplitter/src/build/clean_graph ../8_hairsplitter/tmp/assembly.gfa ../8_hairsplitter/tmp/cleaned_assembly.gfa ../8_hairsplitter ../8_hairsplitter/hairsplitter.log 28 minimap2
 - Eliminated small unconnected contigs that align on other contigs

===== STAGE 2: Aligning reads on the reference   [ 2023-08-11 11:59:49.468419 ]

 - Converting the assembly in fasta format
 - Aligning the reads on the assembly
 - Running minimap with command line:
      minimap2 ../8_hairsplitter/tmp/cleaned_assembly.fasta ../1_demul_adtrim/BC15.fastq -x map-ont -a --secondary=no -t 28 > ../8_hairsplitter/tmp/reads_on_asm.sam 2> ../8_hairsplitter/tmp/logminimap.txt 
   The log of minimap2 can be found at ../8_hairsplitter/tmp/logminimap.txt

===== STAGE 3: Calling variants   [ 2023-08-11 12:02:45.425662 ]

 Running:  /users/PAS1802/woodruff207/Hairsplitter/src/build/call_variants ../8_hairsplitter/tmp/cleaned_assembly.gfa ../1_demul_adtrim/BC15.fastq ../8_hairsplitter/tmp/reads_on_asm.sam 28 ../8_hairsplitter/tmp ../8_hairsplitter/tmp/error_rate.txt 0 ../8_hairsplitter/tmp/variants.col ../8_hairsplitter/tmp/variants.vcf
ERROR: call_variants failed. Was trying to run: /users/PAS1802/woodruff207/Hairsplitter/src/build/call_variants ../8_hairsplitter/tmp/cleaned_assembly.gfa ../1_demul_adtrim/BC15.fastq ../8_hairsplitter/tmp/reads_on_asm.sam 28 ../8_hairsplitter/tmp ../8_hairsplitter/tmp/error_rate.txt 0 ../8_hairsplitter/tmp/variants.col ../8_hairsplitter/tmp/variants.vcf
alexvasilikop commented 1 year ago

I tried using the -F option (which overwrites the output directory), and for now the call_variants command is running. Deleting the output directory did not work for me. I will let you know if it runs successfully.

alexvasilikop commented 1 year ago

Update: The run failed even with the -F flag:

corrupted size vs. prev_size
Aborted (core dumped)
/mnt/sda1/Alex/software/Hairsplitter/hairsplitter.py -i /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.chrom.interleaved.fasta -f /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq -x ont -t 12 -o /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom -F
HairSplitter v1.3.2 (github.com/RolandFaure/HairSplitter). Last update: 2023-08-11

    ******************
    *                *
    *  Hairsplitter  *
    *    Welcome!    *
    *                *
    ******************

===== STAGE 1: Cleaning graph of small contigs that are unconnected parts of haplotypes   [ 2023-08-14 11:04:13.549246 ]

 When the assemblers manage to locally phase the haplotypes, they sometimes assemble the alternative haplotype as a separate contig, unconnected in the gfa graph. This affects negatively the performance of Hairsplitter. Let's delete these contigs

 - Mapping the assembly against itself
 Running:  /mnt/sda1/Alex/software/Hairsplitter/src/build/clean_graph /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/hairsplitter.log 12 minimap2
 - Eliminated small unconnected contigs that align on other contigs

===== STAGE 2: Aligning reads on the reference   [ 2023-08-14 11:04:32.704460 ]

 - Converting the assembly in fasta format
 - Aligning the reads on the assembly
 - Running minimap with command line:
      minimap2 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.fasta /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq -x map-ont -a --secondary=no -t 12 > /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 2> /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/logminimap.txt 
   The log of minimap2 can be found at /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/logminimap.txt

===== STAGE 3: Calling variants   [ 2023-08-14 11:16:26.772307 ]

 Running:  /mnt/sda1/Alex/software/Hairsplitter/src/build/call_variants /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 12 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/error_rate.txt 0 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.col /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.vcf
ERROR: call_variants failed. Was trying to run: /mnt/sda1/Alex/software/Hairsplitter/src/build/call_variants /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 12 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/error_rate.txt 0 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.col /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.vcf
RolandFaure commented 1 year ago

What happens when you run /mnt/sda1/Alex/software/Hairsplitter/src/build/call_variants /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 12 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/error_rate.txt 0 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.col /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.vcf ?

alexvasilikop commented 1 year ago

Here it is:

/mnt/sda1/Alex/software/Hairsplitter/src/build/call_variants /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 12 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/error_rate.txt 0 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.col /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.vcf
 - Loading all reads from /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq in memory
 - Loading all contigs from /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa in memory
 - Loading alignments of the reads on the contigs from /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam
 - Calling variants on each contig using basic pileup
double free or corruption (!prev)
[1]    27204 abort (core dumped)  /mnt/sda1/Alex/software/Hairsplitter/src/build/call_variants    12   0  
RolandFaure commented 1 year ago

Does it fail when you disable multithreading ? (try running /mnt/sda1/Alex/software/Hairsplitter/src/build/call_variants /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/cleaned_assembly.gfa /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/Adineta_ricciae.ONT.BXQ_G.merged.filt.40000.90.1000.fastq /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/reads_on_asm.sam 1 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/error_rate.txt 0 /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.col /mnt/sda1/Alex/16.PHASED_ASSEMBLIES_HETEROZYGOSITY/Adineta_ricciae/hairsplitter_aricciae_chrom/tmp/variants.vcf )

RolandFaure commented 1 year ago

OK, I'm getting close to the problem. It is a multithreading problem, so it should not happen if you launch HairSplitter with one thread. I will keep you updated once I have actually fixed it.
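
For affected versions, a single-threaded run could be launched along these lines. This is only a sketch: it uses nothing but the flags that already appear in the report above (`-i`, `-f`, `-x`, `-t`, `-o`, `-F`), and the function names and file paths are hypothetical.

```python
import subprocess
import sys


def build_hairsplitter_cmd(script, assembly, reads, outdir, threads=1):
    """Build a HairSplitter command line like the one used in this thread.

    Only flags seen in the report above are used. threads defaults to 1
    to sidestep the multithreading crash in call_variants.
    """
    return [
        sys.executable, script,
        "-i", assembly,       # input assembly
        "-f", reads,          # long reads
        "-x", "ont",          # read technology, as in the report
        "-t", str(threads),   # number of threads
        "-o", outdir,         # output directory
        "-F",                 # overwrite the output directory
    ]


def run_single_threaded(script, assembly, reads, outdir):
    """Run HairSplitter with one thread and raise if it exits with an error."""
    cmd = build_hairsplitter_cmd(script, assembly, reads, outdir, threads=1)
    return subprocess.run(cmd, check=True)
```

A single-threaded run will be slower, so this is meant as a stopgap rather than a fix.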

RolandFaure commented 1 year ago

The problem has been fixed in version 1.3.3. Thank you, and don't hesitate to report any other bugs you come across!

alexvasilikop commented 1 year ago

Hi Roland,

I am using version 1.3.4 and I am still getting the warning about some reads in the SAM file not being found in the reads file:

WARNING: read in the sam file not found in reads file, ignoring: ch219_read44339_template_pass_FAK89779_1-47182
WARNING: read in the sam file not found in reads file, ignoring: ch135_read28305_template_pass_FAK89779_1-100239
WARNING: read in the sam file not found in reads file, ignoring: ch219_read43952_template_pass_FAK89779_10-42516
WARNING: read in the sam file not found in reads file, ignoring: ch393_read23811_template_pass_FAK89779_3-50636
WARNING: read in the sam file not found in reads file, ignoring: ch86_read40397_template_fail_FAK89779_17-72693
WARNING: read in the sam file not found in reads file, ignoring: ch412_read47033_template_pass_FAK89779_7-83775
WARNING: read in the sam file not found in reads file, ignoring: ch50_read44397_template_pass_FAK89779_9-51208
WARNING: read in the sam file not found in reads file, ignoring: ch248_read30930_template_pass_FAK89779_53-44748
WARNING: read in the sam file not found in reads file, ignoring: ch248_read31320_template_pass_FAK89779
WARNING: read in the sam file not found in reads file, ignoring: ch16_read26613_template_pass_FAK89779_17-72851
WARNING: read in the sam file not found in reads file, ignoring: ch118_read33696_template_pass_FAK89779_4-69317
WARNING: read in the sam file not found in reads file, ignoring: ch393_read24189_template_pass_FAK89779_1-44962
WARNING: read in the sam file not found in reads file, ignoring: ch412_read48333_template_pass_FAK89779_6-46862
WARNING: read in the sam file not found in reads file, ignoring: ch412_read47329_template_pass_FAK89779_15-76190
WARNING: read in the sam file not found in reads file, ignoring: ch322_read47295_template_fail_FAK89779_7-54714
WARNING: read in the sam file not found in reads file, ignoring: ch294_read28016_template_pass_FAK89779_2-62700
WARNING: read in the sam file not found in reads file, ignoring: ch79_read40570_template_fail_FAK89779_1-44400
WARNING: read in the sam file not found in reads file, ignoring: ch235_read17025_template_pass_FAK89779_3-54048
WARNING: read in the sam file not found in reads file, ignoring: ch451_read25548_template_pass_FAK89779_15-67538
WARNING: read in the sam file not found in reads file, ignoring: ch248_read32169_template_pass_FAK89779

Was this corrected in v.1.3.4? Thanks

RolandFaure commented 1 year ago

Hmm, I do not remember. Have you checked whether these reads are actually present in the fastq file? If they are, you can send me the files. You can probably reproduce the issue with smaller files containing only one read, for example ch219_read44339_template_pass_FAK89779_1-47182.
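
One way to do that check, as a rough sketch: list which of the read names from the warnings are absent from the fastq file, and copy a single read into a small file for a reproducer. It assumes an uncompressed fastq with exactly four lines per record, and that a read's name is the first whitespace-delimited token of its header (which is what minimap2 writes to the SAM file). All function names and paths here are hypothetical.

```python
def _name(header):
    """Return the read name from a FASTQ header line ('@name description')."""
    fields = header[1:].split()
    return fields[0] if fields else ""


def fastq_read_names(path):
    """Yield read names from an uncompressed, four-line-per-record FASTQ."""
    with open(path, "r") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0:  # header lines are every fourth line
                yield _name(line)


def missing_reads(names, fastq_path):
    """Return the names from `names` that do not occur in the FASTQ file."""
    return sorted(set(names) - set(fastq_read_names(fastq_path)))


def extract_read(fastq_path, read_name, out_path):
    """Copy the record named `read_name` to `out_path`. Return True if found."""
    with open(fastq_path, "r") as src:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:  # end of file
                return False
            if _name(record[0]) == read_name:
                with open(out_path, "w") as dst:
                    dst.writelines(record)
                return True
```

For example, `missing_reads(["ch219_read44339_template_pass_FAK89779_1-47182"], "reads.fastq")` would return an empty list if that read is present in `reads.fastq`.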

alexvasilikop commented 1 year ago

The problem was that I used a compressed (zipped) fastq file as input. When I unzipped it and reran HairSplitter, the problem disappeared.
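
For anyone hitting the same warnings, a small pre-flight check can catch this before a run. The sketch below assumes the compression was gzip (e.g. a `.fastq.gz` file), which is not stated above; it detects gzip by its two-byte magic number and writes a decompressed copy. The function names are hypothetical.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_uncompressed(path, out_path):
    """Return a path to an uncompressed version of `path`.

    If `path` is gzip-compressed, decompress it to `out_path` and return
    `out_path`; otherwise return `path` unchanged.
    """
    if not is_gzipped(path):
        return path
    with gzip.open(path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

The returned path can then be passed to HairSplitter with `-f`.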

Thanks