NCGG-MGC / IMSindel

IMSindel: An accurate intermediate-size indel detection tool incorporating de novo assembly and gapped global-local alignment with split read analysis
https://www.nature.com/articles/s41598-018-23978-z
MIT License

Running time in HPC #21

Open alyazeeditalal opened 2 years ago

alyazeeditalal commented 2 years ago

I am running IMSindel on an HPC system, but the analysis has not finished after more than 24 hours.

The command: IMSindel/bin/imsindel --bam APS14.downsampled.bam --chr chr7 --outd chr7_del_out --indelsize 10000 --reffa ~/A.frieburgensis_genome_assembly/Afre-v5_genome.fasta --glsearch fasta36/bin/glsearch36

Output:

samtools version:
samtools 1.3.1
Using htslib 1.3.1
Copyright (C) 2016 Genome Research Ltd.

glsearch version:
USAGE
 glsearch36 [-options] query_file library_file
 glsearch36 -help for a complete option list

DESCRIPTION
 GLSEARCH performs a global-query/local-library search
 version: 36.3.8i Sept, 2021

COMMON OPTIONS (options must preceed query_file library_file)
 -s: [BL50] scoring matrix;
 -f: [-12] gap-open penalty;
 -g: [-2] gap-extension penalty;
 -S filter lowercase (seg) residues;
 -b: high scores reported (limited by -E by default);
 -d: number of alignments shown (limited by -E by default);
 -I interactive mode;

1. collecting indel related reads...
    samtools view -F 1024 -f 2 APS14.downsampled.bam chr7
    backward_clips: 12038
    forward_clips: 12633
    non_clips: 88692
1. collecting indel related reads...done
2. collecting unmapped reads...
    samtools view -F 1024 -f 8 APS14.downsampled.bam chr7
    mate_unmapped_read_names: 4333
    samtools view -F 1024 -f 4 APS14.downsampled.bam chr7
    Insert size Avg: 398.308920220808 SD: 131.37852966092996
    unmapped reads: 5977
2. collecting unmapped reads...done
3. considering support reads...
    backward clip with support reads: 543
    forward clip with support reads: 518
    non_clips with suport reads: 3006
3. considering support reads...done
4. making consensus seqs from support reads...

It stopped here for hours.
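For readers unfamiliar with the samtools filters in the log, here is a minimal sketch of what the -f and -F values select. The bit meanings come from the SAM specification. The helper names `describe` and `keeps` are made up for illustration and are not part of IMSindel or samtools.

```python
# SAM flag bits that appear in the samtools commands in the log above.
# Meanings follow the SAM specification; only these four are listed.
SAM_FLAGS = {
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x400: "PCR or optical duplicate",
}


def describe(value):
    """Return the descriptions of the known bits that are set in value."""
    return [name for bit, name in SAM_FLAGS.items() if value & bit]


def keeps(read_flag, require=0, exclude=0):
    """Mimic `samtools view -f require -F exclude` for a single read flag.

    A read is kept when every required bit is set and no excluded bit is set.
    """
    return (read_flag & require) == require and (read_flag & exclude) == 0


if __name__ == "__main__":
    # The three filters used in the log: -f 2, -f 8 and -f 4, each with -F 1024.
    for require in (2, 8, 4):
        print(f"-f {require}: {describe(require)}  -F 1024: {describe(1024)}")
```

So the first samtools call keeps properly paired, non-duplicate reads, and the other two keep non-duplicate reads whose mate is unmapped or which are themselves unmapped.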

I tried changing the location of the temporary files with --temp IMSindel/data, but that did not solve the issue. The original BAM file is 18 GB. I am running the analysis on a BAM file downsampled to 25% coverage, which is 4.5 GB.
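The post does not say how the downsampling was done. One common way is `samtools view -s`, where the integer part of the argument is a random seed and the fractional part is the fraction of reads to keep. The sketch below only builds the command line. The input name `APS14.bam`, the seed value, and both function names are assumptions for illustration.

```python
import shlex


def subsample_arg(fraction, seed=0):
    """Build the SEED.FRACTION argument for `samtools view -s`.

    For example, seed 42 and fraction 0.25 give "42.25".
    """
    if not 1e-6 <= fraction < 1.0:
        raise ValueError("fraction must be in [1e-6, 1)")
    frac_txt = f"{fraction:.6f}".rstrip("0")  # 0.25 -> "0.25"
    return f"{seed}{frac_txt[1:]}"            # drop the leading "0", keep ".25"


def downsample_cmd(in_bam, out_bam, fraction, seed=0):
    """Return the samtools command (as an argument list) that subsamples a BAM."""
    return ["samtools", "view", "-b", "-s", subsample_arg(fraction, seed),
            "-o", out_bam, in_bam]


if __name__ == "__main__":
    # "APS14.bam" is a hypothetical name for the original 18 GB file.
    cmd = downsample_cmd("APS14.bam", "APS14.downsampled.bam", 0.25, seed=42)
    print(shlex.join(cmd))
    # The region query in the log (... chr7) needs an index on the new file.
    print(shlex.join(["samtools", "index", "APS14.downsampled.bam"]))
```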

Kind Regards,

Update: the result was obtained, but the program took a long time, over 3 days, probably due to the size of my BAM file.
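To see which step accounts for most of a multi-day run, one option is to timestamp each line the tool prints. This is a generic sketch, not an IMSindel feature, and the function name `run_with_timestamps` is made up. It assumes the tool flushes its output line by line. If the output is buffered when piped, the timestamps will lag behind the real progress.

```python
import subprocess
import sys
import time


def run_with_timestamps(cmd, out=sys.stdout):
    """Run cmd and prefix each output line with the seconds elapsed since start.

    Returns (exit_code, [(elapsed_seconds, line), ...]).
    """
    start = time.monotonic()
    stamped = []
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so all progress lines are seen
        text=True,
        bufsize=1,                 # line-buffered on our side of the pipe
    )
    for line in proc.stdout:
        elapsed = time.monotonic() - start
        text = line.rstrip("\n")
        stamped.append((elapsed, text))
        print(f"[{elapsed:10.1f}s] {text}", file=out, flush=True)
    proc.wait()
    return proc.returncode, stamped


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to run.
    sys.exit(run_with_timestamps(sys.argv[1:])[0])
```

Saved under a hypothetical name such as timed_run.py, it would be invoked as `python timed_run.py IMSindel/bin/imsindel --bam APS14.downsampled.bam --chr chr7 ...` with the same options as in the command above. A large gap between two consecutive timestamps points to the slow step.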

holrock commented 2 years ago

I'm sorry for the late reply. Is there any output after 4.making consensus