WangLab-SCSIO / Prophage_Tracer

Prophage Tracer: precisely tracing prophages in prokaryotic genomes using overlapping split-read alignment
GNU General Public License v3.0

Division by zero error? #2

Open leannmlindsey opened 2 years ago

leannmlindsey commented 2 years ago

I tried running Prophage_Tracer for the first time today and got this error: awk: cmd. line:1: fatal: division by zero attempted

I followed your directions to index with bwa, map with bwa, and then process with samtools. Everything worked up to this point in the script:

${prophage_dir}/prophage_tracer.sh -m ${output_dir}/test_strain.rmdup.sam -r ${ref_genome} -p test_strain

Do you have any test data that I can try to see if it is a problem with the input data?

Thank you LeAnn

WangLab-SCSIO commented 2 years ago

Hi LeAnn,

Thank you for using Prophage_Tracer. I think this error may be caused by the command on line 81 of the prophage_tracer.sh script.

insert_size=`head -n 10000 $sam_file | awk '($2 ~ /163|83|99|147/ )' | cut -f9 | awk '{print sqrt($0^2)}' | awk '$0<10000' | awk '{ sum += $0;i++ } END { print int(sum/i) }'`

The purpose of this command is to estimate insert_size, the size of the DNA fragments in your sequencing library. It is usually ~500 bp for paired-end libraries when resequencing bacterial genomes. If none of the first 10000 lines of the SAM file has one of the flags 163, 83, 99 or 147, the counter i stays at zero and awk fails on sum/i.

Therefore, a few situations may cause this error, and there are a few ways to proceed.

  1. Your sequencing library is a single-end library.
  2. The reads were mapped incorrectly to your reference genome. In this case, the cause may be a wrong reference genome.
  3. If you are sure that your sequencing library is paired-end and the mapping is correct, you can manually change this command to insert_size=500.
  4. You can also show me the first few lines of your SAM file. Run the command head test_strain.rmdup.sam in your terminal and show me the output.
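For illustration, here is a minimal sketch of a guarded version of this estimate. It is not part of prophage_tracer.sh: the function name estimate_insert_size and the fallback argument are hypothetical, and the flags are compared exactly rather than with the regular expression used in the script. When no read with flag 83, 99, 147 or 163 is found, it prints the fallback instead of dividing by zero.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in prophage_tracer.sh): mean insert size from the
# first 10000 lines of a SAM file, with a fallback when no paired reads exist.
estimate_insert_size() {
    local sam_file=$1
    local fallback=${2:-500}   # assumed default, taken from the ~500 bp mentioned above
    head -n 10000 "$sam_file" |
        awk -v fallback="$fallback" '
            # Keep only properly paired reads (SAM flags 83, 99, 147, 163).
            $2 == 83 || $2 == 99 || $2 == 147 || $2 == 163 {
                tlen = ($9 < 0) ? -$9 : $9          # absolute template length
                if (tlen < 10000) { sum += tlen; n++ }
            }
            # Guard the division: n is 0 for single-end or unmapped data.
            END { if (n > 0) print int(sum / n); else print fallback }'
}
```

It could be called as insert_size=$(estimate_insert_size "$sam_file"). A silent fallback can hide a real problem with the input, so printing a warning in that branch may be preferable.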
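To tell the first situation apart from the others, one option is to count how many alignment records carry the paired flag (bit 0x1 of the SAM FLAG field, so the flag value is odd). This is only a sketch; count_paired_reads is a hypothetical name and not part of Prophage_Tracer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<paired> <total>" for the alignment records
# among the first 10000 lines of a SAM file.
count_paired_reads() {
    local sam_file=$1
    head -n 10000 "$sam_file" |
        awk '
            # Header lines start with "@"; skip them.
            $1 !~ /^@/ {
                total++
                # Bit 0x1 of the FLAG marks a read from a paired library,
                # which makes the FLAG value odd.
                if (int($2) % 2 == 1) paired++
            }
            END { printf "%d %d\n", paired, total }'
}
```

For example, count_paired_reads test_strain.rmdup.sam printing 0 as the first number would suggest that the reads were mapped as single-end.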

Kaihao Tang

leannmlindsey commented 2 years ago

Thank you for your quick response. I was using a single-end library, so I will try again with a paired-end library and let you know if I have any further trouble. Thank you.

WangLab-SCSIO commented 2 years ago

I will update my script to support single-end libraries. This is on my schedule. But if your analysis is very urgent, you can re-run it on your single-end library data while keeping the temporary files. To keep the temporary files, first delete the command on line 810: rm contiglength.file $prefix.sr.temp.1 $prefix.reads.fasta makeblastdb.log blastn.log $prefix.sr.temp.2 $prefix.sr.temp.3 $prefix.sr.temp.out $prefix.drp.temp.1 $prefix.sr-drp.temp.out $prefix.drp.temp.2 $prefix.drp.temp.left $prefix.drp.temp.out $prefix.temp.out $prefix.nuclDB.*. The temporary files ending with "sr.temp.2", "sr.temp.3" and "sr.temp.out" may contain candidate prophage information. These files contain only coordinate information. If you cannot interpret them, you can send these three files to me and I can manually check them and give you information on candidate prophages.
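As a sketch of that step, the cleanup command can be commented out in a copy of the script instead of deleted by hand. This assumes the line starts with rm contiglength.file as quoted above; the line number may differ between versions, so the sketch matches on the text. The function name make_keep_temp_copy and the output file name are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a copy of the script in which the cleanup
# command is commented out, so the temporary files are kept after a run.
make_keep_temp_copy() {
    local src=$1
    local dst=$2
    # Assumption: the cleanup line begins with "rm contiglength.file",
    # optionally preceded by whitespace. Other lines are left unchanged.
    sed 's/^\([[:space:]]*\)rm contiglength\.file/\1# rm contiglength.file/' "$src" > "$dst"
    chmod +x "$dst"
}
```

For example, make_keep_temp_copy prophage_tracer.sh prophage_tracer.keep_temp.sh leaves the original script untouched; the copy is then run with the same arguments.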

seharmaaz commented 9 months ago

I tried running your Prophage_Tracer for the first time today and I got this error: awk: cmd. line:1: fatal: division by zero attempted. Could you please help me fix this issue?

WangLab-SCSIO commented 9 months ago

Hi seharmaaz, as I responded above, I think this error may be caused by the command on line 81 of the prophage_tracer.sh script. You can modify the script as suggested or provide more information about how you generated your SAM file.

saif-asghar commented 8 months ago

Hello WangLab-SCSIO, could you possibly provide a test directory in the repository containing sample files (e.g. sample.fasta, 1.fastq.gz, 2.fastq.gz) so that users can do a test run and better understand how the output is produced?