WangLab-SCSIO / Prophage_Tracer

Prophage Tracer: precisely tracing prophages in prokaryotic genomes using overlapping split-read alignment
GNU General Public License v3.0

Division by zero error? #2

Open leannmlindsey opened 2 years ago

leannmlindsey commented 2 years ago

I tried running your Prophage_Tracer for the first time today and got this error: awk: cmd. line:1: fatal: division by zero attempted

I followed your directions to index with bwa, then map with bwa then use samtools and everything worked up to this point in the script:

${prophage_dir}/prophage_tracer.sh -m ${output_dir}/test_strain.rmdup.sam -r ${ref_genome} -p test_strain

Do you have any test data that I can try to see if it is a problem with the input data?

Thank you LeAnn

WangLab-SCSIO commented 2 years ago

Hi LeAnn,

Thank you for using Prophage_Tracer. I think this error may be caused by the command on line 81 of the prophage_tracer.sh script.

insert_size=`head -n 10000 $sam_file | awk '($2 ~ /163|83|99|147/ )' | cut -f9 | awk '{print sqrt($0^2)}' | awk '$0<10000'| awk '{ sum += $0;i++ } END { print int(sum/i) }'`

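The purpose of this command is to estimate the insert size, that is, the size of the DNA fragments in your sequencing library. It is usually ~500 bp for paired-end libraries when resequencing bacterial genomes. If no read in the first 10,000 lines of the SAM file carries one of these paired-end flags, the counter i stays at 0 and sum/i divides by zero.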
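A guarded variant of this estimate could look like the minimal sketch below. It is not the code in prophage_tracer.sh: the function name estimate_insert_size, the anchored flag match, and the fallback of 500 (taken from the typical value mentioned in this thread) are assumptions for illustration.

```shell
# Hypothetical helper, not part of prophage_tracer.sh.
# Prints the mean absolute TLEN of properly paired reads found in the
# first 10000 lines of a SAM file, or a fallback value if there are none.
estimate_insert_size() {
    local sam_file=$1
    local fallback=${2:-500}   # assumed default, adjust to your library
    head -n 10000 "$sam_file" |
        awk -F'\t' -v fallback="$fallback" '
            # Same flag set as the original command, but matched exactly.
            $2 ~ /^(83|99|147|163)$/ {
                size = ($9 < 0) ? -$9 : $9
                if (size < 10000) { sum += size; n++ }
            }
            END {
                # Guard: only divide when at least one read was counted.
                if (n > 0) print int(sum / n)
                else       print fallback
            }'
}
```

It would be called as insert_size=$(estimate_insert_size "$sam_file"). Note that silently falling back can hide a real input problem such as a single-end library, so printing a warning in that branch may be preferable.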

Therefore, several situations may cause this error:

  1. Your sequencing library is a single-end library.
  2. The reads were mapped incorrectly to your reference genome, for example because the wrong reference genome was used.
  3. If you are sure that your sequencing library is paired-end and the mapping is correct, you can manually change this command to insert_size=500.
  4. You can also show me the first few lines of your SAM file: run the command head test_strain.rmdup.sam in your terminal and post the output.
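To tell the first two cases apart before editing the script, the SAM flags can be summarized directly. The sketch below is only an illustration: sam_pairing_summary is a hypothetical helper, and it relies on the SAM convention that flag bit 0x1 marks a paired read and bit 0x2 a read mapped in a proper pair.

```shell
# Hypothetical helper: summarize pairing flags in a SAM file.
sam_pairing_summary() {
    awk -F'\t' '
        /^@/ { next }                              # skip header lines
        {
            total++
            if ($2 % 2 == 1)          paired++     # bit 0x1: read is paired
            if (int($2 / 2) % 2 == 1) proper++     # bit 0x2: proper pair
        }
        END {
            printf "total=%d paired=%d proper_pair=%d\n", total, paired, proper
        }
    ' "$1"
}
```

Running sam_pairing_summary test_strain.rmdup.sam and seeing paired=0 would point to a single-end library, while many paired reads but few proper pairs would point to a mapping or reference problem.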

Kaihao Tang

leannmlindsey commented 2 years ago

Thank you for your quick response. I was using a single-end library, so I will try again with a paired-end library and let you know if I have any further trouble. Thank you.

WangLab-SCSIO commented 2 years ago

I will update my script to support single-end libraries; this is on my schedule. If your analysis is urgent, you can re-run it on your single-end data while keeping the temporary files. To keep them, first delete the command on line 810:

rm contiglength.file $prefix.sr.temp.1 $prefix.reads.fasta makeblastdb.log blastn.log $prefix.sr.temp.2 $prefix.sr.temp.3 $prefix.sr.temp.out $prefix.drp.temp.1 $prefix.sr-drp.temp.out $prefix.drp.temp.2 $prefix.drp.temp.left $prefix.drp.temp.out $prefix.temp.out $prefix.nuclDB.*

The temporary files ending in "sr.temp.2", "sr.temp.3" and "sr.temp.out" may contain candidate prophage information. These files contain only coordinate information. If you cannot interpret them, you can send me these three files and I will check them manually and tell you about the candidate prophages.
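Instead of deleting the cleanup command, it could also be made conditional. The sketch below is an assumption about how that might look, not an existing option of prophage_tracer.sh: KEEP_TEMP and cleanup_temp are hypothetical names, and only the three files of interest are listed for brevity.

```shell
# Hypothetical wrapper around the cleanup step; KEEP_TEMP and cleanup_temp
# are illustrative names, not options of prophage_tracer.sh.
cleanup_temp() {
    local prefix=$1
    if [ "${KEEP_TEMP:-0}" = "1" ]; then
        # Leave the files in place so they can be inspected or shared.
        echo "Keeping temporary files for prefix $prefix" >&2
        return 0
    fi
    # Only three of the temporary files are shown here.
    rm -f "$prefix.sr.temp.2" "$prefix.sr.temp.3" "$prefix.sr.temp.out"
}
```

With this in place, setting KEEP_TEMP=1 before the run would preserve the candidate files, and the default behaviour would stay unchanged.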

seharmaaz commented 11 months ago

I tried running your Prophage_Tracer for the first time today and got this error: awk: cmd. line:1: fatal: division by zero attempted. Could you please help me fix this issue?

WangLab-SCSIO commented 11 months ago

Hi seharmaaz, as I responded above, I think this error may be caused by the command on line 81 of the prophage_tracer.sh script. You can modify the script as suggested, or provide more information about how you generated your SAM file.

saif-asghar commented 10 months ago

Hello WangLab-SCSIO, could you provide a test directory in the repository containing sample files (e.g. sample.fasta, 1.fastq.gz, 2.fastq.gz) so that users can do a test run and better understand how the output is produced?

WangLab-SCSIO commented 3 weeks ago

I have updated our code and provided test data (see the README file). Please download the new shell scripts and the test data, and try running the scripts on the test data. Regarding the problem you encountered, it is probable that no split reads or discordant read pairs derived from prophages were found in your data.

WangLab-SCSIO commented 3 weeks ago

Note that some users reported that blast+ 2.16.0 on Ubuntu may raise an error when running makeblastdb, which makes the shell script exit. However, the database that is built is still fine for the downstream commands. Therefore, I have updated the shell script to ignore this error message and continue with the following commands. I also suggest installing blast+ 2.6.0, as mentioned in "System and software requirements".
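For readers on an older copy of the script, one generic way to tolerate a non-zero exit status from a single command is sketched below. This is not necessarily how the updated script does it, and run_tolerant is a hypothetical helper name.

```shell
# Hypothetical helper: run a command, warn on failure, and keep going.
run_tolerant() {
    # "$@" runs the given command; on a non-zero exit status the warning
    # is printed to stderr and the function itself still returns 0.
    "$@" || echo "warning: '$*' exited with status $?; continuing" >&2
}
```

A call such as run_tolerant makeblastdb -in "$ref_genome" -dbtype nucl -out "$prefix.nuclDB" would then not stop a script running under set -e. The makeblastdb arguments shown here are an assumption and may differ from those used in the script. Because the failure is ignored, it is worth checking that the database files were actually written before relying on them.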