vpc-ccg / pamir

Discovery and Genotyping of Novel Sequence Insertions in Many Sequenced Individuals
BSD 3-Clause "New" or "Revised" License

Invalid Samtools Sort Command #6

Closed charliecurnin closed 4 years ago

charliecurnin commented 6 years ago

Running pamir.py, I'm getting

=============================================
Project Name      : pamir_test_13
Working Directory : /scratch/reconstructIns/pamir_test_13
=============================================
Checking the project pre-requisites...  FAILED
Pamir can not overwrite an existing project. Please add --resume or change project name.
[ccurnin@sh-ln03 login! /scratch/reconstructIns]$ cat slurm-16219603.out 
=============================================
Project Name      : pamir_test_14
Working Directory : /scratch/reconstructIns/pamir_test_14
=============================================
Creating a new project folder... OK
Checking binary pre-requisites...  ... OK
Sorting bam file... FAILED
/home/users/ccurnin/miniconda3/bin/samtools sort -n -@ 1 -m 10G /scratch/reconstructIns/pamir_test_14/indels.svs.bam.sort.bam /scratch/reconstructIns/pamir_test_14/indels.svs.bam.sort.bam.sorted failed with exit status 1 and message

I'm not sure this is a valid samtools command. When I try to run it myself, I get [bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files.

This is samtools 1.7

samtools 1.7
Using htslib 1.7
Copyright (C) 2018 Genome Research Ltd.
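
For anyone hitting the same error: the failing command passes the output as a second positional argument, while the error message from samtools 1.7 asks for -T PREFIX and -o FILE. The following is a minimal sketch of how the two call styles could be assembled, not pamir's actual code. The function name `build_name_sort_cmd`, the `.tmp` temporary prefix, and the `legacy` switch are assumptions made for illustration.

```python
def build_name_sort_cmd(samtools, in_bam, out_bam, threads=1, mem="10G", legacy=False):
    """Return the argv list for a name sort (hypothetical helper, not part of pamir)."""
    common = [samtools, "sort", "-n", "-@", str(threads), "-m", mem]
    if legacy:
        # Positional style, as in the failing command above: <in.bam> <out.prefix>.
        # Assumption: the old tool appends ".bam" to the prefix itself.
        prefix = out_bam[:-len(".bam")] if out_bam.endswith(".bam") else out_bam
        return common + [in_bam, prefix]
    # Style requested by the samtools 1.7 error message: -T PREFIX and -o FILE.
    return common + ["-T", out_bam + ".tmp", "-o", out_bam, in_bam]


if __name__ == "__main__":
    cmd = build_name_sort_cmd("samtools", "indels.svs.bam.sort.bam",
                              "indels.svs.bam.sort.bam.sorted.bam")
    print(" ".join(cmd))
```

Returning a list rather than a shell string means the result can be handed to `subprocess.run` without quoting problems.
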
yenyilin commented 6 years ago

I am not sure why your samtools is being detected as version 0.1.*. Could you show me the line starting with Usage in pamir_test_14/log/samtools.version?

Thank you!

charliecurnin commented 6 years ago

This is the entire samtools.version file:

[ccurnin@sh-ln02 login! /scratch/reconstructIns]$ cat pamir_test_14/log/samtools.version
CMD:/home/users/ccurnin/miniconda3/bin/samtools sort || true
?BC?sr?d``p??J/JK,Iz%?BC

yenyilin commented 6 years ago

That's weird. Normally the file stores the whole usage message printed by samtools, like

CMD:/home/users/ccurnin/miniconda3/bin/samtools sort || true Usage: samtools sort [options...] [in.bam]

(1) What happens if you run the command (/home/users/ccurnin/miniconda3/bin/samtools sort || true) interactively in your shell?

(2) An ugly workaround is to change line 1132 of pamir.py from return ver to return 1. You can try that to see whether it solves this samtools version issue.

Thanks for your feedback.
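
To make the check above concrete, here is a rough sketch of classifying the contents of log/samtools.version by looking for a Usage line. It is not the logic in pamir.py. The function name `classify_sort_usage` and the rule "no -T PREFIX / -o FILE in the usage text means old-style syntax" are assumptions. A log with no Usage line at all, like the garbled one posted above, is reported as unknown rather than guessed.

```python
def classify_sort_usage(log_bytes):
    """Classify captured `samtools sort` output (hypothetical helper).

    Returns "modern", "legacy", or "unknown".
    """
    # Decode defensively: the captured log may contain non-text bytes.
    text = log_bytes.decode("utf-8", errors="replace")
    if "Usage:" not in text:
        # Nothing recognisable was captured, so do not guess a version.
        return "unknown"
    if "-T PREFIX" in text or "-o FILE" in text:
        # These options appear in the samtools 1.7 usage text.
        return "modern"
    # Assumption: usage text without those options belongs to the old syntax.
    return "legacy"


if __name__ == "__main__":
    with open("pamir_test_14/log/samtools.version", "rb") as handle:
        print(classify_sort_usage(handle.read()))
```

Reading the file in binary mode avoids a decoding error on exactly the kind of content shown in this thread.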

charliecurnin commented 6 years ago
[ccurnin@sh-ln04 login! /scratch/reconstructIns]$ /home/users/ccurnin/miniconda3/bin/samtools sort || true
Usage: samtools sort [options...] [in.bam]
Options:
  -l INT     Set compression level, from 0 (uncompressed) to 9 (best)
  -m INT     Set maximum memory per thread; suffix K/M/G recognized [768M]
  -n         Sort by read name
  -t TAG     Sort by value of TAG. Uses position as secondary index (or read name if -n is set)
  -o FILE    Write final output to FILE rather than standard output
  -T PREFIX  Write temporary files to PREFIX.nnnn.bam
      --input-fmt-option OPT[=VAL]
               Specify a single input file format option in the form
               of OPTION or OPTION=VALUE
  -O, --output-fmt FORMAT[,OPT[=VAL]]...
               Specify output format (SAM, BAM, CRAM)
      --output-fmt-option OPT[=VAL]
               Specify a single output file format option in the form
               of OPTION or OPTION=VALUE
      --reference FILE
               Reference sequence FASTA FILE [null]
  -@, --threads INT
               Number of additional threads to use [0]

Will try modifying line 1132 and let you know how it goes. It's possible that my disk was full the last time I was running Pamir; then maybe when writing to control_file it was truncated.

yenyilin commented 6 years ago

I hope the samtools message was not logged correctly simply because of the disk space issue, although the file contents do not look like a typical system error. Let me know if the workaround does not solve your problem.

Thank you for the feedback!

charliecurnin commented 6 years ago

Doesn't look like it was a disk space issue. I now have several terabytes free and samtools.version is still

[ccurnin@sh-ln04 login! /scratch/reconstructIns/pamir_test/log]$ cat samtools.version 
CMD:/home/users/ccurnin/miniconda3/bin/samtools sort || true
?BC?sr?d``p??J/JK,Iz%?BC

Also, when I changed line 1132, I now get this error

=============================================
Project Name      : pamir_test
Working Directory : /scratch/reconstructIns/pamir_test
=============================================
Creating a new project folder... OK
Checking binary pre-requisites...  ... OK
Sorting bam file... OK
Extracting FASTQ from Alignment file... FAILED
/home/groups/XXXX/apps/bin/pamir-installs/5/pamir/pamir verify_sam /scratch/reconstructIns/pamir_test/indels.svs.bam.sort.bam.sorted.bam /scratch/reconstructIns/pamir_test/pamir_test.fastq failed with exit status 1 and message N/A

Neither that bam file nor that fastq seem to exist. The pamir_test directory only has

lrwxrwxrwx 1 ccurnin XXXX         66 May  7 14:15 GRCh37.fa -> /home/groups/XXXX/apps/bcbio/genomes/Hsapiens/GRCh37/seq/GRCh37.fa
lrwxrwxrwx 1 ccurnin XXXX         86 May  7 14:15 indels.svs.bam.sort.bam -> /scratch/simulate/final-widenSVWindows/indels.svs.bam.sort.bam
-rw-r--r-- 1 ccurnin XXXX 3315729704 May  7 14:27 indels.svs.bam.sort.bam.sorted.bam
drwxr-sr-x 2 ccurnin XXXX       4096 May  7 14:15 jobs
drwxr-sr-x 2 ccurnin XXXX       4096 May  7 14:27 log
-rw-r--r-- 1 ccurnin XXXX          0 May  7 14:15 mask.txt
drwxr-sr-x 2 ccurnin XXXX       4096 May  7 14:15 pbs
-rw-r--r-- 1 ccurnin XXXX        536 May  7 14:15 project.config
drwxr-sr-x 2 ccurnin XXXX       4096 May  7 14:27 stage

I'm not sure why I'm unable to run it? I'm submitting to Slurm with 15GB like this

sbatch -p owners --mem=15G --wrap="python $pamir -p $testDir -r $fasta --files alignment=${bam}"

yenyilin commented 6 years ago

If it is not too much trouble, could you share the first 100 lines or so of the input file so that I can check what happened? You can extract them with samtools view -h indels.svs.bam.sort.bam.sorted.bam | head -n 100 > test.sam.

Thank you for all your feedback!

charliecurnin commented 6 years ago

Sure. Do you have an email?

yenyilin commented 6 years ago

My gmail account is identical to my github account, yenyilin.

charliecurnin commented 6 years ago

Sent!

charliecurnin commented 6 years ago

Hi, could you let me know why I'm having these problems with the BAM file I sent you? Thanks for your help.

yenyilin commented 4 years ago

Caused by undefined behaviour when determining file size using fseek() and ftell(), as described in Subclause 7.21.9.2 of the C standard. Fixed in 2.0.
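
As background, that subclause notes that a binary stream need not meaningfully support fseek() with SEEK_END, so seeking to the end and calling ftell() is not a portable way to measure a file. One common alternative is to ask the operating system for the size from the file's metadata. The sketch below shows that idea in Python only for consistency with pamir.py. It is not the change that went into 2.0, and the helper name `file_size_bytes` is an assumption.

```python
import os


def file_size_bytes(path):
    """Return the size of a file in bytes from its metadata (hypothetical helper)."""
    # os.stat queries the file system directly; no stream is opened or seeked.
    return os.stat(path).st_size


if __name__ == "__main__":
    print(file_size_bytes(__file__))
```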