Open afinit opened 8 years ago
Please double-check that the quiver command you are calling comes from the SMRT-Analysis v3.0 directory. The error message looks like it comes from an earlier version of Quiver. (cmp.h5 is obsolete; all Quiver consensus is now done with bam files.)
I ran it again and called it directly from one of the SMRT-Analysis directories as:
smrtsuite/install/smrtsuite-fromsrc_3.0.2.170012/bundles/smrttools/install/smrttools-fromsrc_3.0.2.170012/smrtcmds/bin/variantCaller
This still caused the same error.
This points to:
smrtsuite/install/smrtsuite-fromsrc_3.0.2.170012/bundles/smrttools/install/smrttools-fromsrc_3.0.2.170012/private/pacbio/pythonpkgs/GenomicConsensus/binwrap/variantCaller
I also opened up the SMRT-Analysis binwrap/python and checked the version of GenomicConsensus provided with it. This gave v1.1.0
Shouldn't the GenomicConsensus v2.0.0 variantCaller be correct as well though?
I don't install SMRT Analysis and GenomicConsensus at the same time. I suspect there is a conflict. The way SMRT Analysis isolates its environment is subtle. When I run FALCON-unzip, my execution path typically does not include the SMRT Analysis or GenomicConsensus paths at all.
What does the quiver line inside 4-quiver/*F/cns_*F.sh look like?
This is what mine looks like:
/mnt/secondary/builds/full/3.0.0/prod/current-build_smrtanalysis/smrtcmds/bin/variantCaller
Also, please post the results of echo $PATH
This was pulled from 4-quiver/*F/cns_*F.sh, where $PB_RUN is the root of the PacBio run:
($HOME/venv/falcon/bin/smrtcmds/variantCaller -x 5 -X 120 -q 20 -j 24 \
-r $PB_RUN/4-quiver/000001F_003/000001F_003_ref.fa aln-000001F_003.bam \
-o $PB_RUN/4-quiver/000001F_003/cns-000001F_003.fasta.gz \
-o $PB_RUN/4-quiver/000001F_003/cns-000001F_003.fastq.gz) \
|| echo quiver failed
I shortened it a bit and put it on multiple lines for readability. $HOME/venv/falcon/bin/smrtcmds/ contains links to the SMRT-Analysis smrtcmds/bin
Try variantCaller --version or quiver --version to get the version, and please avoid digging into binwrap as much as possible.
SA3 should recognize .bam files natively. I am currently not sure how this installation is set up.
If possible, you might want to share the aln-000001F_003.bam and 000001F_003_ref.fa files with us, and I can probably help rule out some factors.
As described above, from the command line I'm running version 2.0.0 of variantCaller and from the smrtsuite cmds I'm running version 1.1.0:
$ variantCaller --version
2.0.0
$ quiver --version
2.0.0
Here are the files that are being used as input for variantCaller. These were both produced by other commands in fc_quiver.py:
000001F_003_ref.fa.zip aln-000001F_003.bam.zip
Here is my $PATH. I can try running variantCaller without the smrtcmds in my $PATH, but I don't see how that would fix things.
$ echo $PATH
$HOME/venv/falcon/bin:$HOME/bin:/usr/local/bin:/opt/local/bin:/opt/local/sbin:/usr/bin:/usr/local/sbin:/usr/sbin:/shares/bioinfo/bin:/shares/bioinfo/installs/trinity:/shares/condor/bin:/usr/lib64/mpich/bin:$HOME/venv/falcon/bin/smrtcmds
I will reinstall FALCON-unzip tomorrow in a new virtualenv and see if that fixes things. Perhaps I've added some things to the $PATH that I'm not aware of.
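One way to see whether more than one variantCaller is reachable is to list every match on $PATH in lookup order, together with the file each one resolves to after following symlinks. The following is a minimal Python sketch; the function name find_all_on_path is made up for illustration and is not part of FALCON or SMRT Analysis.

```python
import os


def find_all_on_path(name, path=None):
    """Return (path_as_found, resolved_real_path) for every executable
    called `name` on PATH, in the order the shell would try them."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    seen = set()
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        # Only regular files (or symlinks to them) that are executable count.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in seen:
                seen.add(candidate)
                hits.append((candidate, os.path.realpath(candidate)))
    return hits


if __name__ == "__main__":
    for found, real in find_all_on_path("variantCaller"):
        print(found, "->", real)
```

The first entry printed is the one a bare `variantCaller` call would run; any later entries are candidates for a version mix-up.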
In my case, variantCaller is not in the PATH, so I use the full path to call it. The wrapper should take care of its own environment. I would suggest you simplify the $PATH variable to isolate the environment and see if you can find any conflict. Also, check the generated aln-000001F_003.bam with samtools to see if it is good.
The bam file from @afinit doesn't seem to be compatible with SA3's variantCaller. I got the same error with it, while @pb-jchin's examples work.
~/ghtest$ ls -alG
total 1720
drwxr-xr-x 2 mhsieh 4096 Feb 3 23:06 .
drwxr-xr-x 95 mhsieh 20480 Feb 3 23:01 ..
-rw-r--r-- 1 mhsieh 15018 Feb 3 18:55 000001F_003_ref.fa
-rw-r--r-- 1 mhsieh 33 Feb 3 23:06 000001F_003_ref.fa.fai
-rw-r--r-- 1 mhsieh 1707473 Feb 3 18:55 aln-000001F_003.bam
-rw-r--r-- 1 mhsieh 1435 Feb 3 23:05 aln-000001F_003.bam.pbi
~/ghtest$ variantCaller -x 5 -X 120 -q 20 -j 2 -r 000001F_003_ref.fa -o test.fasta.gz -o test.fastq.gz aln-000001F_003.bam
Failure: The Quiver algorithm requires a cmp.h5 file containing standard (non-CCS) reads.
A possible conclusion here is that @afinit's SA3 tarball should be okay. Now let's check how these bam files were generated.
I am rerunning from a fresh install. In the meantime, would it be possible for me to have access to a test bam and fa? I could try to compare the bam header and entries to see if there are any glaring differences
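Such a comparison can start with the headers, which `samtools view -H` prints as plain text. The following is a rough sketch, assuming both headers have already been captured as strings. The helper names parse_sam_header and header_tag_differences are hypothetical, and the code only reports tags that a known-good header carries and the failing one lacks.

```python
def parse_sam_header(text):
    """Map each header record type (HD, SQ, RG, PG, ...) to a list of
    {tag: value} dicts. Free-text @CO comment lines are skipped."""
    records = {}
    for line in text.splitlines():
        if not line.startswith("@") or line.startswith("@CO"):
            continue
        fields = line.split("\t")
        record_type = fields[0][1:]
        tags = {}
        for field in fields[1:]:
            tag, _, value = field.partition(":")
            tags[tag] = value
        records.setdefault(record_type, []).append(tags)
    return records


def header_tag_differences(good_text, bad_text):
    """Return {record_type: sorted tags found in the good header
    but missing from the bad one}."""
    good = parse_sam_header(good_text)
    bad = parse_sam_header(bad_text)
    missing = {}
    for record_type, entries in good.items():
        good_tags = set().union(*(entry.keys() for entry in entries))
        bad_tags = set().union(*(entry.keys() for entry in bad.get(record_type, [])))
        diff = good_tags - bad_tags
        if diff:
            missing[record_type] = sorted(diff)
    return missing
```

A missing tag on the @RG lines would be the first thing to look at, since read-group metadata is where a tool would learn what kind of reads a file holds.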
Hi @afinit, yes, that is a good idea. I do plan to build some test data and test runs, but that will have to wait until the AGBT meeting next week is over.
I received the same error again. I spent quite a bit of time digging through the code to see where the file is validated. I couldn't seem to find the end of the trail so I finally just commented out the readType check that is raising the error, (this line in quiver.py). I reran the code and the variantCaller line from the shell script runs fine.
I did just notice that I didn't have .bam files for all of my raw reads files in input_bam.fofn. I noticed this because a couple of the shell scripts returned the error "Input CmpH5 file must be nonempty.", referring to the aln*bam file used as input to variantCaller. This could explain the error, but I don't know why it would have run without the readType check. With the readType check in place, all of the shell scripts returned errors.
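A missing or empty entry in input_bam.fofn can be caught before any of the shell scripts are generated. The following is a small sketch under two assumptions that are not confirmed by the thread: the fofn lists one path per line, and relative entries are resolved against the directory that holds the fofn. The name check_fofn is made up for illustration.

```python
import os


def check_fofn(fofn_path):
    """Return (missing, empty): entries of a file-of-file-names that do
    not exist on disk, and entries that exist but have zero bytes."""
    missing, empty = [], []
    # Assumption: relative entries are relative to the fofn's own directory.
    base = os.path.dirname(os.path.abspath(fofn_path))
    with open(fofn_path) as handle:
        for line in handle:
            entry = line.strip()
            if not entry:
                continue
            path = entry if os.path.isabs(entry) else os.path.join(base, entry)
            if not os.path.exists(path):
                missing.append(entry)
            elif os.path.getsize(path) == 0:
                empty.append(entry)
    return missing, empty


if __name__ == "__main__":
    if os.path.exists("input_bam.fofn"):
        missing, empty = check_fofn("input_bam.fofn")
        print("missing:", missing)
        print("empty:", empty)
```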
Hi @afinit, I am still wondering how that is triggered. While I did see it before, I cannot reproduce it here. Anyway, @hayanlee also encountered something similar. I did an end-to-end check for quiver on my side last Friday with smrtanalysis 3.0.3, and I can go through without problems. Here is my working environment for your reference.
My PATH variable is simple: the standard UNIX PATH prepended with the FALCON executable path:
$ echo $PATH
/home/UNIXHOME/jchin/build/falcon_latest_build/FALCON-integrate/fc_env/bin:/mnt/software/p/parallel/bin:/home/UNIXHOME/jchin/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
I converted the *.bax.h5 files to *.bam by calling bax2bam with its full path, like this:
/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/bax2bam /pbi/collections/242/2420309/0001/Analysis_Results/m150715_213954_42175_c100867392550000001823195203031660_s1_p0.1.bax.h5 -o m150715_213954_42175_c100867392550000001823195203031660_s1_p0.1 &
Here is one example of my cns_*.sh inside the 4-quiver directory:
$ cat 4-quiver/000028F/cns_000028F.sh
export PATH=/home/UNIXHOME/jchin/build/falcon_latest_build/FALCON-integrate/fc_env/bin:/mnt/software/p/parallel/bin:/home/UNIXHOME/jchin/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:${PATH}
export PYTHONPATH=:${PYTHONPATH}
export LD_LIBRARY_PATH=:${LD_LIBRARY_PATH}
set -vex
trap 'touch /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_quiver_done.exit' EXIT
cd /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/./4-quiver/000028F
hostname
date
cd /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/./4-quiver/000028F
/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/samtools faidx /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_ref.fa
/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/samtools view -b -S /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/reads/000028F.sam > 000028F.bam
/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/pbalign --tmpDir=/localdisk/scratch/ --nproc=24 --minAccuracy=0.75 --minLength=50 --minAnchorSize=12 --maxDivergence=30 --concordant --algorithm=blasr --algorithmOptions=-useQuality --maxHits=1 --hitPolicy=random --seed=1 000028F.bam /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_ref.fa aln-000028F.bam
#/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/makePbi --referenceFasta /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_ref.fa aln-000028F.bam
(/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/variantCaller -x 5 -X 120 -q 20 -j 24 -r /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_ref.fa aln-000028F.bam -o /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/cns-000028F.fasta.gz -o /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/cns-000028F.fastq.gz) || echo quvier failed
date
touch /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_quiver_done
I could reproduce the same issue when using bam files converted from bax.h5 files generated by an RSII machine in the latest FALCON unzip version [Latest commit 7ebc99c on Dec 22, 2016]. This was using smrtlink_4.0.0.190159 and arrow as correction.
I commented out the following lines in the arrow script found in smrtlink_4.0.0.190159/install/smrtlink-fromsrc_4.0.0.190159+190159-190159-189856-189856-189856/bundles/smrttools/install/smrttools-fromsrc_4.0.0.190159/private/pacbio/pythonpkgs/GenomicConsensus/lib/python2.7/site-packages/GenomicConsensus/arrow/arrow.py
#if alnFile.readType != "standard":
#    raise U.IncompatibleDataException(
#        "The Arrow algorithm requires a BAM file containing standard (non-CCS) reads." )
Worked just fine afterwards.
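The guard quoted above rejects any alignment file whose readType is not "standard", so the message can appear even when no CCS reads are involved. Before editing the installed package, it may help to look at what the BAM header itself declares. The following sketch rests on assumptions that the thread does not confirm: that each @RG line carries a DS field with a READTYPE=... entry, as in the PacBio BAM convention, and that READTYPE=SUBREAD is what gets reported as "standard". Both function names are hypothetical, and this is not the library's own logic.

```python
def read_types_from_header(header_text):
    """Return one entry per @RG line: the READTYPE value found in its
    DS field, or None when the read group declares no READTYPE."""
    read_types = []
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        declared = None
        for field in line.split("\t")[1:]:
            if field.startswith("DS:"):
                # DS is assumed to hold semicolon-separated KEY=VALUE items.
                for item in field[3:].split(";"):
                    key, _, value = item.partition("=")
                    if key == "READTYPE":
                        declared = value
        read_types.append(declared)
    return read_types


def looks_like_standard_reads(header_text):
    """True only if there is at least one read group and every one of
    them declares READTYPE=SUBREAD (assumed to mean "standard")."""
    types = read_types_from_header(header_text)
    return bool(types) and all(t == "SUBREAD" for t in types)
```

Feeding it the output of `samtools view -H aln-000001F_003.bam` would show whether the read groups survived the SAM-to-BAM and alignment steps with their read type intact. If they did not, fixing the header upstream may be a safer route than removing the check.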
This issue seems to persist in the latest releases of smrtlink and falcon. I downloaded falcon from https://github.com/PacificBiosciences/FALCON_unzip/wiki/Binaries, and I tried both SMRTLINK v3.1.1 and v4.0.0.
This is the command to generate the bam file:
/home/support/zhan2142/smrtlink400/smrtcmds/bin/samtools faidx /panfs/roc/scratch/zhan2142/falcon_test/4-quiver/quiver_scatter/000197F/000197F_ref.fa
and this is the failure:
/home/support/zhan2142/smrtlink400/smrtcmds/bin/variantCaller --algorithm=arrow -x 5 -X 120 -q 20 -j 24 -r /panfs/roc/scratch/zhan2142/falcon_test/4-quiver/quiver_scatter/000197F/000197F_ref.fa aln-000197F.bam -o /panfs/roc/scratch/zhan2142/falcon_test/4-quiver/000197F/cns-000197F.fasta.gz -o /panfs/roc/scratch/zhan2142/falcon_test/4-quiver/000197F/cns-000197F.fastq.gz || echo quvier failed
Failure: The Arrow algorithm requires a BAM file containing standard (non-CCS) reads.
quvier failed
However, when I previously used a test version of smrtlink_3.1.1.182868.zip (courtesy of Laura Nolden), I didn't have the issue. (Actually, I had another, unrelated issue with quiver and bam, but fixed it manually.)
Anyway, I commented out the three lines as mentioned above, and the issue was resolved.
I'm currently having issues with the Quiver step of FALCON_unzip. Everything seems to run fine up to running fc_quiver.py. At this point it prepares the shell scripts to run each contig group and it looks like these scripts do what they are supposed to until it gets to Quiver. Then I get an error saying Quiver requires a cmp.h5 file. I've provided a shortened example of the command and error below. I also tried running this command by itself from the command line and I get the same error.
Software versions:
Since I have two versions of variantCaller, I tried them both, but they both result in the same error. I assume this means that I have a dependency that needs to be further updated, but I can't figure out which one.