Failure: The Quiver algorithm requires a cmp.h5 file #12

afinit commented 8 years ago

I'm currently having issues with the Quiver step of FALCON_unzip. Everything seems to run fine up to running At this point it prepares the shell scripts to run each contig group and it looks like these scripts do what they are supposed to until it gets to Quiver. Then I get an error saying Quiver requires a cmp.h5 file. I've provided a shortened example of the command and error below. I also tried running this command by itself from the command line and I get the same error.

$SMRT_CMDS/variantCaller -x 5 -X 120 -q 20 -j 24 \
    -r $PB/4-quiver/000001F_003/000001F_003_ref.fa aln-000001F_003.bam \
    -o $PB/4-quiver/000001F_003/cns-000001F_003.fasta.gz \
    -o $PB/4-quiver/000001F_003/cns-000001F_003.fastq.gz

Failure: The Quiver algorithm requires a cmp.h5 file containing standard (non-CCS) reads.

Software versions:

SMRT-Analysis v3.0   => variantCaller v1.1.0
GenomicConsensus v2.0.0   => variantCaller v2.0.0
pbcore v1.2.7
ConsensusCore v1.0.1

Since I have two versions of variantCaller, I tried them both, but they both result in the same error. I assume this means that I have a dependency that needs to be further updated, but I can't figure out which one.

pb-jchin commented 8 years ago

Please double-check that the quiver command is being called from the SMRT-Analysis v3.0 directory. The error message looks like it comes from an earlier version of Quiver. (cmp.h5 is obsolete; all Quiver consensus calling is now done with bam files.)

afinit commented 8 years ago

I ran it again and called it directly from one of the SMRT-Analysis directories as:


This still caused the same error.

This points to:


I also opened up the SMRT-Analysis binwrap/python and checked the version of GenomicConsensus provided with it. This gave v1.1.0

afinit commented 8 years ago

Shouldn't the GenomicConsensus v2.0.0 variantCaller be correct as well though?

pb-jchin commented 8 years ago

I don't install SMRT Analysis and GenomicConsensus at the same time. I suspect there is a conflict between them. The way SMRT Analysis isolates its environment is subtle. When I run FALCON-unzip, my execution path typically does not include the SMRT Analysis or GenomicConsensus path at all.

pb-jchin commented 8 years ago

what does your quiver line inside 4-quiver/*F/cns_* look like?

this is what mine looks like


Also, please post the results of echo $PATH

afinit commented 8 years ago

This was pulled from the 4-quiver/*F/cns_* where $PB_RUN is the root of the PacBio run.

($HOME/venv/falcon/bin/smrtcmds/variantCaller -x 5 -X 120 -q 20 -j 24 \
    -r $PB_RUN/4-quiver/000001F_003/000001F_003_ref.fa aln-000001F_003.bam \
    -o $PB_RUN/4-quiver/000001F_003/cns-000001F_003.fasta.gz \
    -o $PB_RUN/4-quiver/000001F_003/cns-000001F_003.fastq.gz) \
    || echo quiver failed

I shortened it a bit and put it on multiple lines for readability. $HOME/venv/falcon/bin/smrtcmds/ contains links to the SMRT-Analysis smrtcmds/bin

mhsieh commented 8 years ago

Try variantCaller --version or quiver --version to get the version; please avoid digging into binwrap as much as possible.

SA3 should be able to recognize .bam files natively; I am not yet clear on how this is set up on your side.

If possible, please share the aln-000001F_003.bam and 000001F_003_ref.fa files with us, and I can probably help rule out some factors.

afinit commented 8 years ago

As described above, from the command line I'm running version 2.0.0 of variantCaller and from the smrtsuite cmds I'm running version 1.1.0:

$ variantCaller --version
$ quiver --version

Here are the files that are being used as input for variantCaller. These were both produced by other commands in

Here is my $PATH. I can try running variantCaller without the smrtcmds in my $PATH, but I don't see how that would fix things.

$ echo $PATH

afinit commented 8 years ago

I will reinstall FALCON-unzip tomorrow in a new virtualenv and see if that fixes things. Perhaps I've added some things to the $PATH that I'm not aware of.

pb-jchin commented 8 years ago

In my case, variantCaller is not in the path, so I use the full path to call it. The wrapper should take care of its own environment. I would suggest you simplify the $PATH variable to isolate the environment and see if you can find any conflict. Also, check the generated aln-000001F_003.bam with samtools to see if it is good.
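One way to look for the kind of $PATH conflict suggested above is to list every variantCaller that the shell could pick up, in lookup order. The following is only a minimal sketch: the helper name find_all_on_path is hypothetical and not part of FALCON or SMRT Analysis, and it only inspects the filesystem, it does not run anything.

```python
import os


def find_all_on_path(name, path=None):
    """Return (candidate, resolved_target) for every executable `name` on `path`.

    Results are in lookup order, so the first entry is the one a bare
    `name` on the command line would run. `path` defaults to $PATH.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            # realpath shows where a symlinked wrapper actually lives
            hits.append((candidate, os.path.realpath(candidate)))
    return hits


if __name__ == "__main__":
    for found, target in find_all_on_path("variantCaller"):
        print(found, "->", target)
```

If more than one entry is printed, the first one shadows the rest; comparing the resolved targets shows whether a link such as the smrtcmds one points at the installation you expect.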

mhsieh commented 8 years ago

The bam file from @afinit doesn't seem to be compatible with SA3's variantCaller. I got the same error with it, while @pb-jchin's examples work.

~/ghtest$ ls -alG
total 1720
drwxr-xr-x  2 mhsieh    4096 Feb  3 23:06 .
drwxr-xr-x 95 mhsieh   20480 Feb  3 23:01 ..
-rw-r--r--  1 mhsieh   15018 Feb  3 18:55 000001F_003_ref.fa
-rw-r--r--  1 mhsieh      33 Feb  3 23:06 000001F_003_ref.fa.fai
-rw-r--r--  1 mhsieh 1707473 Feb  3 18:55 aln-000001F_003.bam
-rw-r--r--  1 mhsieh    1435 Feb  3 23:05 aln-000001F_003.bam.pbi
~/ghtest$ variantCaller -x 5 -X 120 -q 20 -j 2 -r 000001F_003_ref.fa -o test.fasta.gz -o test.fastq.gz aln-000001F_003.bam 
Failure: The Quiver algorithm requires a cmp.h5 file containing standard (non-CCS) reads.

mhsieh commented 8 years ago

A possible conclusion here is that @afinit's SA3 tarball is okay. Now let's check how these bam files were generated.

afinit commented 8 years ago

I am rerunning from a fresh install. In the meantime, would it be possible for me to have access to a test bam and fa? I could try to compare the bam header and entries to see if there are any glaring differences
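For comparing headers, one thing that may be worth looking at is the read type recorded in each read group. The sketch below assumes the PacBio BAM convention of storing READTYPE=... inside the DS field of each @RG header line; whether that field is what variantCaller's check actually reads is an assumption here, not something confirmed in this thread. The function name is hypothetical, and it works on header text such as the output of samtools view -H.

```python
def read_types_from_header(header_text):
    """Collect the READTYPE values found in the DS field of each @RG line.

    `header_text` is the plain-text SAM header, e.g. the output of
    `samtools view -H aln-000001F_003.bam`.
    """
    read_types = set()
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        for field in line.split("\t")[1:]:
            if not field.startswith("DS:"):
                continue
            # DS is a semicolon-separated list of KEY=VALUE items
            for item in field[3:].split(";"):
                key, _, value = item.partition("=")
                if key == "READTYPE":
                    read_types.add(value)
    return read_types
```

An empty result (no @RG lines, or no READTYPE entry) or a mix of values on the failing file, compared with a single value on a working file, would be the kind of glaring difference to look for.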

pb-jchin commented 8 years ago

Hi @afinit, yes, that is a good idea. I do plan to build some test data and test runs, but that will have to wait until the AGBT meeting next week is over.

afinit commented 8 years ago

I received the same error again. I spent quite a bit of time digging through the code to see where the file is validated. I couldn't seem to find the end of the trail so I finally just commented out the readType check that is raising the error, (this line in I reran the code and the variantCaller line from the shell script runs fine.

I did just see that I didn't have .bam files for all of my raw reads files in input_bam.fofn. I noticed this, because a couple of the shell scripts returned this error: Input CmpH5 file must be nonempty. referring to the aln*bam file used as input to variantCaller. This could explain the error, but I don't know why it would have run without the readType check. When run with the readType check, all of the shell scripts returned errors.
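A quick pre-flight check on input_bam.fofn can catch the missing .bam files described above before the quiver scripts run. This is a sketch only: it assumes the fofn lists one path per line, and check_fofn is a hypothetical helper, not part of FALCON-unzip. Relative paths are resolved against the current working directory.

```python
import os


def check_fofn(fofn_path):
    """Return (missing, empty) lists of paths named in a file-of-file-names."""
    missing, empty = [], []
    with open(fofn_path) as handle:
        for line in handle:
            path = line.strip()
            if not path:
                continue  # ignore blank lines
            if not os.path.isfile(path):
                missing.append(path)
            elif os.path.getsize(path) == 0:
                empty.append(path)
    return missing, empty


if __name__ == "__main__" and os.path.exists("input_bam.fofn"):
    missing, empty = check_fofn("input_bam.fofn")
    print("missing:", missing)
    print("empty:", empty)
```

This only confirms that the listed files exist and are non-empty; it does not check that every raw reads file has a corresponding .bam entry in the fofn.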

pb-jchin commented 8 years ago

Hi @afinit, I am still wondering how that is triggered. While I did see it before, I can't reproduce it here. Anyway, @hayanlee also encountered something similar. I did an end-to-end check for quiver on my side last Friday with smrtanalysis 3.0.3, and I could go through it without a problem. Here is my working environment for your reference.

My PATH variable is simple: the standard UNIX PATH prepended with the FALCON executable path:

$ echo $PATH

I converted the *.bax.h5 files to *.bam by calling the bax2bam with full path like this:

/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/bax2bam /pbi/collections/242/2420309/0001/Analysis_Results/m150715_213954_42175_c100867392550000001823195203031660_s1_p0.1.bax.h5 -o m150715_213954_42175_c100867392550000001823195203031660_s1_p0.1 &

Here is one example of my cns_*.sh inside the 4-quiver directory:

$ cat 4-quiver/000028F/

export PATH=/home/UNIXHOME/jchin/build/falcon_latest_build/FALCON-integrate/fc_env/bin:/mnt/software/p/parallel/bin:/home/UNIXHOME/jchin/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:${PATH}
set -vex
trap 'touch /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_quiver_done.exit' EXIT
cd /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/./4-quiver/000028F
cd /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/./4-quiver/000028F
/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/samtools faidx /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_ref.fa
/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/samtools view -b -S /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/reads/000028F.sam > 000028F.bam
/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/pbalign --tmpDir=/localdisk/scratch/ --nproc=24 --minAccuracy=0.75 --minLength=50            --minAnchorSize=12 --maxDivergence=30 --concordant --algorithm=blasr            --algorithmOptions=-useQuality --maxHits=1 --hitPolicy=random --seed=1            000028F.bam /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_ref.fa aln-000028F.bam
#/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/makePbi --referenceFasta /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_ref.fa aln-000028F.bam
(/mnt/secondary/builds/full/3.0.3/prod/smrtanalysis_3.0.3.172135/smrtcmds/bin/variantCaller -x 5 -X 120 -q 20 -j 24 -r /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_ref.fa aln-000028F.bam            -o /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/cns-000028F.fasta.gz -o /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/cns-000028F.fastq.gz) || echo quvier failed
touch /lustre/hpcprod/jchin/JGI_fungal_diploid_0.4+/4-quiver/000028F/000028F_quiver_done

BenjaminSchwessinger commented 7 years ago

I could reproduce the same issue when using bam files converted from bax.h5 files generated by an RSII machine in the latest FALCON unzip version [Latest commit 7ebc99c on Dec 22, 2016]. This was using smrtlink_4.0.0.190159 and arrow as correction.

I commented out the following lines in the arrow script found in smrtlink_4.0.0.190159/install/smrtlink-fromsrc_4.0.0.190159+190159-190159-189856-189856-189856/bundles/smrttools/install/smrttools-fromsrc_4.0.0.190159/private/pacbio/pythonpkgs/GenomicConsensus/lib/python2.7/site-packages/GenomicConsensus/arrow/

#if alnFile.readType != "standard":
#    raise U.IncompatibleDataException(
#        "The Arrow algorithm requires a BAM file containing standard (non-CCS) reads.")

Worked just fine afterwards.

yingzhang121 commented 7 years ago

This issue seems to persist in the latest release of smrtlink and falcon. I downloaded falcon from, and I tried both SMRTLINK v 3.1.1 and v 4.0.0.

This is the command to generate the bam file: /home/support/zhan2142/smrtlink400/smrtcmds/bin/samtools faidx /panfs/roc/scratch/zhan2142/falcon_test/4-quiver/quiver_scatter/000197F/000197F_ref.fa

and this is the failure:

/home/support/zhan2142/smrtlink400/smrtcmds/bin/variantCaller --algorithm=arrow -x 5 -X 120 -q 20 -j 24 -r /panfs/roc/scratch/zhan2142/falcon_test/4-quiver/quiver_scatter/000197F/000197F_ref.fa aln-000197F.bam -o /panfs/roc/scratch/zhan2142/falcon_test/4-quiver/000197F/cns-000197F.fasta.gz -o /panfs/roc/scratch/zhan2142/falcon_test/4-quiver/000197F/cns-000197F.fastq.gz || echo quvier failed
Failure: The Arrow algorithm requires a BAM file containing standard (non-CCS) reads.
quvier failed

However, previously when I used a test version of (by courtesy of Laura Nolden), I didn't have the issue. (Actually, I had another non-related issue with quiver and bam, but manually fixed it.)

Anyway, I commented out the three lines as people mentioned above, and the issue was resolved.