Griffan / VerifyBamID

VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.
http://griffan.github.io/VerifyBamID/

verifybamid2 doesn't recognize command line parameters correctly #18

Closed eastreen closed 4 years ago

eastreen commented 4 years ago

Hello!

I tried to launch verifybamid2 as described in the tutorial:

$(VERIFY_BAM_ID_HOME)/bin/VerifyBamID \
  --SVDPrefix $(VERIFY_BAM_ID_HOME)/resource/1000g.100k.b38.vcf.gz.dat \
  --Reference [/path/to//GRCh38_full_analysis_set_plus_decoy_hla.fa] \
  --BamFile [/path/to/bam/or/cram/file] 

But I'm always getting this error:

verifybamid2 \
> --SVDPrefix /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat \
> --Reference /bucket/user/refs/GRCh38.genome.fa \
> --BamFile /bucket/user/bams/sample1/sample1.bqsr.bam
VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.

 Copyright (c) 2009-2018 by Hyun Min Kang and Fan Zhang
 This project is licensed under the terms of the MIT license.

The following parameters are available.  Ones with "[]" are in effect:

Available Options
                    Input/Output Files : --BamFile [/bucket/user/bams/sample1/sample1.bqsr.bam],
                                         --PileupFile [Empty],
                                         --Reference [Empty],
                                         --SVDPrefix [/nfs/home/user/.conda/envs/gatk/share/verifybamid2-1.0.5-0/resource/--SVDPrefix./bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.--Reference.vcf.gz.dat],
                                         --Output [result]
               Model Selection Options : --WithinAncestry,
                                         --DisableSanityCheck, --NumPC [2],
                                         --FixPC [Empty],
                                         --FixAlpha [-1.0e+00],
                                         --KnownAF [Empty], --NumThread [4],
                                         --Seed [12345], --Epsilon [1.0e-08],
                                         --OutputPileup, --Verbose
   Construction of SVD Auxiliary Files : --RefVCF [Empty]
                    Deprecated Options : --UDPath [Empty], --MeanPath [Empty],
                                         --BedPath [Empty]

Cannot correspond command line parameter /bucket/user/refs/GRCh38.genome.fa (#3) to any of the options

FATAL ERROR - 
Problems encountered parsing command line:

Cannot correspond command line parameter /bucket/user/refs/GRCh38.genome.fa (#3) to any of the options

Exiting due to ERROR:
        Exception was thrown

I have .UD, .bed and .mu files with the prefix /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat. However, for some reason, verifybamid2 doesn't recognize the path I give to the --SVDPrefix command line parameter; instead it searches for the files in the /nfs/home/user/.conda/envs/gatk/share/verifybamid2-1.0.5-0/resource/ directory and fails. The program also ignores the path I give to the --Reference parameter.
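Before suspecting the argument parser, it can help to rule out a missing auxiliary file. The following is a minimal sketch, assuming only what this report states: that the prefix is expected to have .UD, .mu and .bed companions. The function name check_svd_prefix is made up for illustration and is not part of VerifyBamID2.

```shell
# Report which of the three auxiliary files exist for a given --SVDPrefix value.
# The .UD, .mu and .bed suffixes are the ones named in this thread;
# check_svd_prefix itself is a hypothetical helper.
check_svd_prefix() {
    prefix=$1
    missing=0
    for ext in UD mu bed; do
        if [ -r "${prefix}.${ext}" ]; then
            echo "found:   ${prefix}.${ext}"
        else
            echo "missing: ${prefix}.${ext}"
            missing=1
        fi
    done
    # Exit status 0 only when all three files are readable.
    return "$missing"
}
```

Calling check_svd_prefix /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat would list each of the three files as found or missing.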

Am I doing something wrong here?

Griffan commented 4 years ago

Are you running this command line directly in the shell, or inside another pipeline that parses the command line further? Also, you can run "make test" in the build directory to check the installation status.

eastreen commented 4 years ago

I'm running the command line directly in the shell. Also, I don't have a build directory, since I installed verifybamid2 in a conda environment: conda install -c bioconda verifybamid2
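With a conda install, the name found on PATH is not necessarily the compiled program; it may be a launcher script that rewrites arguments before calling the real binary. Whether that applies here is an assumption, so the sketch below only reports what the name resolves to. describe_launcher is a hypothetical helper that uses standard shell tools only.

```shell
# Tell whether a command on PATH is a shebang script or something else.
# describe_launcher is a hypothetical helper, not part of VerifyBamID2 or conda.
describe_launcher() {
    path=$(command -v "$1") || { echo "$1: not found on PATH"; return 1; }
    if [ ! -f "$path" ]; then
        echo "$1 resolves to '$path', which is not a regular file"
        return 1
    fi
    # A script starts with the two bytes "#!"; a compiled binary does not.
    if [ "$(head -c 2 "$path")" = "#!" ]; then
        echo "$path: shebang script (a wrapper may be rewriting arguments)"
    else
        echo "$path: no shebang (likely a compiled binary)"
    fi
}
```

If describe_launcher verifybamid2 reports a shebang script, reading that script shows how the arguments are forwarded.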

Griffan commented 4 years ago

From the log, it seems you provided the --SVDPrefix argument twice: once as "--SVDPrefix [/nfs/home/user/.conda/envs/gatk/share/verifybamid2-1.0.5-0/resource/" and once as "--SVDPrefix /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat".

Could you please remove those "\" and ">" characters and run the command on a single line? Also, let's check the output of echo ${VERIFY_BAM_ID_HOME}

eastreen commented 4 years ago

I submitted --SVDPrefix only once. Here's the one line command I've used just now, with no \'s:

verifybamid2 --SVDPrefix /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat --Reference /bucket/user/refs/GRCh38.genome.fa --BamFile /bucket/user/bams/sample1/sample1.bqsr.bam

I get the same error:

--SVDPrefix [/nfs/home/user/.conda/envs/gatk/share/verifybamid2-1.0.5-0/resource/--SVDPrefix./bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.--Reference.vcf.gz.dat]

For some reason the program doesn't recognize --SVDPrefix and --Reference as arguments, and automatically searches for the resource in the path /nfs/home/user/.conda/envs/gatk/share/verifybamid2-1.0.5-0/resource/--SVDPrefix./bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.--Reference.vcf.gz.dat, which is a nonsensical path.

I don't have ${VERIFY_BAM_ID_HOME} in the bash environment. I just copy-pasted this example from the tutorial to illustrate the point. Sorry for the confusion :)
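The nonsensical path reported above has a regular shape: the resource directory, then the first three command-line arguments joined with dots, then a .vcf.gz.dat suffix. The sketch below reproduces that shape. It is an inference from the log output in this thread, not the actual bioconda launcher code, and guess_wrapper_prefix is a hypothetical name.

```shell
# Hypothetical reconstruction (inferred from the log, NOT the real launcher):
# join the first three arguments with dots under the resource directory.
guess_wrapper_prefix() {
    resource_dir=$1
    shift
    printf '%s\n' "${resource_dir}/$1.$2.$3.vcf.gz.dat"
}

# The first three arguments of the failing command from this report.
guess_wrapper_prefix \
    /nfs/home/user/.conda/envs/gatk/share/verifybamid2-1.0.5-0/resource \
    --SVDPrefix \
    /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat \
    --Reference
```

If the first three arguments were consumed this way, the fourth one (/bucket/user/refs/GRCh38.genome.fa) would reach the program without its --Reference flag, which would be consistent with the "Cannot correspond command line parameter" error in the first post.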

eastreen commented 4 years ago

Also, I've noticed that if I call verifybamid2 with no parameters at all, it automatically sets --SVDPrefix to /nfs/home/user/.conda/envs/verifybamid2/share/verifybamid2-1.0.6-1/resource/1000g.100k.b38.vcf.gz.dat

(verifybamid2) user@server:~$ verifybamid2
VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.

 Version:1.0.6
 Copyright (c) 2009-2018 by Hyun Min Kang and Fan Zhang
 This project is licensed under the terms of the MIT license.

The following parameters are available.  Ones with "[]" are in effect:

Available Options
                    Input/Output Files : --BamFile [Empty],
                                         --PileupFile [Empty],
                                         --Reference [Empty],
                                         --SVDPrefix [/nfs/home/user/.conda/envs/verifybamid2/share/verifybamid2-1.0.6-1/resource/1000g.100k.b38.vcf.gz.dat],
                                         --Output [result]
               Model Selection Options : --WithinAncestry,
                                         --DisableSanityCheck, --NumPC [2],
                                         --FixPC [Empty],
                                         --FixAlpha [-1.0e+00],
                                         --KnownAF [Empty], --NumThread [4],
                                         --Seed [12345], --Epsilon [1.0e-08],
                                         --OutputPileup, --Verbose
   Construction of SVD Auxiliary Files : --RefVCF [Empty]
                    Deprecated Options : --UDPath [Empty], --MeanPath [Empty],
                                         --BedPath [Empty]

FATAL ERROR - 
--Reference is required

Exiting due to ERROR:
        Exception was thrown

And I can't override this default with my own file.

Griffan commented 4 years ago

Plus, your first post shows version 1.0.5 and now it's 1.0.6. Did you do anything between these two runs?

Griffan commented 4 years ago

Could you show me how you installed it? In the meantime, would you mind compiling your own version directly from the GitHub repo and giving it a try?

eastreen commented 4 years ago

> Plus, your first post shows version 1.0.5 and now it's 1.0.6. Did you do anything between these two runs?

I created a new conda environment and ran conda install -c bioconda verifybamid2 again. I have a bunch of other programs in the older conda environment, so I wanted to see whether there was a conflict with dependencies. It turns out it doesn't work even in the fresh env.

> Could you show me how you installed it? In the meantime, would you mind compiling your own version directly from the GitHub repo and giving it a try?

conda install -c bioconda verifybamid2

eastreen commented 4 years ago

I don't know why two different versions (1.0.5 and 1.0.6) were installed with the same conda install command. It is weird :D
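The resource paths printed in the logs carry version information that helps when comparing environments. The sketch below assumes the directory name follows a name-version-build pattern, which is an assumption about how the conda package lays out its share directory. version_and_build is a hypothetical helper.

```shell
# Extract "version build" from a path such as
# .../share/verifybamid2-1.0.6-1/resource/...
# Assumes a <name>-<version>-<build> directory name; prints nothing if no match.
version_and_build() {
    printf '%s\n' "$1" |
        sed -n 's#.*/verifybamid2-\([0-9][0-9.]*\)-\([0-9][0-9]*\)/.*#\1 \2#p'
}

version_and_build /nfs/home/user/.conda/envs/gatk/share/verifybamid2-1.0.5-0/resource/x
version_and_build /nfs/home/user/.conda/envs/verifybamid2/share/verifybamid2-1.0.6-1/resource/x
```

Running the same check in each environment makes it easy to see when two installs from the same command picked up different versions or builds.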

eastreen commented 4 years ago

I've cloned the git repo and tried to compile it:

git clone https://github.com/Griffan/VerifyBamID
cd VerifyBamID/
mkdir build
cd build
cmake ..

But I get an error:

user@server:~/VerifyBamID/build$ cmake ..
-- The C compiler identification is GNU 7.4.0
-- The CXX compiler identification is GNU 7.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at CMakeLists.txt:14 (message):
  libhts HTS_INCLUDE_DIRS not found

-- Configuring incomplete, errors occurred!
See also "/nfs/home/user/VerifyBamID/build/CMakeFiles/CMakeOutput.log".

It must have something to do with the libhts dependency(?) :thinking:

I'm also attaching CMakeOutput.log

Griffan commented 4 years ago

Yes, you need to install htslib first and specify its location as described at https://github.com/Griffan/VerifyBamID#installation
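As a rough sketch of what "specify its location" can look like: the helper below checks an htslib install prefix and prints a cmake command for it. HTS_INCLUDE_DIRS is the variable named in the CMake error above; HTS_LIBRARIES, the PREFIX/include and PREFIX/lib/libhts.a layout, and the helper name hts_cmake_command are assumptions that should be checked against the installation section linked above.

```shell
# Print a cmake command line for a given htslib install prefix.
# Assumed layout (hypothetical): headers in PREFIX/include/htslib,
# static library in PREFIX/lib/libhts.a. HTS_LIBRARIES is an assumed
# variable name; only HTS_INCLUDE_DIRS appears in the error message.
hts_cmake_command() {
    prefix=$1
    if [ ! -f "${prefix}/include/htslib/sam.h" ]; then
        echo "no htslib headers under ${prefix}/include" >&2
        return 1
    fi
    if [ ! -f "${prefix}/lib/libhts.a" ]; then
        echo "no libhts.a under ${prefix}/lib" >&2
        return 1
    fi
    echo "cmake -DHTS_INCLUDE_DIRS=${prefix}/include -DHTS_LIBRARIES=${prefix}/lib/libhts.a .."
}
```

The printed command is meant to be run from the build directory in place of the plain cmake .. call that failed.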

btw, I have figured out that the original problem is in bioconda. I will let you know once I fix it.

Griffan commented 4 years ago

@eastreen the updated recipe in bioconda has been merged, so the problem you encountered should be fixed now. Please let me know if it still occurs.

eastreen commented 4 years ago

Hi!

Many thanks! I reinstalled the package in the conda env and ran this command successfully:

verifybamid2 \
    --Verbose \
    --NumPC 4 \
    --Output /bucket/user/contamination/VBI2/sample1 \
    --BamFile /bucket/user/bams/sample1/sample1.bqsr.bam \
    --Reference /bucket/user/refs/GRCh38.genome.fa \
    --SVDPrefix /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat \
    --DisableSanityCheck

However, if I pass the .UD, .mu and .bed files separately instead of using --SVDPrefix, verifybamid2 fails:

verifybamid2 \
    --Verbose \
    --NumPC 4 \
    --Output /bucket/user/contamination/VBI2/sample1 \
    --BamFile /bucket/user/bams/sample1/sample1.bqsr.bam \
    --Reference /bucket/user/refs/GRCh38.genome.fa \
    --UDPath /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.UD \
    --MeanPath /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.mu \
    --BedPath /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.bed \
    --DisableSanityCheck

VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using ancestry-agnostic method.

 Version:1.0.6
 Copyright (c) 2009-2018 by Hyun Min Kang and Fan Zhang
 This project is licensed under the terms of the MIT license.

The following parameters are available.  Ones with "[]" are in effect:

Available Options
                    Input/Output Files : --BamFile [/bucket/user/bams/sample1/sample1.bqsr.bam],
                                         --PileupFile [Empty],
                                         --Reference [/bucket/user/refs/GRCh38.genome.fa],
                                         --SVDPrefix [/nfs/home/user/.conda/envs/verifybamid2/share/verifybamid2-1.0.6-2/resource/--Verbose.--NumPC.4.vcf.gz.dat],
                                         --Output [/bucket/user/contamination/VBI2/sample1]
               Model Selection Options : --WithinAncestry,
                                         --DisableSanityCheck [ON],
                                         --NumPC [2], --FixPC [Empty],
                                         --FixAlpha [-1.0e+00],
                                         --KnownAF [Empty], --NumThread [4],
                                         --Seed [12345], --Epsilon [1.0e-08],
                                         --OutputPileup, --Verbose
   Construction of SVD Auxiliary Files : --RefVCF [Empty]
                    Deprecated Options : --UDPath [/bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.UD],
                                         --MeanPath [/bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.mu],
                                         --BedPath [/bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.bed]

Initialize from FullLLKFunc(int dim, ContaminationEstimator* contPtr)
Open file:/nfs/home/user/.conda/envs/verifybamid2/share/verifybamid2-1.0.6-2/resource/--Verbose.--NumPC.4.vcf.gz.dat.bed    failed, exit!

It seems that the --UDPath, --MeanPath and --BedPath arguments cannot override --SVDPrefix. It's not really a problem, since the command with --SVDPrefix works just fine, but I thought you'd like to know about the issue anyway :)
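Because the three files in this report share one prefix, the working --SVDPrefix form can be derived from the .UD path by stripping its suffix. This is a minimal sketch; svd_prefix_from_ud is a hypothetical helper, not a VerifyBamID2 feature.

```shell
# Derive the --SVDPrefix value from a path ending in .UD.
# svd_prefix_from_ud is a hypothetical helper for illustration.
svd_prefix_from_ud() {
    case $1 in
        *.UD) printf '%s\n' "${1%.UD}" ;;
        *)    echo "expected a path ending in .UD: $1" >&2; return 1 ;;
    esac
}

svd_prefix_from_ud /bucket/user/refs/SureSelect_V7.subsetted_contamination.vcf.gz.dat.UD
```

The result can be passed as --SVDPrefix in place of the separate --UDPath, --MeanPath and --BedPath arguments, provided the .mu and .bed files sit next to the .UD file under the same prefix.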