Open fbattke opened 7 years ago
Sorry you are running into this issue. Try staging the WholeGenomeFasta directory with the following directory hierarchy:
/reference/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa
/reference/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fai
We have hard-coded assumptions about the structure of the reference genome directory, which really should be relaxed.
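For anyone who already has a flat reference folder, here is a minimal sketch of one way to stage that hierarchy with symbolic links instead of copying multi-gigabyte files. The function name `stage_genome`, its arguments, and the list of file names are assumptions made for illustration; they are not part of Canvas.

```shell
#!/usr/bin/env bash
# Sketch: expose an existing flat reference folder under a
# Species/Provider/Build/Sequence/WholeGenomeFasta hierarchy via symlinks.
# stage_genome, its arguments, and the default names are hypothetical.
stage_genome() {
    local src="$1"                      # flat folder that already holds genome.fa etc.
    local root="$2"                     # folder under which the hierarchy is created
    local species="${3:-Homo_sapiens}"
    local provider="${4:-UCSC}"
    local build="${5:-hg19}"
    local dest="$root/$species/$provider/$build/Sequence/WholeGenomeFasta"
    local f

    mkdir -p "$dest" || return 1
    for f in genome.fa genome.fa.fai GenomeSize.xml; do
        # Link only the files that are actually present in the source folder.
        if [ -e "$src/$f" ]; then
            ln -sfn "$(cd "$src" && pwd)/$f" "$dest/$f" || return 1
        fi
    done
    # Print the staged directory so the caller can capture it.
    printf '%s\n' "$dest"
}
```

For example, `stage_genome /reference /reference` would create `/reference/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta` and print that path. Absolute link targets are used so the links keep working regardless of the current directory.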
Thank you Eric,
that solved it. It would be helpful to mention this in the README.md file.
Florian
From: Eric Roller notifications@github.com
Sent: Monday, 20 November 2017 19:20
To: Illumina/canvas
Cc: Florian Battke; Author
Subject: Re: [Illumina/canvas] NullReferenceException when running Germline-WGS (#69)
Hi, I am having the same problem in 1.38.0.1554 under Fedora; however, this solution doesn't work for me.
2018-06-25T10:27:24+02:00,Running checkpoint 01: Validate input
2018-06-25T10:27:25+02:00,Running Canvas Germline-WGS 1.38.0.1554+master
2018-06-25T10:27:25+02:00,Command-line arguments: Germline-WGS --reference=/library/GENOMES/GRCh37_canvas/kmer.fa -g /library/GENOMES/GRCh37_canvas -f /library/GENOMES/GRCh37_canvas/filter13.bed --custom-parameters=CanvasBin,-m=TruncatedDynamicRange -b Tumor.dedup.recal.bam --sample-b-allele-vcf=Tumor_HaplotypeCallerPASS.vcf -n Tumor -o Tumor_CNV
2018-06-25T10:27:25+02:00,Checkpoint 01 Validate input complete. Elapsed time (hh/mm/ss): 00:00:00.2
2018-06-25T10:27:25+02:00,ERROR: Canvas workflow error: System.NullReferenceException: Object reference not set to an instance of an object.
   at Isas.SequencingFiles.ReferenceGenome.get_Species()
   at Isas.SequencingFiles.GenomeMetadata.Deserialize(TextReader reader, IDirectoryLocation genomeFastaFolder, IReferenceGenome referenceGenome)
   at Isas.SequencingFiles.GenomeMetadata.Deserialize(IFileLocation genomeSizeXml)
   at Canvas.GermlineWgsRunner.GetCallset()
   at Canvas.GermlineWgsRunner.Run(CanvasRunnerFactory runnerFactory)
   at Canvas.ModeLauncher.Launch()
This is the structure of my genome folder:
/library/GENOMES/GRCh37_canvas:
dbsnp.vcf filter13.bed genome.fa genome.fa.fai GenomeSize.xml kmer.fa kmer.fa.fai

/library/GENOMES/GRCh37_canvas/Homo_sapiens/NCBI/hg19/Sequence/WholeGenomeFasta:
genome.fa genome.fa.fai GenomeSize.xml

/library/GENOMES/GRCh37_canvas/Sequence/WholeGenomeFasta:
genome.fa genome.fa.fai GenomeSize.xml

/library/GENOMES/GRCh37_canvas/WholeGenomeFasta:
genome.fa genome.fa.fai GenomeSize.xml
I use symbolic links in the additional subdirectories. The parameters should be ok since I can run the same analysis in version 1.11.0 without any problems. Do you have any suggestions? I will be very grateful for your help.
Please use the full hierarchy of directories, including species/provider/build:
Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta
This is what I am using. My full path is /library/GENOMES/GRCh37_canvas/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta, which contains genome.fa, genome.fa.fai, and GenomeSize.xml, and I reference it as -g /library/GENOMES/GRCh37_canvas.
In the previous post I pasted the wrong listing, with NCBI in the path instead of UCSC.
Try -g /library/GENOMES/GRCh37_canvas/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta
By the way, GRCh37 has different chromosome names than UCSC hg19, so that path is a little confusing.
Thank you, that did it. So, to summarize: -g /library/GENOMES/GRCh37_canvas/ didn't work, whereas -g /library/GENOMES/GRCh37_canvas/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta did, despite the fact that both contain the same files (the longer path contains symbolic links to the files in the shorter one).
You are right, the path is very confusing. It would be great if you could address this issue in the next release.
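The difference between the two -g values can be caught before launching Canvas. Below is a minimal sketch of such a pre-flight check. It assumes, based only on this thread and not on Canvas documentation, that the -g folder must be `<Species>/<Provider>/<Build>/Sequence/WholeGenomeFasta` and must contain genome.fa, genome.fa.fai, and GenomeSize.xml. `check_genome_dir` is a hypothetical helper, not part of Canvas.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight check for the folder passed to -g.
# Assumption (from this thread only): the folder must be
# .../<Species>/<Provider>/<Build>/Sequence/WholeGenomeFasta.
# check_genome_dir is a hypothetical helper, not part of Canvas.
check_genome_dir() {
    local dir="${1%/}"   # drop one trailing slash, if any
    local f build_dir provider_dir species_dir

    if [ "$(basename "$dir")" != "WholeGenomeFasta" ] ||
       [ "$(basename "$(dirname "$dir")")" != "Sequence" ]; then
        echo "expected a path ending in Sequence/WholeGenomeFasta: $dir" >&2
        return 1
    fi

    # Build, provider, and species are the three levels above Sequence.
    build_dir=$(dirname "$(dirname "$dir")")
    provider_dir=$(dirname "$build_dir")
    species_dir=$(dirname "$provider_dir")
    if [ "$species_dir" = "/" ] || [ "$species_dir" = "." ]; then
        echo "missing Species/Provider/Build levels above Sequence: $dir" >&2
        return 1
    fi

    # Files this thread lists as present in a working WholeGenomeFasta folder.
    for f in genome.fa genome.fa.fai GenomeSize.xml; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing $f in $dir" >&2
            return 1
        fi
    done

    # Report what the path implies so a mismatch is easy to spot by eye.
    echo "species=$(basename "$species_dir") provider=$(basename "$provider_dir") build=$(basename "$build_dir")"
}
```

The check only verifies that three directory levels exist above Sequence; it cannot tell whether their names are meaningful, which is why it prints them for inspection.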
It then actually crashed at CanvasPartition with: "Unhandled Exception: Illumina.Common.OptionException: Missing required value for option '-p'", which appears to be the optional ploidy parameter that I did not specify.
Please try running Canvas on a fresh output directory. I think the ploidy option from your previous run is being cached.
Unfortunately, that's not it. I completely removed the output folder (Tumor_CNV), and I never used the ploidy parameter. My entire command is:
canvas Germline-WGS --reference=/library/GENOMES/GRCh37_canvas/kmer.fa -g /library/GENOMES/GRCh37_canvas/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta -f /library/GENOMES/GRCh37_canvas/filter13.bed --custom-parameters=CanvasBin,-m=TruncatedDynamicRange -b Tumor.dedup.recal.bam --sample-b-allele-vcf=Tumor_HaplotypeCallerPASS.vcf -n Tumor -o Tumor_CNV
Please try the SmallPedigree-WGS mode, even for a single sample. Germline-WGS has been deprecated and will be removed in future versions. You will also need to specify the ploidy argument on the command line. See https://github.com/Illumina/canvas/issues/89 for details.
I am running Canvas in a docker container using the hg19 reference files from the S3 link.
I am trying to call Canvas using variants called by strelka/starling on a shallow WGS (7 fold) dataset:
root@711ba034ea7e:/reference# dotnet /opt/Canvas/Canvas.dll Germline-WGS -r /reference/kmer.fa -g /reference/ -f /reference/filter13.bed -b /data/test_cov7.bam -n testsample --sample-b-allele-vcf=/data/test_cov7.variants.vcf.gz -o /data/canvas.result
However, I get a NullReferenceException relating to the reference:
2017-11-20T17:26:39,Running checkpoint 01: Validate input
2017-11-20T17:26:40,Saved checkpoint results to /localcanvas.result/Checkpoints/progress.json
2017-11-20T17:26:40,Running Canvas Germline-WGS 1.30.0.725+master
2017-11-20T17:26:40,ERROR: Canvas workflow error: System.NullReferenceException: Object reference not set to an instance of an object.
   at Isas.SequencingFiles.ReferenceGenome.get_Build()
   at Isas.SequencingFiles.GenomeMetadata.Deserialize(TextReader reader, IDirectoryLocation genomeFastaFolder, IReferenceGenome referenceGenome)
   at Isas.SequencingFiles.GenomeMetadata.Deserialize(IFileLocation genomeSizeXml)
   at Canvas.GermlineWgsRunner.GetCallset()
   at Canvas.GermlineWgsRunner.Run(ILogger logger, ICheckpointRunner checkpointRunner, IWorkManager workManager, IFileLocation runtimeExecutable)
   at Canvas.ModeLauncher.Launch()
2017-11-20T17:26:40,Command-line arguments: Germline-WGS -r /reference/kmer.fa -g /reference/ -f /reference/filter13.bed -b /data/test_cov7.bam -n testsample --sample-b-allele-vcf=/data/test_cov7.variants.vcf.gz -o /data/canvas.result
2017-11-20T17:26:40,Saved checkpoint results to /local/canvas.result/Checkpoints/01-Validateinput.json
2017-11-20T17:26:40,Elapsed time (step/time(sec)/name) 01 00:00:00.6 Validate input
2017-11-20T17:26:40,Total execution time: 00:00:00.6

The reference folder is as follows:
root@711ba034ea7e:/reference# ls -lh
total 5.9G
-rw-r--r-- 1 root root 4.6K Aug 28 16:03 GenomeSize.xml
-rw-r--r-- 1 root root 11K Aug 28 15:48 filter13.bed
-rw-r--r-- 1 root root 3.0G Aug 28 16:03 genome.fa
-rw-r--r-- 1 root root 783 Aug 28 16:03 genome.fa.fai
-rw-r--r-- 1 root root 3.0G Aug 28 15:48 kmer.fa
-rw-r--r-- 1 root root 783 Aug 28 16:03 kmer.fa.fai
Here are the Dockerfile contents, in case you want to provide a Docker image (I saw the request on another issue):
FROM ubuntu:16.04

# dependency: mono
RUN apt-get update
RUN apt-get -y install mono-runtime mono-complete wget curl apt-transport-https pigz

# dependency: dotnet
RUN sh -c 'echo "deb [arch=amd64] https://apt-mo.trafficmanager.net/repos/dotnet-release/ xenial main" > /etc/apt/sources.list.d/dotnetdev.list'
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 417A0893
RUN apt-get update
RUN apt-get install -y dotnet-dev-1.0.4

# download canvas
RUN cd /opt && wget https://github.com/Illumina/canvas/releases/download/1.30.0.725%2Bmaster/Canvas-1.30.0.725.master_x64.tar.gz
RUN cd /opt && tar xzvf Canvas-1.30.0.725.master_x64.tar.gz
RUN ln -s /opt/Canvas-1.30.0.725+master_x64 /opt/Canvas

# download hg19 ref files
RUN mkdir /reference
RUN cd /reference && wget http://canvas-cnv-public.s3.amazonaws.com/hg19/WholeGenomeFasta/GenomeSize.xml
RUN cd /reference && wget http://canvas-cnv-public.s3.amazonaws.com/hg19/filter13.bed
RUN cd /reference && wget http://canvas-cnv-public.s3.amazonaws.com/hg19/kmer.fa
RUN cd /reference && wget http://canvas-cnv-public.s3.amazonaws.com/hg19/kmer.fa.fai
RUN cd /reference && wget http://canvas-cnv-public.s3.amazonaws.com/hg19/WholeGenomeFasta/genome.fa
RUN cd /reference && wget http://canvas-cnv-public.s3.amazonaws.com/hg19/WholeGenomeFasta/genome.fa.fai
RUN apt-get clean

# ENV persists into the running container; a variable exported inside RUN is lost when that build step ends
ENV DOTNET_CLI_TELEMETRY_OPTOUT=1
ENV PATH="/opt/Canvas/:$PATH"
ENTRYPOINT ["dotnet","/opt/Canvas/Canvas.dll","Germline-WGS","-r","/reference/kmer.fa","-g","/reference/","-f","/reference/filter13.bed"]