hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0

Purple doesn't run without structural variant input #217

Closed SebastianHollizeck closed 3 years ago

SebastianHollizeck commented 3 years ago

Hi,

I was testing Purple on our data and wanted to try a few things while waiting for the GRIDSS results to complete.

When running Purple with just:

java -jar /dawson_genomics/Other/software/purple/purple_v3.1.jar -reference CA80 -tumor CA80-2 -amber ./amber -cobalt ./cobalt -ref_genome /data/reference/dawson_labs/genomes/GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna -gc_profile /dawson_genomics/Other/software/cobalt/GC_profile.1000bp.38.cnp -circos circos -output_dir .

I get the message:

11:11:48 - [INFO ] - PURPLE version: 3.1
11:11:48 - [INFO ] - Reference Sample: CA80, Tumor Sample: CA80-2
11:11:48 - [INFO ] - Output Directory: ./
11:11:48 - [INFO ] - Using ref genome: V38
11:11:49 - [INFO ] - Reading GC Profiles from /dawson_genomics/Other/software/cobalt/GC_profile.1000bp.38.cnp
11:11:51 - [INFO ] - Processing sample(ref=CA80 tumor=CA80-2)
11:11:51 - [ERROR] - failed processing sample(CA80-2): org.apache.commons.cli.ParseException: missing structural_vcf or sample data directory
org.apache.commons.cli.ParseException: missing structural_vcf or sample data directory
    at com.hartwig.hmftools.purple.config.SampleDataFiles.getFilename(SampleDataFiles.java:79)
    at com.hartwig.hmftools.purple.config.SampleDataFiles.<init>(SampleDataFiles.java:63)
    at com.hartwig.hmftools.purple.PurpleApplication.processSample(PurpleApplication.java:196)
    at com.hartwig.hmftools.purple.PurpleApplication.run(PurpleApplication.java:150)
    at com.hartwig.hmftools.purple.PurpleApplication.main(PurpleApplication.java:503)
11:11:51 - [INFO ] - Complete

Both of those inputs are flagged as optional on the help page.

Both 3.0 and 3.1 behave this way, but 2.54 seems to work without SVs.

Cheers, Sebastian

charlesshale commented 3 years ago

I've restored this functionality in Purple 3.2 beta:

https://github.com/hartwigmedical/hmftools/releases/tag/purple-v3.2_beta

teng-gao commented 2 years ago

Hi,

Thanks for this amazing pipeline!

I ran into the error below when running Purple-v3.2-beta. My command:

java -jar ~/hartwig/purple.jar \
        -tumor_only \
        -tumor $sample \
        -output_dir ~/external/WASHU/$sample/purple \
        -amber ~/external/WASHU/$sample/amber \
        -cobalt ~/external/WASHU/$sample/cobalt \
        -gc_profile ~/hartwig/GC_profile.1000bp.38.cnp \
        -ref_genome ~/ref/hg38.fa

The error message:

22:52:50 - [INFO ] - PURPLE version: 3.2
22:52:50 - [INFO ] - Tumor Sample: 47491_Primary
22:52:50 - [INFO ] - Output Directory: /home/tenggao/external/WASHU/47491_Primary/purple/
22:52:50 - [INFO ] - Using ref genome: V37
22:52:51 - [INFO ] - Reading GC Profiles from /home/tenggao/hartwig/GC_profile.1000bp.38.cnp
22:52:53 - [INFO ] - Processing sample(ref=DIPLOID tumor=47491_Primary)
22:52:53 - [INFO ] - Reading amber QC from /home/tenggao/external/WASHU/47491_Primary/amber/47491_Primary.amber.qc
22:52:53 - [INFO ] - Reading amber bafs from /home/tenggao/external/WASHU/47491_Primary/amber/47491_Primary.amber.baf.tsv
22:52:54 - [INFO ] - Reading amber pcfs from /home/tenggao/external/WASHU/47491_Primary/amber/47491_Primary.amber.baf.pcf
22:52:54 - [INFO ] - Average amber tumor depth is 53 reads implying an ambiguous BAF of 0.561
22:52:54 - [INFO ] - Reading cobalt ratios from /home/tenggao/external/WASHU/47491_Primary/cobalt/47491_Primary.cobalt.ratio.tsv
22:52:57 - [INFO ] - Reading cobalt reference segments from /home/tenggao/external/WASHU/47491_Primary/cobalt/DIPLOID.cobalt.ratio.pcf
22:52:57 - [INFO ] - Reading cobalt tumor segments from /home/tenggao/external/WASHU/47491_Primary/cobalt/47491_Primary.cobalt.ratio.pcf
22:52:57 - [INFO ] - Somatic variants support disabled.
22:52:57 - [INFO ] - Sample gender is male
22:52:57 - [INFO ] - Applying segmentation
22:52:57 - [INFO ] - Merging reference and tumor ratio break points
22:53:00 - [INFO ] - Fitting purity
22:53:22 - [INFO ] - Sample maxDiploidProportion(0.567) diploidCandidates(93) purityRange(0.800 - 0.890) hasTumor(true)
22:53:22 - [INFO ] - Calculating copy number
22:53:23 - [INFO ] - Generating QC Stats
22:53:23 - [INFO ] - Modelling somatic peaks
22:53:23 - [ERROR] - failed processing sample(47491_Primary): htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: java.io.IOException: Is a directory, for input source: file:///home/tenggao/
htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: java.io.IOException: Is a directory, for input source: file:///home/tenggao/
        at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
        at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
        at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
        at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:121)
        at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:81)
        at htsjdk.variant.vcf.VCFFileReader.<init>(VCFFileReader.java:145)
        at htsjdk.variant.vcf.VCFFileReader.<init>(VCFFileReader.java:95)
        at com.hartwig.hmftools.purple.somatic.SomaticPeakStream.somaticPeakModel(SomaticPeakStream.java:61)
        at com.hartwig.hmftools.purple.PurpleApplication.performFit(PurpleApplication.java:315)
        at com.hartwig.hmftools.purple.PurpleApplication.processSample(PurpleApplication.java:223)
        at com.hartwig.hmftools.purple.PurpleApplication.run(PurpleApplication.java:156)
        at com.hartwig.hmftools.purple.PurpleApplication.main(PurpleApplication.java:563)
Caused by: htsjdk.samtools.util.RuntimeIOException: java.io.IOException: Is a directory
        at htsjdk.tribble.readers.SynchronousLineReader.readLine(SynchronousLineReader.java:53)
        at htsjdk.tribble.readers.LineIteratorImpl.advance(LineIteratorImpl.java:24)
        at htsjdk.tribble.readers.LineIteratorImpl.advance(LineIteratorImpl.java:11)

Note that, besides this error, the genome version is also not recognized: I passed hg38, but the log reports V37.
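The trace above shows the VCF reader being handed a directory (file:///home/tenggao/) rather than a file. As a rough pre-flight check, a wrapper script could test any optional input path before passing it on. The helper name `is_input_file` is hypothetical and not part of PURPLE; this is only a sketch of the idea.

```shell
# Hypothetical helper, not part of PURPLE: succeeds only if the given path is
# non-empty and points to a readable regular file (so a directory fails).
is_input_file() {
    [ -n "$1" ] && [ -f "$1" ] && [ -r "$1" ]
}

# Example: a home directory is not a valid VCF input, so this reports a problem.
candidate="${HOME:-/}"
if is_input_file "$candidate"; then
    echo "ok: $candidate"
else
    echo "not a regular file: $candidate"
fi
```

A check like this only catches a bad path on the caller's side; it does not change how Purple itself resolves missing inputs.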

charlesshale commented 2 years ago

I've restored Purple's ability to run without a somatic VCF. The crash above is due to Purple assuming the file has been provided in config.

Also, you'll need to specify "-ref_genome_version V38" in the config. This is now in the example config in the README.

Beta JAR updated:

https://github.com/hartwigmedical/hmftools/releases/tag/purple-v3.2_beta
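
For reference, here is a sketch of the tumor-only command from earlier in this thread with "-ref_genome_version V38" appended. It assumes the other options are unchanged in the updated beta. The paths are the ones posted above, and `purple_cmd` is a hypothetical wrapper that only prints the command so it can be reviewed first.

```shell
# Hypothetical wrapper (not part of hmftools): prints the PURPLE command for a
# given sample instead of running it. Paths follow the command posted earlier
# in this thread; adjust them to your own layout.
purple_cmd() {
    sample=$1
    echo java -jar "$HOME/hartwig/purple.jar" \
        -tumor_only \
        -tumor "$sample" \
        -output_dir "$HOME/external/WASHU/$sample/purple" \
        -amber "$HOME/external/WASHU/$sample/amber" \
        -cobalt "$HOME/external/WASHU/$sample/cobalt" \
        -gc_profile "$HOME/hartwig/GC_profile.1000bp.38.cnp" \
        -ref_genome "$HOME/ref/hg38.fa" \
        -ref_genome_version V38
}

purple_cmd 47491_Primary
```

To actually run Purple, drop the `echo` so the function invokes `java` directly.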