hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0

Classes com.hartwig.hmftools.strelka.mnv.ImmutableMNVMerger and com.hartwig.hmftools.strelka.mnv.ImmutableMNVValidator are not in the source code #18

Closed LeaveYeah closed 5 years ago

LeaveYeah commented 5 years ago

Hi, I am trying to use strelka-post-process in my pipeline, but two classes imported in StrelkaPostProcessApplication.java, com.hartwig.hmftools.strelka.mnv.ImmutableMNVMerger and com.hartwig.hmftools.strelka.mnv.ImmutableMNVValidator, cannot be found in the source. Could you help resolve this problem? Thank you very much.

kduyvesteyn commented 5 years ago

These classes are auto-generated when you build with Maven, so they should be there once the build has run.

You could also grab our latest official release from https://resources.hartwigmedicalfoundation.nl (folder HMFTools-Jars).
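One way to confirm that a build (or a downloaded jar) actually contains the generated classes is to ask the JVM whether it can load them. This is a minimal sketch, assuming the strelka-post-process jar or build output is on the classpath when you run it; `GeneratedClassCheck` is a hypothetical helper and not part of hmftools.

```java
// Hypothetical helper, not part of hmftools: reports whether classes are loadable.
final class GeneratedClassCheck {
    private GeneratedClassCheck() {}

    /** Returns true if the named class can be found on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, GeneratedClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two class names reported as missing in this issue.
        String[] names = {
            "com.hartwig.hmftools.strelka.mnv.ImmutableMNVMerger",
            "com.hartwig.hmftools.strelka.mnv.ImmutableMNVValidator"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "missing"));
        }
    }
}
```

If both lines print "missing", the jar on the classpath was most likely not produced by a full Maven build.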

LeaveYeah commented 5 years ago

Many thanks. But it seems the file hmftools_pipeline_v4_3.zip exceeds the download limit of the website, so I am not able to download it. Would you mind raising the size limit?

kduyvesteyn commented 5 years ago

This must be a limitation on your end. It works fine for me, and we use the same system to transfer files of hundreds of GB.

LeaveYeah commented 5 years ago

Thank you very much! One more question: according to the methods section of the paper "Pan-cancer whole genome analyses of metastatic solid tumors", known pathogenic variants and PON data are used to preserve and filter variants. But the strelka-post-process jar does not require these files as input. Are these steps included in the package? Thank you very much.

kduyvesteyn commented 5 years ago

Strelka-post-process doesn't do all the post-processing. It only does filtering + MNV merging.
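For readers unfamiliar with the term, MNV merging means combining SNVs at adjacent positions into one multi-nucleotide variant. The sketch below only illustrates that idea and is not the hmftools implementation: `MnvMergeSketch` and `Variant` are hypothetical names, the input is assumed to be sorted by chromosome and position, and the read-level check that the variants sit on the same haplotype, which a real merger would typically need, is skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: merges SNVs at consecutive positions, with no read-level phasing check.
final class MnvMergeSketch {
    static final class Variant {
        final String chrom;
        final int pos;
        final String ref;
        final String alt;

        Variant(String chrom, int pos, String ref, String alt) {
            this.chrom = chrom;
            this.pos = pos;
            this.ref = ref;
            this.alt = alt;
        }

        @Override
        public String toString() {
            return chrom + ":" + pos + " " + ref + ">" + alt;
        }
    }

    /** Input must be sorted by chromosome, then position. */
    static List<Variant> mergeAdjacentSnvs(List<Variant> sorted) {
        List<Variant> out = new ArrayList<>();
        Variant current = null;
        for (Variant v : sorted) {
            boolean isSnv = v.ref.length() == 1 && v.alt.length() == 1;
            // Only extend substitutions (equal ref/alt length), never indels.
            boolean extendsCurrent = current != null
                    && isSnv
                    && current.ref.length() == current.alt.length()
                    && v.chrom.equals(current.chrom)
                    && v.pos == current.pos + current.ref.length();
            if (extendsCurrent) {
                current = new Variant(current.chrom, current.pos,
                        current.ref + v.ref, current.alt + v.alt);
            } else {
                if (current != null) out.add(current);
                current = v;
            }
        }
        if (current != null) out.add(current);
        return out;
    }

    public static void main(String[] args) {
        List<Variant> input = Arrays.asList(
                new Variant("chr1", 100, "A", "T"),
                new Variant("chr1", 101, "C", "G"),
                new Variant("chr1", 103, "G", "A"));
        // Prints "chr1:100 AC>TG" and then "chr1:103 G>A".
        for (Variant v : mergeAdjacentSnvs(input)) System.out.println(v);
    }
}
```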

To replicate the methods from the Pan-cancer whole genome analyses paper, you need to do the following after you have completed the Strelka run:

All the files that are referred to in these scripts can be downloaded from https://resources.hartwigmedicalfoundation.nl

LeaveYeah commented 5 years ago

That's very helpful! Also, I see the pan-cancer whole genome analysis paper uses strelka v1.0.14. Is it OK to apply the post-processing scripts to the output of strelka2?

kduyvesteyn commented 5 years ago

Most of the annotation steps (hotspots, mappability, PON) are fairly generic, so I assume they'd work on any VCF.

StrelkaPostProcess has some dependencies on the VCF itself; it expects the following fields:

If strelka2 has these fields, with the same meaning as in strelka1, then it may work.

We never tested it, though!
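A quick way to check the first half of that condition is to compare the INFO and FORMAT declarations in the strelka2 VCF header against the fields you need. The sketch below assumes an uncompressed VCF already read into lines. The required field names are placeholders (`FIELD_A`, `FIELD_B`, `FIELD_C`), because the actual list is not reproduced in this thread, and `VcfHeaderFieldCheck` is a hypothetical helper. It only tells you whether a field is declared, not whether it means the same thing as in strelka1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: lists required INFO/FORMAT fields missing from a VCF header.
final class VcfHeaderFieldCheck {
    // Matches header lines such as: ##FORMAT=<ID=XX,Number=1,Type=Integer,Description="...">
    private static final Pattern FIELD = Pattern.compile("^##(INFO|FORMAT)=<ID=([^,>]+)");

    /** Collects declared fields as "INFO/ID" or "FORMAT/ID" from the header lines. */
    static Set<String> declaredFields(List<String> lines) {
        Set<String> declared = new HashSet<>();
        for (String line : lines) {
            if (!line.startsWith("#")) break; // the header ends at the first record
            Matcher m = FIELD.matcher(line);
            if (m.find()) declared.add(m.group(1) + "/" + m.group(2));
        }
        return declared;
    }

    /** Returns the required fields that the header does not declare, in input order. */
    static List<String> missingFields(List<String> lines, List<String> required) {
        Set<String> declared = declaredFields(lines);
        List<String> missing = new ArrayList<>();
        for (String field : required) {
            if (!declared.contains(field)) missing.add(field);
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder header and field names, for illustration only.
        List<String> header = Arrays.asList(
                "##fileformat=VCFv4.1",
                "##INFO=<ID=FIELD_A,Number=1,Type=Integer,Description=\"placeholder\">",
                "##FORMAT=<ID=FIELD_B,Number=1,Type=Integer,Description=\"placeholder\">",
                "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tTUMOR");
        List<String> required = Arrays.asList("INFO/FIELD_A", "FORMAT/FIELD_B", "FORMAT/FIELD_C");
        // Prints: Missing fields: [FORMAT/FIELD_C]
        System.out.println("Missing fields: " + missingFields(header, required));
    }
}
```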

LeaveYeah commented 5 years ago

Great! Thank you very much!