Closed tarini92 closed 5 years ago
From what you're describing, it sounds like you're running things correctly! Normally this message is returned when the variants input to neoepiscope don't lead to any amino acid changes, or lead to amino acid changes that generate peptide kmers identical to other portions of the normal protein. Is it possible that this is the case with your test VCFs? If you're sure that the variants should be producing amino acid changes and you're able/willing to share your VCFs with me, I'd be happy to run some tests and see if I can diagnose the problem.
After re-reviewing this issue, I've realized that you were using incompatible genome builds between the bowtie index and GTF-derived pickled dictionary files. The VCFs were generated using hg19 - could you try using ~/neoepiscope.data/hg19 for the -x argument and ~/neoepiscope.data/gencode_v19/ for the -d argument for the first example command you listed above and let me know if that changes anything? For me, using those two worked to generate neoepitope predictions.
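For anyone hitting the same mismatch, here is a rough sketch of one way to see which build a VCF claims to come from before choosing the -x and -d paths. It only reads the standard `##reference=` and `##contig=` meta-information lines, which many VCFs omit. The helper name `build_hints` and the sample header (including the file path) are made up for illustration and are not part of neoepiscope.

```python
import re


def build_hints(vcf_header_lines):
    """Collect genome-build hints from VCF meta-information lines."""
    hints = []
    for line in vcf_header_lines:
        if not line.startswith("##"):
            break  # meta-information lines always come first in a VCF
        if line.startswith("##reference="):
            hints.append(line.strip().split("=", 1)[1])
        elif line.startswith("##contig="):
            match = re.search(r"assembly=([^,>]+)", line)
            if match:
                hints.append(match.group(1))
    return hints


# Illustrative header only; the reference path is hypothetical.
header = [
    "##fileformat=VCFv4.2",
    "##reference=file:///refs/hg19.fa",
    "##contig=<ID=chr11,length=135006516,assembly=hg19>",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]
print(build_hints(header))
```

If the hints say hg19, the bowtie index and the dictionary directory should both be the hg19-based ones, as suggested above.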
Hi, As you correctly pointed out, the problem was the incompatibility between the reference genome dictionaries and the index files. It runs properly now. I found 3 bugs earlier on when I was running the code, at line numbers 431, 459 and 466 in __init__.py.
Line 431: the variable intervals_dict is not defined yet. It's defined in the if clause, which is never reached if we're in the else clause. Locally, I changed it to intervals_path. Line 459: join([args.bowtie_index, ".", str(x), ".ebwt"]) raises a TypeError if x is an integer and is not converted to a string.
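To make the two reported failure modes concrete, here is a minimal, standalone sketch. It does not reproduce the actual neoepiscope source; the function names and the `range(1, 5)` loop are assumptions chosen only to trigger the same kinds of errors.

```python
def index_paths_unsafe(bowtie_index):
    # x is an int here, so str.join raises TypeError at runtime
    return ["".join([bowtie_index, ".", x, ".ebwt"]) for x in range(1, 5)]


def index_paths_fixed(bowtie_index):
    # Converting x with str() first lets join build the file names
    return ["".join([bowtie_index, ".", str(x), ".ebwt"]) for x in range(1, 5)]


def pick_intervals(intervals_path, build_dict):
    if build_dict:
        intervals_dict = {"source": intervals_path}
        return intervals_dict
    else:
        # intervals_dict is only assigned in the if branch, so reading it
        # here raises UnboundLocalError (a subclass of NameError)
        return intervals_dict


try:
    index_paths_unsafe("hg19")
except TypeError as err:
    print("join failed:", err)

print(index_paths_fixed("hg19"))

try:
    pick_intervals("intervals.pickle", build_dict=False)
except NameError as err:
    print("undefined variable:", err)
```

Both problems only show up at runtime on the affected code path, which would explain why they went unnoticed until the else branch was exercised.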
The results report the paired normal epitope as NA for all of the listed neoepitopes. Could it be because the reference genome is different than expected? I am using the hg19 reference genome with the test VCFs and haplotype output.
Neoepitope Chromsome Pos Ref Alt Mutation_type VAF Paired_normal_epitope Warnings Transcript_ID mhcflurry1_HLA-C03:03_affinity mhcflurry1_HLA-C03:03_rank
CGGSKGDCGSW chr11 71276863 GT * D NA NA NA ENST00000398531.1 34614.31465996793 62.87025000000001
CGSWGLQR chr11 71276863 GT * D NA NA NA ENST00000398531.1 34571.5601350282 59.95950000000001
CGSWGLQRG chr11 71276863 GT * D NA NA NA ENST00000398531.1 31282.89680636645 39.07300000000001
CGSWGLQRGL chr11 71276863 GT * D NA NA NA ENST00000398531.1 28758.67352940209 27.539125
CGSWGLQRGLW chr11 71276863 GT * D NA NA NA ENST00000398531.1 36642.92804815618 78.78237500000002
DCGSWGLQ chr11 71276863 GT * D NA NA NA ENST00000398531.1 37518.02716488922 85.37950000000002
Thanks for pointing out that error, I will put out a new minor release addressing this today! Regarding the paired normal epitopes being 'NA' for the epitopes you listed, we currently only support paired normal epitopes for neoepitopes derived from SNVs, so any neoepitope derived from an insertion or deletion will not have a paired normal epitope reported.
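A quick way to confirm that the NA values line up with indels is to tally rows with a missing paired normal epitope by Mutation_type. The sketch below uses the column names and two rows exactly as they appear in this thread, and splits on whitespace because that is how the rows are shown here; the real output file may be tab-separated, so treat the parsing as an assumption. The helper name `missing_normals_by_type` is made up for illustration.

```python
from collections import Counter

# Column names as they appear in the output header quoted in this thread
HEADER = (
    "Neoepitope Chromsome Pos Ref Alt Mutation_type VAF "
    "Paired_normal_epitope Warnings Transcript_ID"
).split()


def missing_normals_by_type(rows):
    """Count rows whose paired normal epitope is NA, per mutation type."""
    counts = Counter()
    for row in rows:
        # zip stops at the last named column; the trailing affinity and
        # rank values are ignored
        record = dict(zip(HEADER, row.split()))
        if record["Paired_normal_epitope"] == "NA":
            counts[record["Mutation_type"]] += 1
    return counts


rows = [
    "CGGSKGDCGSW chr11 71276863 GT * D NA NA NA ENST00000398531.1 34614.31465996793 62.87025000000001",
    "CGSWGLQR chr11 71276863 GT * D NA NA NA ENST00000398531.1 34571.5601350282 59.95950000000001",
]
print(missing_normals_by_type(rows))
```

If every NA row carries the D mutation type seen in the rows above, that is consistent with the explanation that paired normal epitopes are only reported for SNV-derived neoepitopes.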
Ah, yes. That makes sense. However, I did not specifically set any flags for indels when prepping the HAPCUT output; I thought the default was set for SNVs. In any case, I will inspect further. Thanks.
Ah, that's an interesting point that we hadn't considered! Currently the prep mode of the software adds all unphased variants in the VCF in as their own haplotypes, but it would be nice to have a command line option to exclude indels during prep for users that only want to focus on SNVs. Something for us to add in an upcoming release, thanks for bringing this to our attention!
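Until such an option exists, one possible workaround is to drop indels from the VCF yourself before running prep. This is only a sketch under assumptions: it treats a record as an SNV when REF and every ALT allele are single bases, it is not a neoepiscope feature, and the sample records below are made up. Removing variants before phasing can also change the haplotype assembly, so check that this fits your analysis.

```python
def is_snv(vcf_line):
    """True if REF and every ALT allele in the record are single bases."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    return len(ref) == 1 and all(
        len(alt) == 1 and alt.upper() in "ACGTN" for alt in alts
    )


def keep_snvs(vcf_lines):
    """Keep header lines and SNV records; drop indels and other variants."""
    return [line for line in vcf_lines if line.startswith("#") or is_snv(line)]


# Made-up records for illustration: one deletion and one SNV
vcf_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr11\t71276863\t.\tGT\tG\t.\tPASS\t.",
    "chr1\t12345\t.\tA\tT\t.\tPASS\t.",
]
for line in keep_snvs(vcf_lines):
    print(line)
```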
Above are various trials with different haplotype assembly outputs that I've run, all with the same result of no neoepitopes. It's possible I might be missing a step, so I'll run through the steps:
Now, the assumption in this case is that I am not running the haplotype phasing manually and am instead using the sample outputs already provided. Where am I going wrong?