griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

pVACseq fails to generate predicted epitopes when using NetMHC #741

Closed: yzhan360 closed this issue 2 years ago

yzhan360 commented 2 years ago

Hi, I am testing pVACseq workflows, and first of all, thanks for your great work! I have successfully generated all_epitopes.tsv and filtered.tsv using MHCflurry. But when using NetMHC, I only got .tsv_, .tsv and .fasta files; there is no list of all predicted epitopes. I checked the log files. When using NetMHC, the workflow parsed the prediction files but did not combine the parsed prediction files. No error messages showed up. Could you give me some idea of what happened or what else I should double-check? Looking forward to your thoughts.

Thanks, Yan

susannasiebert commented 2 years ago

@yzhan360 I'm sorry you're encountering problems with getting a successful pVACseq run. Can you please post the stdout from your run?

Also, if you could provide additional information about your compute environment, that would be helpful. Do you have a standalone installation of pVACseq or are you using the Docker container? Do you run it on a compute cluster or on a local machine?

yzhan360 commented 2 years ago

Thanks for your quick response! Here is my stdout when using NetMHC: NetMHC_run2.log. Here is the stdout when using MHCflurry: MHCflurry_run2.log. And here is the compute environment: nghead_env.txt. I installed pVACseq using the Docker container (Docker version 19.03.15, build 99e3ed8).

susannasiebert commented 2 years ago

Hm, I don't see anything obvious here. Usually these sorts of problems occur because the system is running out of resources (e.g. running out of memory or hitting a max processes limit). I will need more information to debug this error:

Please note that today is my last work day before I leave on vacation until the end of the year so I won't be able to further assist with this problem until I return. Some things you might want to try in the meantime:

susannasiebert commented 2 years ago

@yzhan360 Have you had a chance to retry your pVACseq run with any of the recommendations above?

yzhan360 commented 2 years ago

Hi, Susanna! Happy New Year! Thanks for following up. I have tried running without -t and reducing the --downstream-sequence-length to 100. It works! I got the all_epitopes.tsv and filtered.tsv files! Thanks so much! My question is: what's the difference between --downstream-sequence-length 100 and 1000? Do I get fewer predictions when I use 100?

susannasiebert commented 2 years ago

Correct. With a length of 100 you basically only get the first 100 amino acids after the frameshift mutation. For most frameshift mutations that isn't a problem, as the tail is much shorter, but some frameshifted tails can get quite long depending on where the frameshifted sequence reaches a stop codon. Reducing the downstream sequence length will leave out any potential neoepitopes after the cutoff. This is pretty rare though.
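To make the effect of the cutoff concrete, here is a minimal Python sketch. It is not pVACseq's actual code: the function names, the epitope lengths 8 to 11, and the synthetic tails are all assumptions made for illustration, and it ignores the wildtype residues upstream of the frameshift that real candidate peptides would also include. It only counts how many candidate peptide windows fit inside the frameshifted tail that is kept.

```python
# Illustrative sketch only; names and defaults are hypothetical, not pVACseq internals.

def truncate_tail(tail: str, downstream_length: int) -> str:
    """Keep at most `downstream_length` residues of the frameshifted tail."""
    return tail[:downstream_length]


def count_candidates(tail: str, downstream_length: int,
                     epitope_lengths=(8, 9, 10, 11)) -> int:
    """Count sliding-window peptides of each length inside the kept tail."""
    kept = truncate_tail(tail, downstream_length)
    # A sequence of length n contains n - k + 1 windows of length k (0 if k > n).
    return sum(max(len(kept) - k + 1, 0) for k in epitope_lengths)


# Synthetic tails (made-up sequences, not real data).
short_tail = "ACDEFGHIKL" * 3    # 30 residues before the new stop codon
long_tail = "ACDEFGHIKL" * 40    # 400 residues before the new stop codon

for name, tail in (("short", short_tail), ("long", long_tail)):
    for cutoff in (100, 1000):
        print(name, cutoff, count_candidates(tail, cutoff))
```

In this sketch the short tail yields the same count for both cutoffs, because the tail ends before either limit is reached. Only the long tail loses candidates at a cutoff of 100, which matches the point that the reduction matters only for the rare long frameshift tails.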

yzhan360 commented 2 years ago

Thanks a lot for your prompt response!

susannasiebert commented 2 years ago

You're welcome. I'm resolving this issue but do feel free to reopen it or make a new one if you run into any additional trouble.