MathOnco / NeoPredPipe

Neoantigens prediction pipeline for multi- or single-region vcf files using ANNOVAR and netMHCpan.
GNU Lesser General Public License v3.0

Conversion script for netMHCpan 4.1 output #41

Closed tucano closed 7 months ago

tucano commented 9 months ago

Hello!

I am testing the pipeline with netMHCpan version 4.1 and all is working as expected.

To convert the neoantigens output to the format required by NeoRecoPo.py, it should be sufficient to use this command:

cut --complement -f 24,25 OUTPUT.neoantigens.txt > OUTPUT.neoantigens.formatted.txt

Reference: https://academic.oup.com/view-large/figure/205007542/gkaa379fig1.jpg

Can you confirm?

It would be nice to add this info to the NeoPredViz.md file.

Ciao!

Davide Rambaldi

elakatos commented 9 months ago

Hi Davide,

Indeed, you essentially just end up with 2 extra columns, so it mostly does the trick. The indices of the extra columns depend on a few things, though: whether you have multi-region or single-region data, and whether there is an expression column. So it might be good to make the column index depend on the total number of columns. Also, after you cut away the extra columns, I believe the column order will differ from netMHCpan 4.0 (Affinity comes after Rank, instead of between Score and Rank), so the columns should also be reshuffled. (I based this on running a dummy peptide with the BA predictions option in 4.0 and 4.1.) If you have a nice one- or few-liner for this task, I'll be happy to add it to the readmes! Otherwise, I plan to add support for 4.1 within the pipeline itself once my schedule clears up (by the end of the month).

Best, Eszter
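
The two steps discussed above (drop the extra columns, then move Affinity back in front of Rank) could be sketched in Python roughly as follows. This is only an illustration, not code from the pipeline: the function names `convert_fields` and `convert_file` and the parameters `drop_idx` and `rank_idx` are hypothetical, and the actual 0-based indices have to be checked against your own file, since they shift with the number of region columns and with the presence of an expression column. The sketch also assumes the file is tab-separated and that, once the extra columns are gone, Affinity sits directly after Rank.

```python
def convert_fields(fields, drop_idx, rank_idx):
    """Drop the extra 4.1 columns, then swap Rank and Affinity.

    fields   -- list of column values from one tab-separated line
    drop_idx -- 0-based indices of the extra columns (assumption: check your file)
    rank_idx -- 0-based index of Rank *after* the extra columns are dropped;
                Affinity is assumed to be the very next column
    """
    drop = set(drop_idx)
    kept = [value for i, value in enumerate(fields) if i not in drop]
    # Restore the 4.0-style order: Score, Affinity, Rank.
    kept[rank_idx], kept[rank_idx + 1] = kept[rank_idx + 1], kept[rank_idx]
    return kept


def convert_file(src, dst, drop_idx, rank_idx):
    """Apply convert_fields to every line of a tab-separated file."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fields = line.rstrip("\n").split("\t")
            fout.write("\t".join(convert_fields(fields, drop_idx, rank_idx)) + "\n")


# Toy row with labelled placeholders, not real netMHCpan output.
toy = ["peptide", "score", "extra1", "extra2", "rank", "affinity", "level"]
print(convert_fields(toy, drop_idx=(2, 3), rank_idx=2))
```

For the `cut` command above, which removes fields 24 and 25 (1-based), the equivalent would be `drop_idx=(23, 24)`; the right `rank_idx` still has to be read off the resulting header for your data.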