hammerlab / topeology

Compare neoepitope sequences with epitopes from IEDB

Restrict query by species, MHC #3

Open JPFinnigan opened 8 years ago

JPFinnigan commented 8 years ago

Would be neat if, in addition to epitope length, one had the option to restrict comparisons to a given species and also by MHC allele/allele-family within a given species.

tavinathanson commented 8 years ago

Good call. One issue with that is IEDB's inconsistent naming of organisms, though I imagine something like --species-string-contains "herpes" would still be useful.
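For illustration, a case-insensitive substring match like the one suggested above could look roughly like this. It is only a sketch: the `organism` and `peptide` keys, the `species_contains` helper, and the sample rows are made up for the example and are not topeology's or IEDB's actual schema.

```python
def species_contains(entries, needle):
    """Keep entries whose organism name contains `needle`, ignoring case.

    `entries` is assumed to be a list of dicts with an "organism" key
    (hypothetical layout); a missing or None organism never matches.
    """
    needle = needle.casefold()
    return [e for e in entries if needle in (e.get("organism") or "").casefold()]


# Made-up rows standing in for IEDB records with inconsistent organism naming.
entries = [
    {"peptide": "AAAAAAAAA", "organism": "Human herpesvirus 4"},
    {"peptide": "CCCCCCCCC", "organism": "Herpes simplex virus (type 1)"},
    {"peptide": "DDDDDDDDD", "organism": "Influenza A virus"},
    {"peptide": "EEEEEEEEE", "organism": None},
]

matches = species_contains(entries, "herpes")
```

A substring match will not catch every naming variant, but it tolerates the two different herpesvirus spellings in the sample rows.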

JPFinnigan commented 8 years ago

Well, I was actually thinking along slightly different lines. Say, for example, you have some epitope of interest, A, that you know to be 9 a.a. long and A*02:01 restricted. You think that A might be a novel antigen, but before investing in a wet-lab validation you want to use topeology to ask whether the sequence of A is similar to any entry in IEDB. You would be more inclined to perform the validation of A if topeology suggests that A is similar to a known antigen. However, you might be inclined to move on if either A has been tested and found to be non-antigenic, or A is similar to an entry in IEDB that has been found to be non-antigenic.

In that instance, what I think one is really asking is not whether A is similar to any 9-mer sequence in IEDB, but rather whether A is similar to other 9-mer (+/- 8, 10, 11, ...) A*02:01 or A*02:XX restricted entries in IEDB. Topeology doesn't inform your decision-making if it tells you that A is similar to a 9-mer H-2K(b) restricted entry in IEDB. The MHC context is important.
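To make the exact-allele vs. allele-family distinction concrete, here is a rough sketch of that kind of pre-filter. Everything in it is an assumption for illustration: the `peptide`/`allele` keys, the helper names, the placeholder sequences, and the simplified allele pattern (which only understands names shaped like `A*02:01` or `HLA-A*02:01`) are not part of topeology or IEDB.

```python
import re

# Simplified pattern for names such as "HLA-A*02:01", "A*02:01", or "A*02".
# Anything else (e.g. the mouse allele "H-2K(b)") is treated as unparseable.
ALLELE_RE = re.compile(r"^(?:HLA-)?([A-Z]+\d*)\*(\d+)(?::(\d+))?")


def parse_allele(name):
    """Return (gene, family, protein) or None; protein is None for 'A*02' / 'A*02:XX'."""
    m = ALLELE_RE.match(name.strip())
    return None if m is None else (m.group(1), m.group(2), m.group(3))


def allele_matches(entry_allele, query):
    """True if the entry's allele equals the query or falls in the queried family."""
    entry, wanted = parse_allele(entry_allele), parse_allele(query)
    if entry is None or wanted is None:
        return False
    if entry[:2] != wanted[:2]:  # gene and family must agree
        return False
    return wanted[2] is None or entry[2] == wanted[2]


def restrict(entries, query_allele, lengths):
    """Keep entries of an allowed peptide length whose allele matches the query."""
    return [
        e for e in entries
        if len(e["peptide"]) in lengths and allele_matches(e["allele"], query_allele)
    ]


# Placeholder sequences, not real epitopes.
entries = [
    {"peptide": "AAAAAAAAA", "allele": "HLA-A*02:01"},
    {"peptide": "CCCCCCCCCC", "allele": "HLA-A*02:06"},
    {"peptide": "DDDDDDDDD", "allele": "H-2K(b)"},
    {"peptide": "EEEEEEEEE", "allele": "HLA-B*07:02"},
]

exact = restrict(entries, "A*02:01", {9})               # 9-mers, A*02:01 only
family = restrict(entries, "A*02:XX", {8, 9, 10, 11})   # 8- to 11-mers, any A*02
```

The similarity comparison would then run only against the surviving entries, so a match to the H-2K(b) row never shows up for an A*02:01 query.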

Just my $0.02

tavinathanson commented 8 years ago

Gotcha. That's super helpful in thinking about the use cases for topeology. I've been mostly thinking about the multiple peptide case, but I see how it could be useful for just a single peptide vs. IEDB.

I'm currently filtering by T-cell positive entries; sounds like that should be optional/configurable.

I fully agree that the MHC context is important. One issue is that IEDB is heavily biased toward specific alleles. Perhaps the set of options could be (a) no MHC restriction, (b) MHC restriction using the IEDB allele, (c) MHC restriction by running MHC binding on the allele in question (your peptide is similar to a T cell positive peptide in IEDB, and that T cell positive peptide is a predicted binder to the relevant allele)?
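A rough sketch of how options (a) through (c) might be exposed as a single setting. The enum, the `filter_hits` function, the dict keys, and the `predicts_binding` callback are all hypothetical names for this example; the callback stands in for whatever MHC binding predictor would actually be used, and the stub below is a placeholder, not a real prediction rule.

```python
from enum import Enum


class MHCRestriction(Enum):
    NONE = "none"                          # (a) no MHC restriction
    IEDB_ALLELE = "iedb-allele"            # (b) use the allele recorded in IEDB
    PREDICTED_BINDER = "predicted-binder"  # (c) run binding prediction on the hit


def filter_hits(hits, allele, mode, predicts_binding=None):
    """Filter similar IEDB hits according to the chosen restriction mode.

    `predicts_binding(peptide, allele) -> bool` is only needed for mode (c).
    """
    if mode is MHCRestriction.NONE:
        return list(hits)
    if mode is MHCRestriction.IEDB_ALLELE:
        return [h for h in hits if h["allele"] == allele]
    if mode is MHCRestriction.PREDICTED_BINDER:
        if predicts_binding is None:
            raise ValueError("PREDICTED_BINDER mode needs a predicts_binding callback")
        return [h for h in hits if predicts_binding(h["peptide"], allele)]
    raise ValueError(f"unknown mode: {mode!r}")


def stub_predictor(peptide, allele):
    """Placeholder only: pretends that peptides ending in V bind the allele."""
    return peptide.endswith("V")


# Placeholder hits, not real epitopes.
hits = [
    {"peptide": "AAAAAAAAV", "allele": "HLA-A*02:01"},
    {"peptide": "CCCCCCCCV", "allele": "H-2K(b)"},
    {"peptide": "DDDDDDDDD", "allele": "HLA-B*07:02"},
]

unrestricted = filter_hits(hits, "HLA-A*02:01", MHCRestriction.NONE)
by_iedb = filter_hits(hits, "HLA-A*02:01", MHCRestriction.IEDB_ALLELE)
by_prediction = filter_hits(
    hits, "HLA-A*02:01", MHCRestriction.PREDICTED_BINDER, stub_predictor
)
```

Mode (c) is the one that sidesteps IEDB's allele bias: the second hit was recorded against a different allele but is still kept, because the (stub) predictor says it binds the allele in question.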

JPFinnigan commented 8 years ago

Totally agree with everything you proposed. A tool like this would be super useful.