IEDB / arborist


Fix overlapping fragments #45

Closed dmx2 closed 1 month ago

dmx2 commented 1 month ago

There are various protein tree entries that have fragments with overlapping residues. This happens because UniProt can include multiple segments of a protein that overlap, and we currently extract everything in the API call. I don't think Leidos has a way to parse this, since if an epitope is contained in the overlapping range, it will show both fragments.

image

First, we should see how frequently the fragments we provide overlap (in counts and %), then come up with some rules on how to avoid this during the Arborist build.
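A rough sketch of how the overlap count could be computed, not taken from the Arborist code. It assumes fragments are available as 1-based inclusive `(start, end)` residue ranges grouped per protein. The names `has_overlap` and `overlap_stats`, the dict layout, and the example IDs are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: a fragment is a (start, end) pair of 1-based, inclusive residue positions.
Fragment = Tuple[int, int]


def has_overlap(fragments: List[Fragment]) -> bool:
    """Return True if any two fragments share at least one residue."""
    max_end = None  # furthest end position seen so far
    for start, end in sorted(fragments):
        # Sorted by start, so a fragment overlaps an earlier one
        # exactly when it starts at or before the furthest end seen.
        if max_end is not None and start <= max_end:
            return True
        max_end = end if max_end is None else max(max_end, end)
    return False


def overlap_stats(proteins: Dict[str, List[Fragment]]) -> Tuple[int, float]:
    """Return (count, percent) of proteins with at least one overlapping fragment pair."""
    count = sum(1 for frags in proteins.values() if has_overlap(frags))
    percent = 100.0 * count / len(proteins) if proteins else 0.0
    return count, percent


# Hypothetical example data; the IDs and ranges are invented.
example = {
    "PROT_A": [(1, 50), (40, 120)],              # overlap at residues 40-50
    "PROT_B": [(1, 30), (31, 90)],               # adjacent, no shared residue
    "PROT_C": [(10, 200)],                       # single fragment
    "PROT_D": [(1, 100), (20, 30), (150, 200)],  # nested fragment
}

count, percent = overlap_stats(example)
print(f"{count} of {len(example)} proteins ({percent:.1f}%) have overlapping fragments")
```

Whether adjacent fragments (one ending at residue 30, the next starting at 31) should count as overlapping is a choice; here they do not.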

dmx2 commented 1 month ago

Implemented:

Rules