IEDB / TCRMatch

Possible to have TCRMatch ignore invalid CDR3s? #52

Closed bcorrie closed 11 months ago

bcorrie commented 12 months ago

Is there any way to have TCRMatch skip CDR3s that contain invalid amino acids? We have data that was annotated with MiXCR, which uses * to indicate a stop codon (e.g. *ACVPGQGYNEQFF).

When TCRMatch processes this, it says:

Invalid amino acid found in *ACVPGQGYNEQFF at position 1

Unfortunately, when it finds this case, it stops processing. So if I have 500K annotations in a Repertoire and any one of them has a stop codon, the whole data set can't be processed. Although I can filter out this case myself, it seems like it would be a nice feature to be able to tell TCRMatch to skip these CDR3s and continue processing.
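
For reference, the pre-filtering workaround mentioned above might look roughly like the sketch below. It is not part of TCRMatch. It assumes CDR3s arrive one per line and that only the 20 standard one-letter amino acid codes count as valid; the names `VALID_AA`, `is_valid_cdr3` and `split_cdr3s` are made up for illustration.

```python
# Hypothetical pre-filter, not TCRMatch code.
# Assumption: only the 20 standard one-letter amino acid codes are valid.
VALID_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")


def is_valid_cdr3(seq):
    """Return True if seq is non-empty and has only standard amino acids."""
    return bool(seq) and all(ch in VALID_AA for ch in seq)


def split_cdr3s(lines):
    """Split an iterable of lines into (valid, skipped) lists of CDR3s."""
    valid, skipped = [], []
    for line in lines:
        seq = line.strip()
        if not seq:
            continue  # ignore blank lines
        if is_valid_cdr3(seq):
            valid.append(seq)
        else:
            skipped.append(seq)
    return valid, skipped


# Example: the stop-codon CDR3 from this issue plus two made-up sequences.
valid, skipped = split_cdr3s(["*ACVPGQGYNEQFF", "ASSQDRDTQY", "CASSXGQAYEQYF"])
for seq in skipped:
    print("Skipping CDR3 with invalid amino acid:", seq)
```

Only the sequences in `valid` would then be written to the file passed to TCRMatch, while `skipped` can be logged so the dropped annotations stay traceable.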

raphaeltrevizani commented 11 months ago

Thank you for the input. I updated the code to skip the lines with invalid characters and print an error message.

bcorrie commented 11 months ago

Awesome, thanks. That was fast! 8-)