snijderlab / stitch

Template-based assembly of proteomics short reads for de novo antibody sequencing and repertoire profiling
MIT License

Create a set of templates based on sequence fragments #190

Open douweschulte opened 2 years ago

douweschulte commented 2 years ago

The idea is to have one template set per region: FR1, CDR1, FR2, etc.

The hope is that running it this way would result in fewer 'split' reads. Currently, if the program is run on a sequence that is more similar to one germline in one part and more similar to another germline in another part, the read population is split between the two germlines.

Germ1: ▆▆▆▆▆▅▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
Germ2: ▁▁▁▁▁▁▂▃▅▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆

Example depth of coverage of split population
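
As a rough illustration of the difference between matching against whole germlines and matching per region, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sequences are invented toy data, the region boundaries are taken as known, and `identity` (a plain count of matching positions on equal-length strings) stands in for whatever alignment scoring Stitch actually uses.

```python
# Toy data: two hypothetical germlines, already split into regions.
REGIONS = ["FR1", "CDR1", "FR2"]
GERMLINES = {
    "Germ1": {"FR1": "EVQLVESGG", "CDR1": "GFTFSSYA", "FR2": "MSWVRQAPG"},
    "Germ2": {"FR1": "QVQLQQSGA", "CDR1": "GYTFTSYW", "FR2": "MHWVKQRPG"},
}

# Hypothetical sample: FR1 looks like Germ1, the rest looks like Germ2.
SAMPLE = {"FR1": "EVQLVESGG", "CDR1": "GYTFTSYW", "FR2": "MHWVKQRPG"}


def identity(a, b):
    """Count matching positions of two equal-length, pre-aligned strings."""
    return sum(x == y for x, y in zip(a, b))


def best_whole(sample):
    """Pick the single germline with the highest score over all regions."""
    return max(
        GERMLINES,
        key=lambda g: sum(identity(GERMLINES[g][r], sample[r]) for r in REGIONS),
    )


def best_per_region(sample):
    """Pick the best-scoring germline separately for every region."""
    return {
        r: max(GERMLINES, key=lambda g: identity(GERMLINES[g][r], sample[r]))
        for r in REGIONS
    }


print("whole template:", best_whole(SAMPLE))
print("per region:    ", best_per_region(SAMPLE))
```

With this toy data the whole-template pick is Germ2, even though FR1 is an exact match to Germ1. With whole templates, reads covering FR1 would tend to pile up on Germ1 while the rest pile up on Germ2, which is the split shown above. The per-region picks keep FR1 on Germ1 and CDR1 and FR2 on Germ2.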

douweschulte commented 1 year ago

One big problem with this setup is that pieces of different germlines could be chosen. While this does not mean anything in particular for the retrieval of the sequence (a region of an Ab could have mutated to look like a different germline), it makes it harder for casual users to grasp the meaning of the picked germline in the final result.
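
One possible way to make such mixed picks visible could be sketched as below. This is only an illustration, not something Stitch does: the `picks` mapping (region name to chosen germline name) and the `summarize_picks` helper are hypothetical.

```python
from collections import Counter


def summarize_picks(picks):
    """Return (most frequently picked germline, whether picks are mixed).

    `picks` maps a region name to the germline chosen for that region
    (a hypothetical format, used here only for illustration).
    """
    counts = Counter(picks.values())
    dominant, _ = counts.most_common(1)[0]
    mixed = len(counts) > 1
    return dominant, mixed


# Hypothetical result where FR1 was matched to a different germline.
picks = {
    "FR1": "Germ1",
    "CDR1": "Germ2",
    "FR2": "Germ2",
    "CDR2": "Germ2",
    "FR3": "Germ2",
}

dominant, mixed = summarize_picks(picks)
if mixed:
    others = sorted(set(picks.values()) - {dominant})
    print(f"Mostly {dominant}, with regions matched to: {', '.join(others)}")
else:
    print(f"All regions matched to {dominant}")
```

A report along these lines would at least tell a casual user that the result is a mix of germline segments rather than a single germline.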