Closed by ssnn-airr 2 years ago
Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).
You can skip the primer identification steps. These are largely for QC and isotype annotation. You can pull the UMI out of the sequences using MaskPrimers-extract by specifying the length (--len) and start position (--start) of the UMI. E.g.:
MaskPrimers.py extract -s in.fastq --start 0 --len 15 --pf UMI -o out.fastq
Will put the first 15 bp in the field UMI. Or:
MaskPrimers.py extract -s in.fastq --start 15 --len 25 --bf UMI --pf PRIMER --barcode -o out.fastq
Will put the first 15 bp in the UMI field and the next 25 bp in the PRIMER field, for dual-barcode setups or if you want to guess what their primers are.
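To make the coordinates concrete, here is a minimal Python sketch of the slicing that the two commands describe. It is not pRESTO code: the function `extract_region`, the dictionary keys, and the example read are all hypothetical, and it ignores quality scores, header formatting, and how the remaining sequence is handled afterwards.

```python
def extract_region(seq, start, length, barcode=False):
    """Illustrative only: slice a fixed-position region out of a read.

    start/length mirror the --start/--len coordinates (0-based start).
    With barcode=True, the bases before the region are returned as well,
    mimicking the idea of keeping the preceding sequence as a barcode.
    """
    if start < 0 or length <= 0 or start + length > len(seq):
        raise ValueError("region falls outside the read")
    annotations = {"region": seq[start:start + length]}
    if barcode:
        annotations["preceding"] = seq[:start]
    return annotations


# Hypothetical read: 15 bp UMI + 25 bp primer-like region + remainder.
read = "ACGTACGTACGTACG" + "TTGGCCAATTGGCCAATTGGCCAAT" + "GGGGCCCC"

# First command: region at 0..15 becomes the UMI annotation.
first = extract_region(read, start=0, length=15)

# Second command: region at 15..40 is the PRIMER annotation,
# and the 15 bp before it are kept as the UMI annotation.
second = extract_region(read, start=15, length=25, barcode=True)
```

In the second case the UMI is the `preceding` value and the guessed primer is the `region` value, which is the pairing the two field options in the command are meant to express.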
For annotating the C-region without the official primers, you can just align against the C-region references:
https://presto.readthedocs.io/en/stable/examples/primers.html
More C-region substrings, based on the TakaraBio/ClonTech protocol, are compiled here:
https://bitbucket.org/kleinstein/immcantation/src/master/protocols/Universal/
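As a rough illustration of what annotating the C-region against reference substrings means, the sketch below picks the reference with the fewest mismatches at a fixed offset. This is an assumption-laden stand-in, not pRESTO's method: a simple mismatch count replaces a real alignment, the names `hamming` and `annotate_c_region` and the `max_error` threshold are hypothetical, and the reference sequences are made-up placeholders rather than real constant-region sequences.

```python
def hamming(a, b):
    """Count mismatches between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def annotate_c_region(seq, references, start=0, max_error=0.2):
    """Return (name, error_rate) of the best reference at a fixed offset.

    Returns None when no reference matches within max_error.
    """
    best = None
    for name, ref in references.items():
        window = seq[start:start + len(ref)]
        if len(window) < len(ref):
            continue  # read is too short to compare against this reference
        error = hamming(window, ref) / len(ref)
        if best is None or error < best[1]:
            best = (name, error)
    if best is not None and best[1] <= max_error:
        return best
    return None


# Made-up placeholder sequences, NOT real constant-region references.
refs = {"ISOTYPE_A": "AAAACCCCGG", "ISOTYPE_B": "TTTTGGGGCC"}

hit = annotate_c_region("TTTTGGGACCAGTC", refs)   # one mismatch to ISOTYPE_B
miss = annotate_c_region("GGGGGGGGGGGG", refs)    # nothing within max_error
```

In practice the references would come from the files linked above, and the tool's own alignment would tolerate offsets and indels that this fixed-position comparison does not.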
Original report by Anton Kulaga (Bitbucket: [Anton Kulaga](https://bitbucket.org/Anton Kulaga)).
We are using library preparation kits from https://irepertoire.com/ . Unfortunately, they declined to give us their primer sequences, so we only know the UMI. The problem is that pRESTO assumes the primers are known and makes the primer parameters mandatory. How can we work around this and mask the UMI without masking the primers (which we simply do not know)?