Closed HegedusB closed 5 years ago
Simply replace the sequence in the FASTA entry "cDNA|1" with the TeloPrime cap adapter sequence.
Thanks for the advice. The program runs now, but it does not do what I expected: it filters out the seemingly correct forward reads. Did I make a mistake?
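In case it helps, here is a minimal sketch of that replacement done programmatically. It is not part of the tool: `replace_fasta_entry` is a hypothetical helper name, and the cap adapter sequence is the one quoted in this thread from the kit manual, so please check it against your own protocol.

```python
def replace_fasta_entry(fasta_text, entry_id, new_seq):
    """Return fasta_text with the sequence of entry_id replaced by new_seq.

    Hypothetical helper for illustration; works on plain FASTA text.
    """
    out = []
    skipping = False  # True while we are inside the entry being replaced
    found = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # The entry name is the first whitespace-delimited token after '>'.
            name = line[1:].split()[0] if line[1:].strip() else ""
            skipping = name == entry_id
            out.append(line)
            if skipping:
                out.append(new_seq)
                found = True
        elif not skipping:
            out.append(line)  # keep sequence lines of all other entries
    if not found:
        raise KeyError(f"entry {entry_id!r} not found")
    return "\n".join(out) + "\n"


# Cap adapter as quoted in this thread (assumed to match your kit version).
TELOPRIME_CAP_ADAPTER = "TGGATTGATATGTAATACGACTCACTATAG"
```

To use it, read the contents of cdna_barcodes.fas into a string, call `replace_fasta_entry(text, "cDNA|1", TELOPRIME_CAP_ADAPTER)`, and write the result back to a copy of the file. Editing the file by hand works just as well.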
I do not know much about TeloPrime and your particular library construction protocol, but based on the information in the kit manual the primers should be:
>cDNA|1
TGGATTGATATGTAATACGACTCACTATAG
>cDNA|2
AAAAAAAAAAAAAAAAAACGCCTGAGA
A possible issue with this is that the RT primer has only 9 non-homopolymeric bases. If you added or ligated some extra sequence unique to this primer, you could add it to the entry. And of course there can be false negatives, so some legitimate forward reads (as determined by alignment, for example) might be discarded. They should not be in the majority, though.
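To make the point about the RT primer concrete, here is a small sketch that counts the bases lying outside long homopolymer runs. The function name and the `min_run` threshold of 5 are my own assumptions for illustration, not anything the classifier uses.

```python
from itertools import groupby

# Primer sequences as listed above.
CAP_ADAPTER = "TGGATTGATATGTAATACGACTCACTATAG"
RT_PRIMER = "AAAAAAAAAAAAAAAAAACGCCTGAGA"


def non_homopolymer_length(seq, min_run=5):
    """Count bases that are not part of a homopolymer run of >= min_run.

    min_run=5 is an arbitrary illustrative threshold.
    """
    total = 0
    for _base, group in groupby(seq.upper()):
        run = len(list(group))
        if run < min_run:
            total += run  # short runs count as informative sequence
    return total


print(non_homopolymer_length(RT_PRIMER))
print(non_homopolymer_length(CAP_ADAPTER))
```

The poly(A) stretch contributes nothing to the count for the RT primer, while the cap adapter has no long runs and so contributes its full length. That asymmetry is why the RT primer end is the weaker anchor for classification.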
Let me know how it goes! Botond
Thanks a lot! It seems to be working now. I made a big mistake with the cDNA|2 primer: I left it unchanged. Can you give me some advice on the -g argument of the "cdna_classifier.py" program? I saw that many transcripts were not recognized as full length because there were some mismatches in the adapter sequence. Should I optimize this argument or just leave it unchanged?
First, you can enable the "heuristic mode" using -x. You could also try lowering the stringency using -l. Beware that decreasing -l will increase the number of false positives.
Is it possible to add additional adapters to the adapter list (cdna_barcodes.fas)? More precisely, I would like to use the Lexogen TeloPrime kit cap adapter to identify the 5’ end of the sequence. I tried adding it to the list, but the program crashed with the new adapter list. Grateful for any help!