Closed jon-xu closed 2 years ago
Hi Jon,
Thanks for using tailfindr.
I assume that you are using tailfindr to find polyA/polyT lengths in cDNA. Can I ask which protocol/kit you are using to generate this cDNA?
Best, Adnan
Hi Adnan,
You are right, I used the wrong configuration file for basecalling.
Will try again and let you know.
Cheers, Jon
Hi Adnan,
After applying the correct configuration file, tailfindr works fine.
Thanks! Jon
Great! Thanks for the update.
Hi Adnan,
Sorry, I was looking at the wrong result.
Even after using the correct configuration file for basecalling, there still seems to be a problem in the result: https://cloudstor.aarnet.edu.au/plus/s/u45OicPnx2ro2Og
And here is the sample fast5: https://cloudstor.aarnet.edu.au/plus/s/0oXhg7vO1tQnDRE
Thanks! Jon
Some reads have type "polyA" and some are still invalid. And for the polyA ones, tail_is_valid is FALSE... We used the SQK-PCB109 kit for the cDNA.
Hi Jon,
SQK-PCB109 is not suitable for doing polyA/polyT profiling. This is because the polyT primer can anneal anywhere in the polyA stretch of the RNA (see this figure) and therefore the estimated polyA/polyT would mostly be an underestimate of the true polyA tail length.
If you want correct estimates of polyA/polyT tails, then you have to use the SQK-PCS111 kit. This kit uses a special primer with an overhang, which ensures that the full polyA tail is amplified during creation of the cDNA. tailfindr only works with this kit because there is no other Nanopore kit that can successfully amplify the cDNA with full-length polyA/T tails.
Best, Adnan
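The underestimate described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of the kit's chemistry: it assumes the primer anneals at every possible offset within the polyA stretch with equal probability, and the tail length (100 nt) and primer length (20 nt) are arbitrary illustrative numbers.

```python
def captured_tail_lengths(true_tail_len, primer_len):
    """List the tail lengths the cDNA could carry, one per annealing offset.

    Toy assumption: the primer can sit at any offset inside the polyA
    stretch. Only the part of the tail between the transcript body and the
    far end of the primer is copied; everything 3' of the primer is lost.
    """
    if primer_len > true_tail_len:
        raise ValueError("primer is longer than the tail")
    # offset = number of A's between the transcript body and the primer site
    return [offset + primer_len
            for offset in range(true_tail_len - primer_len + 1)]


# Illustrative numbers only (not taken from any kit documentation).
lengths = captured_tail_lengths(true_tail_len=100, primer_len=20)
mean_len = sum(lengths) / len(lengths)
print(f"captured tail: min={min(lengths)}, max={max(lengths)}, mean={mean_len:.1f}")
# prints: captured tail: min=20, max=100, mean=60.0
```

In this toy model the captured tail never exceeds the true length and is usually shorter, which is the sense in which the estimate is "mostly an underestimate".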
Understood! Thanks Adnan!
But will tailfindr still include the length estimate if it is SQK-PCB109, even though it might not be accurate?
Cheers, Jon
Yes, it should work provided you specify the correct front and end primer sequences when calling tailfindr.
Please see section 5 of the tailfindr readme (5. Specifying custom cDNA primers). Just change the front primer (FP) and end primer (EP) sequences to whatever SQK-PCB109 uses and it will run, although the predictions won't be accurate.
Hi Adnan,
For the PCR-cDNA Barcoding Kit (SQK-PCB109), the top and bottom strands of this primer carry different flanking sequences:
5' - ATCGCCTACCGTGAC - barcode - ACTTGCCTGTCGCTCTATCTTC - 3'
5' - ATCGCCTACCGTGAC - barcode - TTTCTGTTGGTGCTGATATTGC - 3'
Which one is the FP and which one is the EP? I have tried using "ATCGCCTACCGTGAC" as FP and "ACTTGCCTGTCGCTCTATCTTC" as EP, but the result is the same as when I do not specify them.
Thanks! Jon
Hi Jon,
I am not very familiar with this kit, but please refer to the diagram below for the positions of the FP and EP:
So FP is the sequence that is located immediately to the right of the 5'-end of the mRNA-oriented strand. The EP is the sequence located immediately to the left of reverse-complement cDNA. FP and EP are just names; the important thing is the sequence information of these two entities for your experiment/kit.
I have highlighted the FP and EP positions in green in the SQK-PCB109 protocol below.
Based on the information I have provided above, please find the correct sequences for FP and EP and then use those.
Adnan
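When matching candidate sequences against a kit's primer diagram, it can help to have both orientations of each flanking sequence at hand, since a primer may appear as its reverse complement depending on which strand is drawn. The following is a minimal sketch that prints both orientations of the flanking sequences quoted earlier in this thread. It does not decide which one is the FP or EP; the labels in the dictionary are hypothetical names chosen for this example.

```python
# Translation table for DNA bases; upper and lower case are both handled.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# Flanking sequences as quoted in this thread for SQK-PCB109.
# The dictionary keys are made-up labels, not official primer names.
candidates = {
    "shared 5' flank": "ATCGCCTACCGTGAC",
    "3' flank, first sequence": "ACTTGCCTGTCGCTCTATCTTC",
    "3' flank, second sequence": "TTTCTGTTGGTGCTGATATTGC",
}

for label, seq in candidates.items():
    print(f"{label}:")
    print(f"  as written:         {seq}")
    print(f"  reverse complement: {reverse_complement(seq)}")
```

Each printed sequence can then be compared against the positions highlighted in the protocol to pick the pair that matches the FP and EP definitions above.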
Thanks Adnan! After using the correct FP/EP, we got some results. But about 20% of the reads were marked as FALSE for "tail_is_valid". In your experience, is that too many, or is it normal?
Yes, such a high number of invalid tails is normal, for two reasons:
Thank you very much!!
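The share of reads flagged FALSE for tail_is_valid, as discussed above, can be checked directly from the result CSV. The sketch below uses only the Python standard library. It assumes the CSV has columns named `read_type` and `tail_is_valid` and that logical values are written as TRUE/FALSE (or NA); these names and encodings are assumptions and should be checked against the actual output file. The demo rows are invented purely for illustration.

```python
import csv
import io
from collections import Counter


def summarise_tails(csv_file):
    """Count read types and tail validity flags in a tailfindr-style CSV.

    Assumes columns 'read_type' and 'tail_is_valid' exist (an assumption).
    Returns (read_type counts, validity counts, fraction of all reads
    whose tail_is_valid is FALSE).
    """
    read_types = Counter()
    validity = Counter()
    for row in csv.DictReader(csv_file):
        read_types[row["read_type"]] += 1
        validity[row["tail_is_valid"].strip().upper()] += 1
    total = sum(validity.values())
    invalid_fraction = validity["FALSE"] / total if total else 0.0
    return read_types, validity, invalid_fraction


# Invented demo data; real files will have more columns and rows.
demo = io.StringIO(
    "read_id,read_type,tail_is_valid,tail_length\n"
    "r1,polyA,TRUE,85.2\n"
    "r2,polyT,TRUE,61.0\n"
    "r3,polyA,FALSE,40.7\n"
    "r4,invalid,NA,NA\n"
    "r5,polyT,TRUE,102.3\n"
)
read_types, validity, invalid_fraction = summarise_tails(demo)
print(dict(read_types))
print(dict(validity))
print(f"fraction with tail_is_valid FALSE: {invalid_fraction:.2f}")
```

For a real result file, the same function can be called on a file opened with `open(path, newline="")`, where `path` is the location of the CSV written by tailfindr.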
Hi Adnaniazi,
Could you please help me check why I am getting an invalid read type and FALSE for tail_is_valid, and therefore no tail length estimate? It's a cDNA sample.
The result file: https://cloudstor.aarnet.edu.au/plus/s/ml02t0AAm3MdSWO
One of the fast5 files: https://cloudstor.aarnet.edu.au/plus/s/wgsMP0NQs4pfTix
Many thanks, Jon