shellywanamaker closed this issue 4 years ago
That link doesn't have FastA files, just her outputs from Primer3. I'll work on tracking down the original FastAs, but if you know where they are, please feel free to drop a link in here to save me some time. Thanks! I'll report back if/when I track them down.
Think I found what I needed (sequence names and link to FastA) here: https://github.com/RobertsLab/resources/issues/822#issuecomment-572313717
Great! Were they somewhere here: /home/sam/data/geoduck/transcriptomes/transdecoder_fasta_splits/ ?
No. I linked to the comment with their locations. Looks like she (and Steven) used a genes FastA file hosted in the OSF repo.
Gotcha. Thanks for tracking that down.
Alrighty, re-ran the pipeline. Here's a summary table of primer set matches to any sequences in the genes FastA file.
More specifics on how this was run are in the Jupyter Notebook and my Notebook (linked at bottom of post).
Note: The number of matches should be divided by two. Reason is related to how I counted (using `grep`).
SeqID | PrimerName | Matches |
---|---|---|
PGEN_.00g025890-vv0.74.a | TIF3s12 | 2 |
PGEN_.00g070040-vv0.74.a | APLP | 2 |
PGEN_.00g188130-vv0.74.a | FEN1 | 2 |
PGEN_.00g194630-vv0.74.a | ECHD3 | 2 |
PGEN_.00g338640-vv0.74.a | NSF | 2 |
PGEN_.00g288180-vv0.74.a | TIF3s4a | 4 |
PGEN_.00g245080-vv0.74.a | TIF3s10 | 8 |
PGEN_.00g132030-vv0.74.a | TIF3s8-1 | 10 |
PGEN_.00g079690-vv0.74.a | TIF3s7 | 14 |
PGEN_.00g088260-vv0.74.a | NFIP1 | 36 |
PGEN_.00g224740-vv0.74.a | GLYG | 46 |
PGEN_.00g280110-vv0.74.a | SPTN1 | 496 |
PGEN_.00g082590-vv0.74.a | TIF3s5 | 742 |
PGEN_.00g287540-vv0.74.a | RPL5 | 2570 |
PGEN_.00g132040-vv0.74.a | TIF3s8-2 | 7800 |
PGEN_.00g114060-vv0.74.a | GSK3B | 8596 |
PGEN_.00g000750-vv0.74.a | TIF3s6b | 15512 |
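For anyone reading the table directly, here is a minimal sketch of the divide-by-two correction from the note above, applied to three rows copied from the table. The helper name `unique_matches` is hypothetical, and the check for odd counts is an assumption (a doubled count should always be even), not something taken from the notebook.

```python
def unique_matches(grep_count: int) -> int:
    """Convert a raw grep-based count into the number of distinct matches.

    Assumes every real match was counted exactly twice, as stated in the note.
    """
    if grep_count % 2:
        # Assumption: an odd raw count means the doubling did not hold for this row.
        raise ValueError(f"expected an even count, got {grep_count}")
    return grep_count // 2


# Raw "Matches" values copied from the table above.
raw_counts = {"APLP": 2, "TIF3s8-1": 10, "NFIP1": 36}

corrected = {primer: unique_matches(count) for primer, count in raw_counts.items()}
for primer, count in corrected.items():
    print(f"{primer}: {count}")
```

With this correction, a raw count of 2 corresponds to a single match in the genes FastA.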
Jupyter Notebook:
Notebook:
@kubu4 awesome! Based on these results for the reproductive development primers, do you suggest going for just APLP since NFIP1 seems to have multiple targets? Do we know what these targets are?
> do you suggest going for just APLP since NFIP1 seems to have multiple targets?
Yep.
> Do we know what these targets are?
Technically, yes. Is that info readily available? Sort of. It would just require some leg work:
1. Look at the EMBOSS `primersearch` output file to identify realistic, potential qPCR amplicons (e.g. < 300bp).
2. Use the sequence IDs from the potential targets to search Panopea-generosa-genes-annotations.tab
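The two steps above could be sketched roughly as follows. This is not the code from the notebook. It assumes the `primersearch` output uses its usual text layout (`Primer name ...`, `Sequence: ...`, `Amplimer length: N bp`) and that the first column of the annotations table holds the sequence ID, which is a guess about that file. The function names are hypothetical.

```python
import re
from collections import defaultdict


def parse_amplimers(text):
    """Yield (primer_name, sequence_id, amplimer_length_bp) for each amplimer."""
    primer = seq_id = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Primer name"):
            primer = line.split(None, 2)[2]
        elif line.startswith("Sequence:"):
            seq_id = line.split(":", 1)[1].strip()
        elif line.startswith("Amplimer length:"):
            length = int(re.search(r"(\d+)\s*bp", line).group(1))
            yield primer, seq_id, length


def short_amplimers(text, max_bp=300):
    """Group amplimers shorter than max_bp by primer name (step 1)."""
    hits = defaultdict(list)
    for primer, seq_id, length in parse_amplimers(text):
        if length < max_bp:
            hits[primer].append((seq_id, length))
    return dict(hits)


def lookup_annotations(tab_text, seq_ids):
    """Return annotation rows whose first tab-separated column is in seq_ids (step 2).

    Assumption: the sequence ID is the first column of the annotations table.
    """
    wanted = set(seq_ids)
    rows = (row.split("\t") for row in tab_text.splitlines() if row)
    return [cols for cols in rows if cols[0] in wanted]
```

To use it, read the `primersearch` output file into a string, pass it to `short_amplimers`, then feed the resulting sequence IDs and the contents of the annotations file to `lookup_annotations`.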
Another thing to keep in mind is that I allowed up to 20% mismatch when checking the primers' specificity in silico. So, there's probably some wiggle room to tweak qPCR stuff (e.g. increase annealing temp, decrease [Mg2+] ) that would help increase primer annealing specificity in vitro. Is it worth the effort(s)? Probably not; unless you're really interested in that particular target.
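To put the 20% figure in concrete terms, here is a small sketch of how many mismatched bases that allowance permits for a few primer lengths. The lengths are illustrative, not the actual primers, and rounding down is an assumption about how the allowance is applied.

```python
def max_mismatches(primer_length: int, mismatch_percent: int = 20) -> int:
    """Number of mismatched bases permitted, rounding down (assumed)."""
    return primer_length * mismatch_percent // 100


# Illustrative primer lengths only; not taken from the primer sets in this issue.
for length in (18, 20, 25):
    print(f"{length}-mer: up to {max_mismatches(length)} mismatches")
```

So a 20-mer could still register as a match with up to 4 mismatched bases, which is why some of the in silico hits may not amplify under stricter qPCR conditions.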
Gotcha. I think we can just move ahead with APLP, and if we need to go back to the drawing board for some reason, we could revisit this for NFIP1.
For the expression control, would you say TIF3s8-1 is still the best candidate because the 4 other potential targets are > 6KB and won't amplify?
Yes, it's still the best candidate because the qPCR works well:
https://github.com/RobertsLab/resources/issues/970#issuecomment-665796790
@kubu4 can you please re-run the primer design pipeline (this time including EMBOSS) so that we have reproducible documentation of the software and settings used, and so we can check the specificity of the primers?
Fasta files are here: http://owl.fish.washington.edu/kaitlyn/202001-geoduck_reproductive_dev_primers/