mossmatters / HybPiper

Recovering genes from targeted sequence capture data
GNU General Public License v3.0

Can I assemble chloroplast genes with HybPiper? #145

Open Wyclif3 opened 4 months ago

Wyclif3 commented 4 months ago

I would like to compare plastome and nuclear trees built from the same data set. However, assembling the gene loci for both has been challenging, especially for the chloroplast loci. I created a target file containing the chloroplast genes and supplied it as input together with the read files, but it seems HybPiper has no function for these. Any help will be highly appreciated.

mossmatters commented 4 months ago

Yes! HybPiper might throw some warnings depending on the target file (if you're trying to capture plastid intron regions that are not in frame). However, these can be ignored, and you will still get sequences corresponding to each plastid gene. There can be some reduced efficiency in capturing off-target plastid loci, especially when using the more recent kits from Arbor Biosciences.

Here's a soon-to-be-published example from some colleagues: https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2024.1340056/abstract

For this paper, I compiled a set of plastid amino acid targets from angiosperm OneKP data, which can be downloaded from this repository: https://github.com/mossmatters/plastidTargets
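As a rough illustration of running the same reads against both target sets, here is a minimal Python sketch that only builds and prints the two `hybpiper assemble` command lines. The read file names and prefixes are hypothetical placeholders, `plastid_targets.faa` is assumed to be the file from the repository above, and the flags (`-t_dna`, `-t_aa`, `-r`, `--prefix`) are given as I understand them from HybPiper 2; please confirm them with `hybpiper assemble --help` for your installed version.

```python
import shlex


def build_assemble_cmd(target_file, reads, prefix, protein_targets):
    """Return the argument list for one `hybpiper assemble` run.

    Assumption: -t_aa is used for amino acid targets and -t_dna for
    nucleotide targets, as in HybPiper 2.
    """
    target_flag = "-t_aa" if protein_targets else "-t_dna"
    return ["hybpiper", "assemble", target_flag, target_file,
            "-r", *reads, "--prefix", prefix]


# Hypothetical paired-end read files for a single sample.
reads = ["sample1_R1.fastq.gz", "sample1_R2.fastq.gz"]

# One run per target set, with separate prefixes so that the nuclear
# and plastid outputs land in different directories.
runs = [
    build_assemble_cmd("mega353.fasta", reads, "sample1_nuclear",
                       protein_targets=False),
    build_assemble_cmd("plastid_targets.faa", reads, "sample1_plastid",
                       protein_targets=True),
]

for cmd in runs:
    # Print the command for inspection rather than executing it.
    print(shlex.join(cmd))
```

Keeping the nuclear and plastid runs under separate prefixes is just one way to organise the outputs; it keeps the two sets of recovered sequences apart when building the two trees.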

I hope that helps with your issues. Let me know if you have other questions!

Wyclif3 commented 4 months ago

Thank you. I downloaded the file. Do I need to modify [plastid_targets.faa] for the best possible locus recovery? For the nuclear data, the mega353.fasta target file gave very low gene recovery for all the samples. To tailor my target file, I tried to download gene sequences for Asclepias syriaca from http://sftp.kew.org/pub/paftol/current_release/fasta/by_recovery/oneKP.YADI.Asclepias_syriaca.a353.fasta but the file did not download. I just want to use HybPiper to get plastome and nuclear sequences to assess the discordance between the two trees.
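Before modifying a target file, it can help to see what it already contains. The sketch below is a small, hypothetical helper (not part of HybPiper) that counts how many reference sequences each gene has and lists headers that do not follow the `>Source-geneName` pattern. It assumes the gene name is the text after the last hyphen in the header; the taxon and gene names in the example are made up for illustration.

```python
from collections import Counter


def summarize_targets(fasta_text):
    """Count target sequences per gene and collect malformed headers.

    Assumption: headers look like '>Source-geneName', and the gene name
    is whatever follows the last hyphen.
    """
    per_gene = Counter()
    malformed = []
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # skip sequence lines
        fields = line[1:].split()
        name = fields[0] if fields else ""
        source, sep, gene = name.rpartition("-")
        if sep and source and gene:
            per_gene[gene] += 1
        else:
            malformed.append(line)
    return per_gene, malformed


# Made-up example records; in practice, read the real file instead, e.g.
# open("plastid_targets.faa").read()
example = ">Amborella-atpB\nMKL\n>Oryza-atpB\nMKI\n>rbcL\nMSP\n"
per_gene, malformed = summarize_targets(example)
print(dict(per_gene))
print(malformed)
```

Genes represented by only a few distantly related sources are natural candidates for adding closer relatives, which is the kind of tailoring described for the nuclear target file.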