oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

Concatenating CDS files for panEDTA #431

Open Isoris opened 4 months ago

Isoris commented 4 months ago

Hello,

Is it possible to take multiple CDS files from GenBank and combine them into a single file for panEDTA?

For instance: cat cds_species1.fa cds_species2.fa cds_species3.fa > cds.fa

When running panEDTA with multiple species, this should be fine, right? As far as I understand, the CDS file is basically just used in a BLAST search to whitelist coding sequences.

Thank you for your answer.

oushujun commented 4 months ago

Yes, this is possible, but keep in mind that aggregating files aggregates both the good and the bad. If your CDS files contain TE contamination, it will add up across files and reduce the sensitivity of your library.

Shujun
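
For the concatenation step discussed in this thread, here is a minimal sketch, assuming a POSIX shell with awk, sort, and uniq available. The function name merge_cds is hypothetical and is not part of EDTA or panEDTA. It differs from a plain cat in two small ways: it guarantees that every line ends with a newline, so a file whose last line is unterminated cannot fuse with the header of the next file, and it warns when the same sequence ID appears more than once in the merged file. It does not screen for TE contamination, which still has to be checked separately.

```shell
#!/bin/sh
# Hypothetical helper (not part of EDTA/panEDTA): merge CDS FASTA files.
# Usage: merge_cds OUT.fa IN1.fa IN2.fa ...
merge_cds() {
    out=$1
    shift
    # awk re-prints every line with a trailing newline, so a file whose last
    # line is unterminated cannot run into the next file's header.
    awk '{ print }' "$@" > "$out" || return 1
    # Collect sequence IDs (first word of each header) that occur more than once.
    dups=$(awk '/^>/ { print $1 }' "$out" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "warning: duplicate sequence IDs in $out:" >&2
        echo "$dups" >&2
    fi
    return 0
}
```

With the file names from the question, this could be called as: merge_cds cds.fa cds_species1.fa cds_species2.fa cds_species3.fa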