davidemms / OrthoFinder

Phylogenetic orthology inference for comparative genomics
https://davidemms.github.io/
GNU General Public License v3.0
683 stars 186 forks

How to extract the single-copy core gene from each genome according to the orthofinder output. #296

Closed. Yliub closed this issue 4 years ago

Yliub commented 4 years ago

Hi, I am doing a homology analysis and would like to extract the single-copy core genes from each genome using the OrthoFinder output. How can I do this, either with OrthoFinder or with some other tool? Thanks very much.

davidemms commented 4 years ago

Hi

I think when you say 'single-copy core genes' you mean genes that are present in every (?) species and are single copy in all species, is that right? In that case, the orthogroups containing these sequences are listed in the file Orthogroups/Orthogroups_SingleCopyOrthologues.txt, the sequences are in the directory Single_Copy_Orthologue_Sequences, and the simple list of genes in each orthogroup is in Orthogroups/Orthogroups.tsv. See https://github.com/davidemms/OrthoFinder#what-orthofinder-provides for any further details.
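As a rough illustration of how those files fit together, here is a minimal Python sketch that looks up, for each single-copy orthogroup, the one gene ID belonging to each species. It assumes Orthogroups.tsv is tab-separated with the orthogroup ID in the first column and one column per species, and that Orthogroups_SingleCopyOrthologues.txt holds one orthogroup ID per line. The function names and the `results_dir` argument are made up for this example and are not part of OrthoFinder.

```python
import csv
import os


def read_single_copy_ids(lines):
    """Return the set of orthogroup IDs, assuming one ID per line."""
    return {line.strip() for line in lines if line.strip()}


def single_copy_genes(orthogroups_tsv_lines, single_copy_ids):
    """Map each single-copy orthogroup to {species: gene_id}.

    Assumes a tab-separated table whose header row is
    'Orthogroup<TAB>species1<TAB>species2...'.
    """
    reader = csv.reader(orthogroups_tsv_lines, delimiter="\t")
    header = next(reader)
    species = header[1:]  # the first column holds the orthogroup ID
    table = {}
    for row in reader:
        if not row or row[0] not in single_copy_ids:
            continue
        # Pad rows that are shorter than the header (trailing empty cells).
        cells = row[1:] + [""] * (len(species) - (len(row) - 1))
        table[row[0]] = {sp: cell.strip() for sp, cell in zip(species, cells)}
    return table


def load_single_copy_genes(results_dir):
    """Hypothetical helper: read both files from an OrthoFinder results directory."""
    og_dir = os.path.join(results_dir, "Orthogroups")
    with open(os.path.join(og_dir, "Orthogroups_SingleCopyOrthologues.txt")) as fh:
        ids = read_single_copy_ids(fh)
    with open(os.path.join(og_dir, "Orthogroups.tsv"), newline="") as fh:
        return single_copy_genes(fh, ids)
```

Calling `load_single_copy_genes("path/to/results")` (the path is a placeholder) would give a dictionary keyed by orthogroup ID, from which the gene for any species can be read directly.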

Depending on how closely related your species are there may be very few sequences that meet these criteria. If this is the case, you can make up your own set of criteria and analyse the file Orthogroups/Orthogroups.GeneCount.tsv to discover which orthogroups meet these new criteria.
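One possible way to apply relaxed criteria is sketched below in Python. It assumes Orthogroups.GeneCount.tsv is tab-separated with the orthogroup ID in the first column, one count column per species, and possibly a final "Total" column, which is ignored if present. The criteria used here (at most one copy per species, present in at least a chosen fraction of species) and the names `select_orthogroups`, `min_fraction` and `max_copies` are only examples and should be adapted to the dataset.

```python
import csv


def select_orthogroups(gene_count_lines, min_fraction=0.9, max_copies=1):
    """Return IDs of orthogroups that meet example relaxed criteria.

    An orthogroup is kept when no species has more than `max_copies` genes
    and at least `min_fraction` of the species have one or more genes.
    """
    reader = csv.reader(gene_count_lines, delimiter="\t")
    header = next(reader)
    # Species columns only: skip the ID column and a "Total" column if present.
    cols = [i for i, name in enumerate(header) if i > 0 and name != "Total"]
    selected = []
    for row in reader:
        if not row:
            continue
        counts = [int(row[i]) for i in cols]
        present = sum(1 for c in counts if c > 0)
        if max(counts) <= max_copies and present >= min_fraction * len(counts):
            selected.append(row[0])
    return selected
```

With the defaults, an orthogroup that is single copy in 59 of 65 species and absent from the other 6 would be kept (59/65 is about 0.91), whereas one with two copies in any species would be dropped.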

All the best, David

Yliub commented 4 years ago

Dear David Emms, I think you may have misunderstood what I meant, so let me briefly introduce my project. I am conducting a comparative genomic analysis of 65 bacterial strains, and I would like to infer the evolutionary relationships among them from a concatenation of 100 single-copy core gene sequences. For this, the 100 single-copy core gene sequences of each strain need to be extracted and concatenated. I know that OrthoFinder robustly identifies which orthogroups are present in each strain. However, I have little experience with programming languages such as Python and Perl, so this has confused me for a while. Could you give me some suggestions about this?

With Best Regards

Bang Liu
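The thread does not include a script for this step, so the following is only a sketch of one way the per-strain concatenation could be done in Python; it is not a method given by the OrthoFinder author. It assumes each single-copy orthogroup has already been aligned separately (for example with a multiple sequence aligner), because unaligned sequences cannot be meaningfully concatenated for tree building. It also assumes a `gene_to_species` dictionary is available, which could be built from Orthogroups/Orthogroups.tsv. All function and variable names are hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA lines into {sequence_id: sequence}.

    The ID is the first whitespace-delimited token after '>'.
    """
    records, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def concatenate(alignments, gene_to_species, species):
    """Concatenate per-orthogroup alignments into one sequence per species.

    alignments:      {orthogroup_id: {gene_id: aligned_sequence}}
    gene_to_species: {gene_id: species_name}
    species:         list of species names to include
    A species with no gene in an orthogroup is filled with gaps ('-').
    """
    parts = {sp: [] for sp in species}
    for og in sorted(alignments):  # fixed order so every species matches up
        seqs = alignments[og]
        if not seqs:
            continue
        length = len(next(iter(seqs.values())))
        if any(len(s) != length for s in seqs.values()):
            raise ValueError(f"{og}: sequences are not aligned to equal length")
        by_species = {gene_to_species[g]: s for g, s in seqs.items()}
        for sp in species:
            parts[sp].append(by_species.get(sp, "-" * length))
    return {sp: "".join(chunks) for sp, chunks in parts.items()}
```

The result can be written out as a single FASTA file with one record per strain and passed to a tree-building program of your choice.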

------------------ Original message ------------------ From: "David Emms" notifications@github.com; Sent: Monday, October 28, 2019, 5:02 PM; To: "davidemms/OrthoFinder" OrthoFinder@noreply.github.com; Cc: "liub" 247052481@qq.com; "Author" author@noreply.github.com; Subject: Re: [davidemms/OrthoFinder] How to extract the single-copy core gene from each genome according to the orthofinder output. (#296)

Closed #296.