DessimozLab / read2tree

a tool for inferring species tree from sequencing reads
MIT License

marker genes for single species mode #5

Closed by GruA 1 year ago

GruA commented 2 years ago

Hello,

Thank you for creating such a nice tool!

Could you provide more details about single species mode, particularly how to generate the list of marker genes for it?

Also, what do you think about applying read2tree to single-cell data (whole genome) from one individual? Might it still be suitable in your opinion?

Best, Gru

sinamajidian commented 2 years ago

Dear Gru,

Thank you for your interest in read2tree.

Read2tree is designed to infer species trees from DNA/RNA sequencing reads. I guess you have read our README, where "Single species mode" is discussed. Here, by "single" we mean that we have sequencing data from one unknown species and want to place it in the tree of life. The output of read2tree will then be:

          /-A
     /---|
    |     \-unknown_species
----|
    |     /-C
     \---|
          \-B

However, in "Multiple species mode", the goal is to place a few unknown species in the tree at the same time:

          /-A
     /---|
    |     \-unknown_species1
----|
    |     /-unknown_species2
     \---|
          \-B
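For readers who want to work with these two example topologies programmatically, here is a minimal sketch that transcribes the diagrams above into Newick strings and lists which leaves are not reference species. The Newick strings and the helper names `leaf_names` and `placed_species` are illustrative assumptions and are not part of read2tree; the regular expression only handles Newick without branch lengths or internal node labels.

```python
import re

# Newick strings transcribed by hand from the two diagrams above
# (illustrative only, not actual read2tree output).
SINGLE = "((A,unknown_species),(C,B));"
MULTIPLE = "((A,unknown_species1),(unknown_species2,B));"


def leaf_names(newick):
    """Return leaf labels of a Newick string that has no branch lengths
    and no internal node labels."""
    return re.findall(r"[^(),;\s]+", newick)


def placed_species(newick, references):
    """Return the leaves that are not in the set of reference species."""
    return [name for name in leaf_names(newick) if name not in references]


references = {"A", "B", "C"}
print(placed_species(SINGLE, references))
print(placed_species(MULTIPLE, references))
```

In single species mode the list contains one name; in multiple species mode it contains one name per sequenced sample.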

Here, the species tree of A, B, and C is considered known, as provided by the marker genes, i.e. orthologous groups. In both cases you can obtain the marker genes from the OMA browser by following the instructions on our wiki page. You can also provide read2tree with your own set of orthologous groups in FASTA format.
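If you prepare your own orthologous groups, a quick sanity check of the FASTA files before running the tool can save time. The sketch below assumes one FASTA file per orthologous group in a single directory and a `.fa` suffix; both the layout and the function names `read_fasta` and `check_groups` are assumptions made for illustration. The checks are generic FASTA hygiene, not requirements documented by read2tree.

```python
from pathlib import Path


def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_groups(marker_dir, suffix=".fa"):
    """Return {file name: [problems]} for every FASTA file in marker_dir.

    Assumed layout (hypothetical): one file per orthologous group.
    """
    report = {}
    for path in sorted(Path(marker_dir).glob(f"*{suffix}")):
        records = read_fasta(path)
        problems = []
        if len(records) < 2:
            problems.append("fewer than two sequences")
        headers = [header for header, _ in records]
        if len(set(headers)) != len(headers):
            problems.append("duplicate headers")
        if any(not sequence for _, sequence in records):
            problems.append("empty sequence")
        report[path.name] = problems
    return report
```

Calling `check_groups("marker_genes")` on a hypothetical `marker_genes` directory returns a dictionary in which an empty list means no problem was found for that file.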

To answer your question: the current implementation of read2tree cannot handle single-cell data when it comes to mosaic / somatic human mutations. However, it may be possible to extend the method to do so in the future.

If I have misunderstood your question, I would be happy to discuss it further.

Best regards, Sina