Open BiodivGenomic opened 4 years ago
Hi
What version of OrthoFinder were you using?
All the best David
Hi, I'm using version 2.4.0. Thanks! Damien
Hi Damien
By default the species tree will be generated using STAG and will have support values. There is a fallback method which is employed if the data is too limited to use this method. In this case a message is printed, "Using fallback species tree inference method". This occurs if there are fewer than 100 orthogroups with all species present. I wonder if that might be what occurred here?
All the best David
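To check whether a run falls below the threshold described above, you can count the orthogroups in which every species has at least one gene. The sketch below is only an illustration: it assumes a tab-separated gene count table laid out like OrthoFinder's `Orthogroups.GeneCount.tsv` (first column the orthogroup ID, one column per species, and a trailing `Total` column). The function name and the example table are hypothetical, and the value 100 is simply the figure quoted in this thread.

```python
import csv
import io

# Threshold quoted in this thread: fewer than 100 orthogroups with all
# species present triggers the fallback species tree method.
STAG_MIN_ORTHOGROUPS = 100


def count_complete_orthogroups(handle):
    """Count orthogroups in which every species has at least one gene.

    Assumes a tab-separated table: orthogroup ID, one count column per
    species, and optionally a final "Total" column (which is skipped).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    species_cols = [i for i, name in enumerate(header)
                    if i > 0 and name != "Total"]
    complete = 0
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        if all(int(row[i]) > 0 for i in species_cols):
            complete += 1
    return complete


# Tiny made-up table, for illustration only.
EXAMPLE_TABLE = (
    "Orthogroup\tspA\tspB\tspC\tTotal\n"
    "OG0\t1\t2\t1\t4\n"
    "OG1\t1\t0\t3\t4\n"
    "OG2\t2\t1\t1\t4\n"
)

n_complete = count_complete_orthogroups(io.StringIO(EXAMPLE_TABLE))
uses_fallback = n_complete < STAG_MIN_ORTHOGROUPS
print(n_complete, "complete orthogroups; fallback expected:", uses_fallback)
```

For a real run, pass an open file handle for the gene count table instead of the in-memory example.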
Hi David, indeed, that could be the explanation, as I have only 34 of these groups... Could you please describe this fallback species tree inference method a little? Is there a way to get support values with it? Thanks in advance!
Details on the method are given here: https://github.com/davidemms/OrthoFinder#species-tree-inference
Typically this situation arises when OrthoFinder has been provided with incomplete data e.g. only a subset of the genes in a species rather than all genes. If you have the extra information then you should provide it and OrthoFinder will usually be able to find enough data to calculate support values. That said, I see you're using transcriptomes too so I realise that might be the cause of the incomplete data instead? How many input species do you have and how many genes are there in each?
If you are limited in this respect then one thing you could try is the "-M msa" option, which uses tree inference via multiple sequence alignments. By default this will use FastTree and will give you Shimodaira-Hasegawa support values. Additionally, if you wanted, you could take the species tree alignment produced here and run any other tree inference program on it (e.g. IQTREE with the option "-bb 1000") to get bootstrap support values.
All the best David
Hello, thanks, I will try that. However, I would also like to use STAG for the tree reconstruction, and so get above the threshold for the number of orthogroups with all species included... Is there a way to identify the species that reduce this value the most (and therefore the ones I would preferably remove to increase the number of orthogroups with all species included)? The file with the count of orthogroups per species could be a start, but I don't think the number of orthogroups in a species is directly linked to the number of orthogroups containing all the other species once that particular species is excluded... maybe by crossing it with another file? Thanks in advance for your help!
Yes, you're right about using the orthogroups per species file. I think you can use a few Excel formulae to get the answer:
All the best David
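The formulae themselves are not reproduced in this thread. As one possible way to do the same calculation, here is a minimal Python sketch. It assumes a tab-separated gene count table laid out like OrthoFinder's `Orthogroups.GeneCount.tsv` (orthogroup ID, one column per species, trailing `Total` column); the function name and the example table are hypothetical. For each species it reports how many orthogroups would contain all remaining species if that species were dropped. This is only an estimate from the existing orthogroups, since rerunning OrthoFinder without a species may change the orthogroups themselves.

```python
import csv
import io


def complete_without_each_species(handle):
    """Return (baseline, per_species) for a gene count table.

    baseline:    orthogroups with every species present.
    per_species: for each species, the number of orthogroups that have
                 all *other* species present, i.e. the count you would
                 get if that species were left out.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    cols = [(i, name) for i, name in enumerate(header)
            if i > 0 and name != "Total"]
    sole_missing = {name: 0 for _, name in cols}
    baseline = 0
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        missing = [name for i, name in cols if int(row[i]) == 0]
        if not missing:
            baseline += 1
        elif len(missing) == 1:
            # Only this species stops the orthogroup from being complete.
            sole_missing[missing[0]] += 1
    per_species = {name: baseline + n for name, n in sole_missing.items()}
    return baseline, per_species


# Tiny made-up table, for illustration only.
EXAMPLE_TABLE = (
    "Orthogroup\tspA\tspB\tspC\tTotal\n"
    "OG0\t1\t2\t1\t4\n"
    "OG1\t1\t0\t3\t4\n"
    "OG2\t2\t0\t1\t3\n"
    "OG3\t0\t1\t1\t2\n"
    "OG4\t0\t0\t5\t5\n"
)

baseline, per_species = complete_without_each_species(io.StringIO(EXAMPLE_TABLE))
# List the species whose removal would help most first.
for name, count in sorted(per_species.items(), key=lambda kv: -kv[1]):
    print(f"without {name}: {count} complete orthogroups (currently {baseline})")
```

Species at the top of the list are the ones that most often are the only species absent from an orthogroup, so they are the first candidates to look at.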
Hi David, I would also like to use the alignment generated by OrthoFinder to build a species tree with bootstrap values. Can this be done within OrthoFinder, or do I need to run IQTREE separately? I am a little confused about which file should be used as the alignment input for IQTREE. By the way, I am using OrthoFinder version 2.5.2. Thanks in advance!
Hi
I would recommend using IQTREE directly. There is a concatenated multiple sequence alignment file called MultipleSequenceAlignments/SpeciesTreeAlignment.fa that you can use if you selected the -M msa option with OrthoFinder.
Best wishes David
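Put together, the two steps discussed in this thread could look roughly like the sketch below, which only builds and prints the command lines. The `orthofinder -f ... -M msa` call and the IQTREE `-bb 1000` option come from the posts above. The executable name `iqtree`, the use of `-s` to pass the alignment, and the placeholder results path are assumptions to adapt to your installation and run.

```python
import shlex
from pathlib import Path


def orthofinder_msa_command(input_dir):
    """Command line for an OrthoFinder run with MSA-based tree inference."""
    return ["orthofinder", "-f", str(input_dir), "-M", "msa"]


def iqtree_bootstrap_command(results_dir, iqtree_exe="iqtree", replicates=1000):
    """Command line for IQTREE on OrthoFinder's concatenated alignment.

    results_dir is the OrthoFinder results directory of the "-M msa" run.
    iqtree_exe is an assumed executable name; adjust it if your install
    uses a different one.
    """
    alignment = (Path(results_dir) / "MultipleSequenceAlignments"
                 / "SpeciesTreeAlignment.fa")
    return [iqtree_exe, "-s", str(alignment), "-bb", str(replicates)]


# "path/to/results_dir" is a placeholder, not a real OrthoFinder path.
of_cmd = orthofinder_msa_command("my_data_folder")
iq_cmd = iqtree_bootstrap_command("path/to/results_dir")
print(shlex.join(of_cmd))
print(shlex.join(iq_cmd))
```

To actually run a step, pass the list to `subprocess.run(cmd, check=True)` once the paths point at your own data.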
Hi David, Thank you for your suggestion. I have tried and it works for me.
Best, Yichun
Hello, I'm testing OrthoFinder with a set of peptide sequences from assembled genomes and transcriptomes, and the species tree ended up lacking support values. I used the "normal" command line to run OrthoFinder ("orthofinder -f my_data_folder"), but is there an option to ensure support values will be computed? As I have already run the entire pipeline, I would also like to know if there is a way to get the support values from what was already computed (it took a long time to get orthologs and gene trees, so if I can save this time...). Thanks in advance!