Open Calvin2077 opened 1 year ago
Hey there, @Calvin2077!
Thanks for the kind words :)
A genome shouldn’t be dropped due to the redundancy estimate, that’s just a notice. Are you sure it’s not in the final tree? If it isn’t, it may be getting dropped because not enough target genes were found, which we can adjust.
Hello AstrobioMike,
You're welcome, and thank you for getting back to me so fast, it is much appreciated. I checked my tree and I am indeed missing a species.
Moreover, when I run the command "GToTree -f Untitled2.txt -o hope_new -H Archaea" it says it is only using 40 out of my 41 species, despite my list (Untitled2.txt) containing all of them.
I don't know if it's related but the one that is missing is the last one on my list.
Hmm, strange. Any chance you’d be able to share the fasta files and the input Untitled.txt file with me at MikeLee\<at>bmsis.org so I can take a look? I’ll delete them right after testing of course
@Calvin2077 and i tracked down that the issue was the input file listing the paths to the genomes didn't have a line-return character at the end of the file, and the last one was being left off
i need to think about how to put in a check for this
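To illustrate how a missing final line-return can cause the last entry to be left off, here is a minimal sketch. It assumes the list is read with a plain `while read` loop, which is a common shell pattern but not confirmed here to be what GToTree does internally. The function names and the file names in the demo are hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical reader: counts entries with a plain `while read` loop.
# `read` returns non-zero on a final line that lacks a newline, so the
# loop body never runs for that last entry.
count_lines_read() {
    local file="$1" n=0 line
    while IFS= read -r line; do
        n=$((n + 1))
    done < "$file"
    printf '%s\n' "$n"
}

# Tolerant variant: `read` still fills $line on an unterminated final
# line, so also enter the loop body whenever $line is non-empty.
count_lines_read_safe() {
    local file="$1" n=0 line
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
    done < "$file"
    printf '%s\n' "$n"
}

# Demo list with three entries and no newline after the last one.
list="$(mktemp)"
printf 'a.faa\nb.faa\nc.faa' > "$list"

naive_count="$(count_lines_read "$list")"
safe_count="$(count_lines_read_safe "$list")"
echo "plain loop read $naive_count entries, tolerant loop read $safe_count"

rm -f "$list"
```

With the three-entry demo file, the plain loop reports 2 entries and the tolerant loop reports 3, which mirrors the "40 out of 41" symptom described above.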
Note for myself

I currently run a `dos2unix`/`cmp` check on each input file, e.g.:

I can add the `--add-eol` argument so they will auto-add an end-of-line to end of file if it's not there. That will address this. (Add it to the `cmp` checks too, so it's still only run if needed)
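For anyone who wants to guard their own input list in the meantime, the following is a rough sketch of the same idea in plain bash, without `dos2unix`. The helper name `ensure_trailing_newline` and the demo file contents are hypothetical and not part of GToTree; the helper only appends a newline when the last byte of the file is not already one.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of GToTree): append a trailing newline
# to a file only if its last byte is not already a newline.
ensure_trailing_newline() {
    local file="$1"
    # An empty file has no last line to terminate, so leave it alone.
    [ -s "$file" ] || return 0
    # Command substitution strips a trailing newline, so the result is
    # empty exactly when the last byte is a newline.
    if [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
}

# Demo on a throwaway list whose last line is unterminated.
demo_file="$(mktemp)"
printf 'genome_1.faa\ngenome_2.faa' > "$demo_file"

# `wc -l` counts newline characters, so it undercounts before the fix.
lines_before="$(wc -l < "$demo_file" | tr -d ' ')"
ensure_trailing_newline "$demo_file"
lines_after="$(wc -l < "$demo_file" | tr -d ' ')"

echo "newline-terminated lines before: $lines_before, after: $lines_after"
rm -f "$demo_file"
```

Running the helper a second time on the same file changes nothing, so it is safe to call on every input list before passing it to `-f`.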
Hello!
I recently discovered your GToTree and have found it super helpful for my master's project. Your clear installation guide and instructions have been a huge help in getting it to work on my laptop, so thank you very much.
I did a practice run of my species and one was dropped due to the redundancy being greater than 10%. As I need to include all my species for my project, is there a way to increase the redundancy threshold when using amino acid FASTA files?
Thanks