DessimozLab / read2tree

a tool for inferring species tree from sequencing reads
MIT License

Issue with generating files #26

Closed huangjz2 closed 1 year ago

huangjz2 commented 1 year ago

Hi, thank you very much for providing such a convenient new tool, Read2Tree. I would be glad if you could answer my question.

On OMA, I selected three species as the reference, one of which is a subspecies of the middle one. However, in the final step, an error message told me that the sc.txt file for a certain species compared with that subspecies was not found. Such a problem did not appear in the first run, but this time a new species was added (i.e. the species for which the file was not generated). In the specific information of the parameters you provided, I found that I can choose to change the mode to "cov", ignore, or delete a reference species. My question is, are these steps performed before the alignment? And can I select the cov file instead of the sc file during the tree generation process? If this is not possible, do I need to select a different reference and start over again?

These are the questions I wanted to ask. Thank you very much!

sinamajidian commented 1 year ago

Dear @huangjz2

Thanks for using read2tree :) I'm afraid I didn't follow your description of "the middle subspecies". It would be great if you could elaborate on this.

I would suggest selecting many more species in the OMA browser, including a few outgroup species, to have a comprehensive reference set. It would be great if you could give us the list of species and the full error. I'm not sure whether removing one species out of three would help. Another point is that read2tree reuses files from a previous run, which can make later steps much harder to debug. So I would suggest starting over with a new analysis (when the dataset is not huge).

Best regards, Sina

huangjz2 commented 1 year ago

Dear @sinamajidian, thank you very much for your reply! I am building a tree for some chicken varieties, so I chose gallus (chick) and ANAPP (a type of duck) on OMA, and also included a subspecies of duck (ANAPL). According to the description of the output files, there should be two types of files: sc.txt and cov.txt. However, in my output, some have only a cov.txt file. That is why the error message says that sample1_sc.txt could not be found.

Regarding your point about read2tree reusing files from a previous run: if a run stops halfway, as in the case I mentioned, do I need to start over? I ask because re-executing the command still runs.

I'm very sorry to take up some of your time.

Best regards, Huang

sinamajidian commented 1 year ago

My pleasure! It depends. When there are issues or errors in the run, I would suggest starting over. Thanks for the clarification on the species names. To find out why this happens, it would be great if you could share the full mplog.log file with us, in addition to the output and errors from when you ran the command line.
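
Before sharing logs, it may help to list exactly which cov.txt files lack a matching sc.txt file. The following is a minimal sketch and not part of read2tree: the helper name `find_missing_sc` is hypothetical, and it assumes only what the thread describes, namely that each `<name>_cov.txt` should sit next to a `<name>_sc.txt` somewhere under the output folder.

```python
from pathlib import Path


def find_missing_sc(output_dir):
    """Return every *_cov.txt file that has no sibling *_sc.txt file.

    Assumption (taken from the thread, not from the read2tree source):
    files are named <name>_cov.txt and <name>_sc.txt and live in the
    same directory somewhere below output_dir.
    """
    missing = []
    for cov in sorted(Path(output_dir).rglob("*_cov.txt")):
        # Swap the "_cov.txt" suffix for "_sc.txt" to get the expected partner.
        sc = cov.with_name(cov.name[: -len("_cov.txt")] + "_sc.txt")
        if not sc.exists():
            missing.append(cov)
    return missing


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the read2tree output folder as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for cov in find_missing_sc(root):
        print(f"no sc file next to {cov}")
```

The resulting list, together with mplog.log, shows which samples and reference species are affected.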

huangjz2 commented 1 year ago

Thank you again for your response. As there aren't too many species involved, I have decided to re-run the program. Wishing you all the best in your research.