Closed (ebervilla closed this issue 2 years ago)
Hi there, Eber!
Thanks for the kind words :) Sorry it’s giving you trouble right now.
I recently changed things so that GToTree downloads the specified SCG set the first time it’s used and then stores it (this saves a ton of space for those who don’t need all of them). It sounds like something may not be working there. But I think you forgot to attach the image (GitHub needs a reminder like Gmail's, which regularly saves me from that, ha). Can you add it, or copy and paste the output text for me?
Could you also send the output of the following:
GToTree -v
gtt-data-locations check
gtt-hmms
ls ${GToTree_HMM_dir}
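If it is easier to attach a single file, the four commands above could be captured in one report. This is only a rough sketch: `collect_diagnostics` and the file name `gtt-diagnostics.txt` are hypothetical names, not part of GToTree, and the example call assumes `GToTree_HMM_dir` is exported in the active environment.

```shell
# Hypothetical helper (not part of GToTree): run each command line given as
# an argument and append its output, including errors, to one report file.
collect_diagnostics() {
    local outfile="$1"
    shift
    : > "$outfile"                          # start with an empty report
    local cmd
    for cmd in "$@"; do
        printf '### %s\n' "$cmd" >> "$outfile"
        # sh -c lets each entry be a full command line; keep going on failure
        sh -c "$cmd" >> "$outfile" 2>&1 || printf '(exit status %s)\n' "$?" >> "$outfile"
        printf '\n' >> "$outfile"
    done
}

# Example call with the four commands requested above:
# collect_diagnostics gtt-diagnostics.txt 'GToTree -v' 'gtt-data-locations check' 'gtt-hmms' 'ls ${GToTree_HMM_dir}'
```

The report file can then be attached to the issue in place of a screenshot.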
Thanks!
Hi Mike
Yes, that was definitely needed. I have attached the outputs here:
output_gtotree_alphaproteobacteria.txt
GToTree-v_gtt-data-locations_check.txt
Thank you very much in advance
Best
Thanks, Eber!
Either way, this will help me implement a check that will catch this before trying to run everything else, and give an actually useful error message...
I'm wondering if the file started downloading but didn't finish, and now GToTree isn't trying to download it again because it finds a file already there. See what size it is with:
ls -lh ${GToTree_HMM_dir}Alphaproteobacteria.hmm
If it's fully there, it should be ~8.3M.
Either way, try removing that with:
rm ${GToTree_HMM_dir}Alphaproteobacteria.hmm
And then run your main GToTree command again where you're specifying -H Alphaproteobacteria, and see if you get the same thing after removing that file. I'm hoping it will download it again, and then you'll see this as the program starts, showing the expected number of targets:
HMM source to be used:
- Alphaproteobacteria.hmm (117 targets)
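The size check and removal described above could be combined into one step. This is a minimal sketch, not something GToTree provides: `hmm_file_looks_complete` is a hypothetical helper, and the 8,000,000-byte threshold is an assumption chosen to sit a little below the ~8.3M quoted above.

```shell
# Hypothetical helper (not part of GToTree): succeed only if the file exists
# and is at least min_bytes in size.
hmm_file_looks_complete() {
    local file="$1" min_bytes="$2"
    [ -f "$file" ] || return 1
    local size
    # wc -c prints the size in bytes; tr strips the padding some systems add
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" -ge "$min_bytes" ]
}

# Assumed threshold: 8,000,000 bytes, a little below the ~8.3M quoted above.
# hmm="${GToTree_HMM_dir}Alphaproteobacteria.hmm"
# if ! hmm_file_looks_complete "$hmm" 8000000; then
#     rm -f "$hmm"    # remove the partial file so GToTree downloads it again
# fi
```

Only a file that is clearly too small gets removed, so a complete download is left alone.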
🤞
Hi Mike,
It worked 👍
Thank you very much!!
-------------------------------- RUN INFO ---------------------------------
Input genome sources include:
- NCBI accessions listed in Rhizobium_refseq_accessions.txt (38 genomes)
- Fasta files listed in fasta_files.txt (2 genomes)
Total input genomes: 40
HMM source to be used:
- Alphaproteobacteria.hmm (117 targets)
Options set:
- The output directory has been set to "Syn-GtoTree-out_3/"
- Taxonkit will be used to add NCBI taxonomy info to labels where possible
- Lineage information added to labels will be Species
- Number of jobs to run during parallelizable steps has been set to 4
Great!
I'm going to leave this open until I implement a check to prevent this in the future.
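One way such a check might look, purely as an illustration and not GToTree's actual implementation: count the profiles in the stored HMM file and compare the count with the expected number of targets. The function names below are hypothetical. The sketch relies on two properties of HMMER3 text files: each profile has one line starting with `NAME`, and each profile record ends with a `//` line.

```shell
# Hypothetical check (not GToTree's implementation): count the profiles in a
# HMMER3 text file and compare with the expected number of targets.
count_hmm_targets() {
    # Each profile has exactly one line starting with "NAME "
    grep -c '^NAME ' "$1"
}

hmm_has_expected_targets() {
    local file="$1" expected="$2"
    [ -s "$file" ] || return 1                       # missing or empty file
    # A complete file ends with the "//" record terminator
    [ "$(tail -n 1 "$file")" = "//" ] || return 1
    [ "$(count_hmm_targets "$file")" -eq "$expected" ]
}

# Example, using the target count reported in the run info above:
# hmm_has_expected_targets "${GToTree_HMM_dir}Alphaproteobacteria.hmm" 117 \
#     || echo "Alphaproteobacteria.hmm looks incomplete; remove it and re-run"
```

A truncated download would usually fail either the terminator test or the count, so the problem could be reported before the rest of the run starts.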
Thanks, Eber!
Hi Mike
I hope you are doing well.
I have been trying to construct a phylogenomic tree of rhizobia strains in GToTree using the Alphaproteobacteria gene set; however, GToTree apparently does not find any genes (see the attached picture of my console). Do you know what I could do to construct the tree using the Alphaproteobacteria genes?
Thank you very much for your help and for all the effort you put into your awesome tutorials
Best
Eber