veg / hyphy-analyses

HyPhy standalone analyses
MIT License

Output are not writing on the results.json file - BUSTED #45

Closed swantan closed 10 months ago

swantan commented 10 months ago

Hi, I want to obtain a dN/dS value for a particular region of a bacterial surface protein. I have 2000+ sequences (~50 aa residues each) and ran BUSTED locally on my computer on a codon-based multiple sequence alignment. Here is my command:

hyphy busted --alignment ../codon_based_msa/codon_aligned_clsto.fasta --tree ../iqtree/codon_aligned_clsto.fasta.treefile --output results_clsto.json &

However, I ran into two issues: I do not see any output written to the specified results file, and I do not know how to estimate the run time. I expected the analysis to be computationally intensive, so I ran it in the background, but the program seems to run forever.

I have searched for help but had no luck. It would be very helpful if you could guide me on these issues.

Thank you! Much appreciated!!

spond commented 10 months ago

Dear @swantan,

  1. The output file should go to the current working directory, but it will not be created until the busted run has made significant progress. Have you tried using find to see where it might be instead?

  2. For 2000+ sequences and a very short alignment (~50aa), busted may not be the best choice, as it will run for a long time and likely overfit the data. If what you are after is a dN/dS estimate, I would suggest using FitMG94.bf (https://github.com/veg/hyphy-analyses/tree/master/FitMG94):

hyphy /path/to/FitMG94.bf -alignment ../codon_based_msa/codon_aligned_clsto.fasta --tree ../iqtree/codon_aligned_clsto.fasta.treefile --output results_clsto.json

You will get something like this (also written to the JSON file in the .['fits']['Standard MG94'] object):

image
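The `fits` / `Standard MG94` object mentioned above can be printed from the results file once the run has finished. The following is only a minimal sketch: the helper name `show_mg94_fit` is hypothetical, it assumes that either `jq` or `python3` is installed, and it makes no assumption about which fields sit inside the object.

```shell
# Hypothetical helper (not part of HyPhy): print the "Standard MG94" entry of
# the "fits" object from a results JSON file.
show_mg94_fit() {
  json=$1
  if command -v jq >/dev/null 2>&1; then
    # jq spelling of the path .['fits']['Standard MG94']
    jq '.fits["Standard MG94"]' "$json"
  else
    # Fallback, assuming python3 is available when jq is not.
    python3 -c 'import json, sys; print(json.dumps(json.load(open(sys.argv[1]))["fits"]["Standard MG94"], indent=2))' "$json"
  fi
}

# Only try to read the file if it exists and is non-empty.
if [ -s results_clsto.json ]; then
  show_mg94_fit results_clsto.json
fi
```

The `-s` test skips a results file that is missing or still empty, which is the state of the file while an analysis is in progress.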

Best, Sergei
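On the first point in the reply above (using find to locate the output), a minimal sketch of such a search follows. The helper name `report_outputs` is hypothetical; the file name is the one passed to `--output` in this thread, and `-size +0` is used to separate files that already have content from files that are still empty.

```shell
# Hypothetical helper: search a directory tree for a results file by name and
# report whether each match already has content.
report_outputs() {
  dir=$1
  name=$2
  # -size +0 matches files occupying more than zero blocks, i.e. non-empty.
  find "$dir" -type f -name "$name" -size +0 -exec printf 'has content: %s\n' {} \;
  # The negated test matches files that exist but are still empty.
  find "$dir" -type f -name "$name" ! -size +0 -exec printf 'still empty: %s\n' {} \;
}

# Search from the current working directory.
report_outputs . 'results_clsto.json'
```

A match reported as still empty is consistent with a run that has not yet progressed far enough to write its results.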

swantan commented 10 months ago

Dear @spond,

I really appreciate your prompt response and helpful suggestions! Thank you!!

  1. For the busted output file, I am able to locate the file but it's empty.

  2. I tried running FitMG94.bf, but still no luck. The error log shows the message below. Does that mean the alignment file does not match the tree file?

Error:'../codon_based_msa/cdc_all_emm_nt_codon_aligned_clusto.fasta'` is not a valid choice passed to 'Accept rooted trees if present' ChoiceList using redirected stdin input or keyword arguments. Valid choices are Yes, No in call to ChoiceList(selection, description, 1, NO_SKIP, option_set); '../codon_based_msa/cdc_all_emm_nt_codon_aligned_clusto.fasta' is not a valid choice passed to 'Accept rooted trees if present' ChoiceList using redirected stdin input or keyword arguments. Valid choices are Yes, No in call to ChoiceList(selection, description, 1, NO_SKIP, option_set);

Also, if you don't mind, may I ask whether the choice of alignment tool (e.g. clustalo, mafft) matters?

Thank you again! Swan

swantan commented 10 months ago

Dear @spond, sorry, I think I figured out the error: the option has to be --alignment rather than -alignment, as shown in your earlier command.

Best, Swan
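The single-dash spelling described above is easy to miss in a long command line. As a rough illustration, a small pre-flight check could flag it before the analysis is launched. The helper `check_long_options` below is hypothetical and not part of HyPhy; it only inspects how the arguments are spelled and does not know which options a given analysis accepts.

```shell
# Hypothetical pre-flight helper (not part of HyPhy): flag arguments that look
# like long keyword options written with one dash instead of two.
check_long_options() {
  for arg in "$@"; do
    case "$arg" in
      --*) ;;  # already uses the double-dash form
      -[A-Za-z]?*)
        # One dash followed by a multi-letter word, e.g. -alignment
        printf 'suspicious: %s (did you mean -%s?)\n' "$arg" "$arg"
        ;;
    esac
  done
}

# The argument list from the FitMG94 command earlier in the thread.
check_long_options -alignment ../codon_based_msa/codon_aligned_clsto.fasta \
  --tree ../iqtree/codon_aligned_clsto.fasta.treefile \
  --output results_clsto.json
```

For the argument list shown, this prints `suspicious: -alignment (did you mean --alignment?)`; with all three options spelled with two dashes it prints nothing.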