jamiemcg / BUSCO_phylogenomics

BUSCO Phylogenomics | Utility script to construct species phylogenies using BUSCO proteins
MIT License

"*run_*" in directory name #12

Open jamiemcg opened 7 months ago

jamiemcg commented 7 months ago

Having "*run_*" in the BUSCO output directory name causes an error (e.g. BUSCO5_rerun_XYZ_output)

https://github.com/jamiemcg/BUSCO_phylogenomics/blob/0589f6ce883c3ebfa3b7521339d8caaf7f8bc526/count_buscos.py#L44-L49

(The script then looks for BUSCO sequences in the "logs" directory because it mistakes it for the run directory, e.g. "run_alveolata_odb10".)
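
A minimal illustration of the failure mode, assuming the current check tests the joined path rather than the subdirectory's own name (an assumption, since the linked lines are not reproduced here). The directory names are the examples from this issue; `results` is just a hypothetical holder for the comparison.

```python
from os.path import basename, join

sample_dir = "BUSCO5_rerun_XYZ_output"  # example output directory name from this issue

results = {}
for sub in ["logs", "run_alveolata_odb10"]:
    path = join(sample_dir, sub)
    # Whole path: the "run_" inside "rerun_" in the parent name matches for every subdirectory.
    matches_path = "run_" in path
    # Subdirectory name only: just the real run directory matches.
    matches_name = "run_" in basename(path)
    results[sub] = (matches_path, matches_name)
    print(sub, matches_path, matches_name)
```

With the whole-path test, "logs" is accepted as a run directory, which would explain why the script goes looking for sequences inside it.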

Try something like this to fix:

         if isdir(i):
             for j in listdir(i):
                 k = join(i, j)
                 # Check the subdirectory name (j), not the full path (k)
                 if isdir(k) and "run_" in j:
                     busco_samples.append(k)
                     busco_sample_names.append(basename(i))
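
To try the snippet outside the script, here is a rough, self-contained sketch. The function name `find_run_dirs`, the outer loop over the working directory, and the temporary directory layout are assumptions made for illustration; only the inner loop comes from the suggestion above.

```python
import tempfile
from os import listdir, makedirs
from os.path import basename, isdir, join


def find_run_dirs(working_directory):
    """Return (run_dirs, sample_names) for BUSCO outputs under working_directory."""
    busco_samples = []
    busco_sample_names = []
    # Assumed outer loop: one BUSCO output directory per sample.
    for entry in sorted(listdir(working_directory)):
        i = join(working_directory, entry)
        if isdir(i):
            for j in sorted(listdir(i)):
                k = join(i, j)
                # Test the subdirectory name j, so "run_" in the parent name is ignored.
                if isdir(k) and "run_" in j:
                    busco_samples.append(k)
                    busco_sample_names.append(basename(i))
    return busco_samples, busco_sample_names


# Throwaway layout whose sample directory name itself contains "run_".
root = tempfile.mkdtemp()
sample = join(root, "BUSCO5_rerun_XYZ_output")
makedirs(join(sample, "logs"))
makedirs(join(sample, "run_alveolata_odb10"))

run_dirs, names = find_run_dirs(root)
print([basename(d) for d in run_dirs], names)
```

In this layout only "run_alveolata_odb10" is picked up, and "logs" is skipped even though its parent directory name contains "run_". A stricter test such as `j.startswith("run_")` might also work, but that goes beyond what is proposed here.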

aberaslop commented 7 months ago

Dear Jamie, I hope you are doing well! I am encountering an error when running BUSCO_phylogenomics, similar to issue #9 (https://github.com/jamiemcg/BUSCO_phylogenomics/issues/9). I wonder whether it is related to the issue you have just opened.

I used BUSCO v5.0, and each of my output folders had the BUSCO results in a directory called run_sordariomycetes_odb10. To avoid confusion, I renamed each of those directories to the name of the proteome being analyzed. I then ran BUSCO_phylogenomics, and the program fails, saying that directories are missing. I suspect the reason is the same as in issue #9 (https://github.com/jamiemcg/BUSCO_phylogenomics/issues/9): it is looking for files inside every single directory.

This is the error I obtain:

    FileNotFoundError: [Errno 2] No such file or directory: '/scistor/guest/zrs382/data/fusarium/phylogenomics/20240207_BUSCO_output/run_GCA_003615085/logs/busco_sequences/single_copy_busco_sequences'

Is this the problem you are describing in this issue? And is there a way I can work around it? Any help is greatly appreciated.

Thank you so much for your help, and thank you again for this great program! Best, L.