Matteopaluh / KEMET

KEGG Module Evaluation Tool

Error regarding output directory #21

Open timeresistance1996 opened 5 months ago

timeresistance1996 commented 5 months ago

Thank you for your code, but I encountered an issue when running it.

This is how I used it: the 'eggnog' directory contains a file named 'emapper.annotations', and the 'genome' directory contains a file named 'genome.fna'. The command I ran was: python ./kemet.py -I ./eggnog -a eggnog --skip_hmm --skip_gsmm ./genome -q --log --path_output ./output

However, an error occurred: FileNotFoundError: [Errno 2] No such file or directory: './output/ktests/'

Strangely, when I manually created the entire folder, the code seemed to run smoothly, but no output file was generated.

I got the same error even when using your test files.

Matteopaluh commented 5 months ago

Hello!

I believe the error has two causes: 1) a FASTA file named "genome.fna" requires an annotation file named "genome.emapper.annotations" in the eggnog directory; 2) the -O (or --path_output) argument requires an absolute path, as per lines 2406-2407 of kemet.py:

parser.add_argument('-O', '--path_output',
                    help='''Absolute path to ouput file(s) FOLDER.''', default = dir_base)
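As an illustration of how a relative value such as ./output could be tolerated, here is a minimal sketch. It assumes a hypothetical helper, prepare_output_dir, which is not part of kemet.py; the only subfolder name used is "ktests", taken from the traceback reported above.

```python
import os


def prepare_output_dir(path_output, subfolders=("ktests",)):
    """Return an absolute output path, creating it and its subfolders.

    Hypothetical helper, not part of kemet.py. "ktests" is the only
    subfolder name known from the reported traceback.
    """
    # Resolve relative paths such as "./output" against the current directory.
    abs_output = os.path.abspath(os.path.expanduser(path_output))
    for name in subfolders:
        # exist_ok=True makes repeated runs safe; missing parents are created.
        os.makedirs(os.path.join(abs_output, name), exist_ok=True)
    return abs_output
```

Calling prepare_output_dir("./output") before any file is written would avoid the FileNotFoundError for a missing folder, whether or not the user supplied an absolute path.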

With those points in mind, and following the suggested command-line pattern ./kemet.py [FASTA_file] -a [FORMAT] --hmm_mode [MODE] --gsmm_mode [MODE] (--skip_hmm) (etc.),

I'd rename the annotation file and pass the full output folder path before running it on the file of interest: python ./kemet.py ./genome/genome.fna -a eggnog --skip_hmm --skip_gsmm -I ./eggnog -O ABSOLUTE/PATH/TO/output -q --log
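The naming requirement can be checked before launching a run. The sketch below assumes the "<name>.emapper.annotations" convention described in this comment for the eggnog format; both function names are hypothetical and do not exist in kemet.py.

```python
from pathlib import Path


def expected_eggnog_annotation(fasta_path, annotation_dir):
    """Return the annotation path that matches a FASTA file's base name.

    Hypothetical helper: assumes the "<name>.emapper.annotations"
    convention for the eggnog format.
    """
    stem = Path(fasta_path).stem  # "genome.fna" -> "genome"
    return Path(annotation_dir) / f"{stem}.emapper.annotations"


def check_annotation(fasta_path, annotation_dir):
    """Raise a descriptive error if the matching annotation file is missing."""
    expected = expected_eggnog_annotation(fasta_path, annotation_dir)
    if not expected.is_file():
        raise FileNotFoundError(
            f"Expected annotation file {expected} for {fasta_path}"
        )
    return expected
```

For ./genome/genome.fna and ./eggnog, the expected file is eggnog/genome.emapper.annotations, so a file named just emapper.annotations would be reported as missing.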

Feel free to try it out with those adjustments and come back to me if there are still issues. I'll try to put further work into simplifying I/O file operations!

Best, Matteo

shibormi commented 4 months ago

Hello, I got this error. What might be causing it?

FileNotFoundError: [Errno 2] No such file or directory: 'KEGG_MODULES//kk_files/'

Matteopaluh commented 4 months ago

Hello,

I assume you executed kemet.py from a folder other than KEMET, or you deleted the KEGG_MODULES directory.

As of this writing (v1.0.0), the script can only be executed from the KEMET folder, because it depends on the KEGG Module diagram-like files (.kk files), which are stored in a path relative to the main script: KEMET/KEGG_MODULES/kk_files.

Right now, I/O operations on those files assume the main script is executed from its original directory; a reasonable update would be to generalize this. I'm certainly open to suggestions on this!
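One possible way to generalize this, as a sketch only: resolve the folder relative to the script file instead of the current working directory. The helper name kk_files_dir is hypothetical; inside kemet.py one would pass __file__ as the argument.

```python
from pathlib import Path


def kk_files_dir(script_file):
    """Locate KEGG_MODULES/kk_files next to the given script file.

    Hypothetical helper; in kemet.py the argument would be __file__.
    """
    # resolve() follows symlinks and makes the path absolute, so the result
    # does not depend on the directory the script was launched from.
    script_dir = Path(script_file).resolve().parent
    return script_dir / "KEGG_MODULES" / "kk_files"
```

With paths built this way, running the script from another folder would still find the .kk files, as long as KEGG_MODULES stays next to kemet.py.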

To hopefully resolve your issue, I would execute the script from the KEMET folder, as suggested in the README file. Let me know if this fix works; otherwise I think you'd need a separate issue (in that case, try looking through past closed issues first) 🙂

Best, Matteo

shibormi commented 4 months ago

Actually, I created a Singularity image for this, and the folder seems to be there inside the image. So I have no idea how this tool works.

Matteopaluh commented 4 months ago

I guess I'll need a little more info than what you provided then!

For instance, it would help if you shared the command you used to run kemet.py.

A high-level view of how the tool works is included in the manuscript (https://doi.org/10.1016/j.csbj.2022.03.015) and in this GitHub repository. In brief, KEMET takes previously computed functional annotations (stored in an input folder), organizes the KEGG KO information into more structured tables (using files from KEGG_MODULES/kk_files), and performs some further downstream analyses on genomic/MAG sequences.

I don't know much about Singularity, though, so I'm not really sure how much it matters for your specific problem here.

shibormi commented 4 months ago

Here is the command I used:

singularity exec KEMET.sif kemet.py inputs/AMR12223.fna -a kaas --skip_hmm --skip_gsmm -I Deliverables/inputs/ -O Deliverables/KEMET -q --log

I hope this helps sort out the issue.

Matteopaluh commented 4 months ago

@shibormi As I mentioned earlier, I'm not very familiar with Singularity, but I'm trying to identify potential issues.

As with the main issue raised here (by user timeresistance1996), I assume it could be because the -O (or --path_output) argument is not an absolute path, which kemet.py requires.

I'm not 100% sure whether that is possible within Singularity.

Secondly, the way you structured that command line implies that both your AMR12223.fna and AMR12223.txt (or AMR12223.ko) files are located in the Deliverables/inputs/ folder. I don't know whether that folder physically exists on disk or only exists "virtually".

Lastly, and related to your first message in this issue, I'm not entirely sure that Singularity preserves the KEGG_MODULES folder, or more generally what the folder structure looks like inside the image you created.
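To narrow this down, a small check run in the same environment as kemet.py could report which of the expected paths are actually visible. This is only a sketch: missing_paths is a hypothetical helper, and the paths in the usage example are the ones quoted in this thread.

```python
import os


def missing_paths(paths):
    """Return the subset of the given paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Paths taken from the command and error message quoted in this thread.
    to_check = [
        "KEGG_MODULES/kk_files",
        "Deliverables/inputs",
        "Deliverables/KEMET",
    ]
    for path in missing_paths(to_check):
        print("missing:", os.path.abspath(path))
```

Because relative paths are resolved against the current working directory, the printed absolute paths also show where the script is really being run from.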

If you have suggestions on how to modify the kemet source code to help with this issue, feel free to share them; I'll be receptive!

Best, Matteo