NickJD / StORF-Reporter

Package to extract Unannotated Regions from prokaryotic genomes and report coding and pseudogenised genes delimited by stop codons, named StORFs (Stop - Open Reading Frames)
GNU General Public License v3.0

Multi-Genome GFF Enhancements #1

Closed. elchaarn closed 6 months ago

elchaarn commented 9 months ago

This is a valuable tool for enhancing gene annotations. I do have one suggestion: Bakta provides its output in the form of individual directories for each genome, each containing GFF and other files. It might be more user-friendly if StORF-Reporter could directly utilize the Bakta output without requiring users to create a new directory specifically for the GFF files when supplementing multiple Prokka/Bakta outputs.

NickJD commented 7 months ago

Finally got round to doing this. Please check out v1.3.0: https://github.com/NickJD/StORF-Reporter/releases/tag/v1.3.0

Now available on pip too - pip install StORF-Reporter --upgrade

elchaarn commented 6 months ago

Thanks for the update. I tried running this as "Multiple_Out_Dirs" and it did run; however, the output FASTA file is empty. The output GFF file looks fine, and I do get an output FASTA file containing the StORFs when using the “-sout” option.

elchaarn commented 6 months ago

Additionally, when I try to run "Multiple_Out_Dirs", the code crashes if my input directory contains any subdirectory other than the outputs from Bakta/Prokka. Can you modify it so that a subdirectory with no GFF or FNA file is ignored and just noted as a warning/error? To clarify, I would like the tool to go into each directory and run only IF it finds one matching pair of GFF and FNA files.
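
For illustration, the behaviour requested above could be sketched roughly as follows. This is not StORF-Reporter's actual code: the function name `find_annotation_pairs`, the accepted extensions (`.gff`, `.gff3`, `.fna`), and the rule "exactly one GFF and one FNA per subdirectory" are all assumptions made for the example.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def find_annotation_pairs(parent_dir):
    """Collect (gff, fna) pairs from the subdirectories of parent_dir.

    Hypothetical helper: a subdirectory qualifies only if it holds exactly
    one GFF file and exactly one FNA file. Anything else is skipped with a
    warning instead of raising an error.
    """
    pairs = []
    for sub in sorted(Path(parent_dir).iterdir()):
        if not sub.is_dir():
            continue  # plain files in the parent directory are ignored
        # Assumed extensions: .gff or .gff3 for annotation, .fna for sequence
        gffs = sorted(sub.glob("*.gff")) + sorted(sub.glob("*.gff3"))
        fnas = sorted(sub.glob("*.fna"))
        if len(gffs) == 1 and len(fnas) == 1:
            pairs.append((gffs[0], fnas[0]))
        else:
            logger.warning(
                "Skipping %s: expected one GFF and one FNA, found %d and %d",
                sub, len(gffs), len(fnas),
            )
    return pairs
```

Each returned pair could then be handed to whatever per-genome step is being run, so an unrelated subdirectory only produces a warning and the remaining genomes are still processed.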

NickJD commented 6 months ago

These issues should now be fixed in v1.3.1: https://github.com/NickJD/StORF-Reporter/releases/tag/v1.3.1