FOI-Bioinformatics / nanometa_live

A streamlined workflow and GUI for real-time species identification and pathogen characterization via nanopore sequencing data. Engineered for precision, speed, and user-friendliness, with offline functionality post-initialization.
GNU General Public License v3.0

Extend Functionality to Handle Local Files Based on Tax IDs #50

Closed druvus closed 11 months ago

druvus commented 11 months ago

Summary:

This PR extends the process_local_files utility function so that it handles genomic FASTA files identified by either species names or tax IDs. This makes the function more flexible and lets nanometa_prepare.py process both kinds of local file with the same call.


Changes:

  1. file_utils.py:

    • Added a new utility function process_local_files that takes an id_type parameter to distinguish between species and tax IDs.
    • Updated logging messages for better tracking and debugging.
  2. nanometa_prepare.py:

    • Imported process_local_files from file_utils.
    • Updated the main() function to call process_local_files for both species and tax ID-based local files.
    • Removed the creation of the species_to_taxid dictionary, which was redundant because species_taxid_dict already holds the same mapping.
    • Updated error handling and logging for more clarity.
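
For reviewers, here is a minimal sketch of how an id_type switch like the one described above might look. It is not the code from this PR: the signature, the file_map and output_dir parameters, the "species"/"taxid" values, and the <taxid>.fasta naming are all assumptions made for illustration. Only the names process_local_files, id_type, and species_taxid_dict come from the description.

```python
import logging
import shutil
from pathlib import Path

logger = logging.getLogger(__name__)


def process_local_files(file_map, id_type, species_taxid_dict, output_dir):
    """Copy local genome FASTA files into output_dir, named by tax ID.

    Hypothetical signature, for illustration only.
    file_map maps an identifier (species name or tax ID) to a FASTA path.
    id_type is "species" or "taxid" and says how to read those identifiers.
    """
    if id_type not in ("species", "taxid"):
        raise ValueError(f"Unsupported id_type: {id_type!r}")

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    copied = {}
    for identifier, src in file_map.items():
        if id_type == "species":
            # Species names are resolved to tax IDs through the shared dict.
            taxid = species_taxid_dict.get(identifier)
            if taxid is None:
                logger.warning("No tax ID found for species %s; skipping.", identifier)
                continue
        else:
            # The identifier already is the tax ID.
            taxid = identifier

        src = Path(src)
        if not src.is_file():
            logger.error("Local file %s for %s not found; skipping.", src, identifier)
            continue

        # Assumed naming convention: one file per tax ID.
        dest = output_dir / f"{taxid}.fasta"
        shutil.copy(src, dest)
        logger.info("Copied %s to %s (tax ID %s).", src, dest, taxid)
        copied[str(taxid)] = dest

    return copied
```

With a function of this shape, main() could call it once with id_type="species" and once with id_type="taxid". Skipping unresolved or missing entries with a log message, rather than raising, is a design choice of this sketch and may differ from the actual implementation.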