FOI-Bioinformatics / nanometa_live

A streamlined workflow and GUI for real-time species identification and pathogen characterization via nanopore sequencing data. Engineered for precision, speed, and user-friendliness, with offline functionality post-initialization.
GNU General Public License v3.0

Enhancements in nanometa-prepare for Improved Usability and Flexibility #49

Closed druvus closed 11 months ago

druvus commented 11 months ago

This PR introduces several enhancements and fixes that improve the functionality and flexibility of the nanometa_live package. The two main changes are operational modes for handling genome files and support for more than one input data type when updating YAML configuration files.

Changes

Further Details

  1. Flexible YAML Configuration Update: The function update_yaml_config_with_taxid now supports updating the YAML configuration file using either a DataFrame or a dictionary, making it easier to use in different contexts.

  2. Species Filtering in Kraken2 Parsing: The function parse_kraken2_inspect now accepts an optional species_list argument to filter the output, allowing for more targeted data retrieval.

  3. Operational Modes: A new command-line argument, --mode, has been added to nanometa_prepare.py so the user can specify the operational mode for handling genome files. The only mode currently supported is 'gtdb-api'; 'gtdb-file', 'local-species', and 'local-taxid' are placeholders for future modes.

  4. Code Modularization: nanometa_prepare.py has been refactored for improved readability and maintainability. This will make it easier to add new features and modes in the future.
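
To illustrate the species filtering described in item 2, here is a minimal sketch. It is not the package's parse_kraken2_inspect: the function name parse_inspect_text, the returned dictionary keys, and the sample text are all hypothetical, and the code assumes the inspect output is tab-separated with six columns (percentage, two counts, rank code, taxid, indented name).

```python
from typing import Dict, Iterable, List, Optional


def parse_inspect_text(
    text: str, species_list: Optional[Iterable[str]] = None
) -> List[Dict[str, object]]:
    """Hypothetical stand-in for a Kraken2 inspect parser with optional filtering."""
    # None means "no filter"; otherwise keep only the requested species names.
    wanted = None if species_list is None else {s.strip() for s in species_list}
    rows: List[Dict[str, object]] = []
    for line in text.splitlines():
        # Skip blank lines and '#' header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # not a data row in the assumed six-column layout
        rank, taxid, name = fields[3], fields[4], fields[5].strip()
        if rank != "S":
            continue  # keep species-level rows only
        if wanted is not None and name not in wanted:
            continue
        rows.append({"taxid": int(taxid), "name": name})
    return rows


# Made-up sample in the assumed layout, for illustration only.
SAMPLE = (
    "# Database options: nucleotide db, k = 35, l = 31\n"
    "100.00\t1000\t0\tR\t1\troot\n"
    " 60.00\t600\t600\tS\t562\t        Escherichia coli\n"
    " 40.00\t400\t400\tS\t1280\t        Staphylococcus aureus\n"
)

if __name__ == "__main__":
    print(parse_inspect_text(SAMPLE, species_list=["Escherichia coli"]))
```

Leaving species_list as None returns every species-level row, so existing callers that do not pass the argument would see unfiltered output under this design.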
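
The --mode argument from item 3 could be wired up roughly as follows. This is a sketch under assumptions, not the code in nanometa_prepare.py: the helper names build_parser and resolve_mode are hypothetical, and using 'gtdb-api' as the default and raising NotImplementedError for placeholder modes are illustrative choices.

```python
import argparse
from typing import List, Optional

# Mode names are taken from the PR description.
SUPPORTED_MODES = ("gtdb-api",)
PLANNED_MODES = ("gtdb-file", "local-species", "local-taxid")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing only the --mode option (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="nanometa-prepare")
    parser.add_argument(
        "--mode",
        choices=SUPPORTED_MODES + PLANNED_MODES,
        default="gtdb-api",  # assumption: the default is not stated in the PR
        help="Operational mode for handling genome files.",
    )
    return parser


def resolve_mode(argv: Optional[List[str]] = None) -> str:
    """Parse argv and reject modes that are only placeholders (hypothetical helper)."""
    args = build_parser().parse_args(argv)
    if args.mode in PLANNED_MODES:
        raise NotImplementedError(
            f"Mode '{args.mode}' is planned but not implemented yet."
        )
    return args.mode


if __name__ == "__main__":
    print(resolve_mode())
```

Listing the placeholder modes in choices lets argparse accept them while the explicit check reports that they are not yet available, rather than failing with a generic "invalid choice" error.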

Impact

These changes make the package more flexible and robust, and lay the foundation for future features and improvements.