A streamlined workflow and GUI for real-time species identification and pathogen characterization via nanopore sequencing data. Engineered for precision, speed, and user-friendliness, with offline functionality post-initialization.
GNU General Public License v3.0
15 stars, 2 forks
Enhance Nanopore Simulator: Improve Filename Handling and Error Management #71
This pull request introduces several key enhancements to the nanopore_simulator.py script:
Filename Handling:
Resolved Duplicate Extensions: Fixed the issue of output filenames having duplicate .fastq extensions (e.g., demo.fastq.fastq.gz) by implementing a get_base_name function to extract the correct base name.
Prefix Customization: Added a --prefix argument allowing users to specify custom prefixes for output filenames, enhancing flexibility and organization.
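The filename fix can be illustrated with a minimal sketch. The PR does not show the body of get_base_name, so the suffix list, the build_output_name helper, and the assumption that outputs are always written as .fastq.gz are all hypothetical here:

```python
from pathlib import Path

# Assumed set of suffixes to strip; longest variants come first so that
# ".fastq.gz" is matched before ".fastq".
KNOWN_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def get_base_name(filename: str) -> str:
    """Return the file name without its directory or FASTQ suffix."""
    name = Path(filename).name
    for suffix in KNOWN_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def build_output_name(source: str, prefix: str = "") -> str:
    """Hypothetical helper: combine an optional prefix with the base name."""
    return f"{prefix}{get_base_name(source)}.fastq.gz"
```

With this approach, an input named demo.fastq produces demo.fastq.gz rather than demo.fastq.fastq.gz, and a prefix such as run1_ yields run1_demo.fastq.gz.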
Delay Management:
Integer Delays: Modified delay intervals to use integer seconds instead of floating-point numbers, ensuring consistent and predictable wait times between operations.
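A rough sketch of integer delay generation with random.randint follows. The function name next_delay and its parameters are illustrative assumptions, not names taken from the script:

```python
import random


def next_delay(min_delay: int, max_delay: int, rng=random) -> int:
    """Pick a whole number of seconds in [min_delay, max_delay], inclusive."""
    if min_delay < 0 or max_delay < min_delay:
        raise ValueError("delays must satisfy 0 <= min_delay <= max_delay")
    # random.randint includes both endpoints and always returns an int.
    return rng.randint(min_delay, max_delay)
```

The result can be passed directly to time.sleep. Accepting an rng argument is a design choice for this sketch only: it lets a seeded random.Random instance make the delays reproducible.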
Error Handling and Validation:
Enhanced Argument Validation: Implemented thorough checks for command-line arguments to ensure they meet required criteria (e.g., non-negative delays, positive read counts).
Robust Exception Handling: Added granular exception handling to provide informative error messages and prevent unexpected script crashes.
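One way such validation might look with argparse is sketched below. Only --prefix is named in this PR. The --min-delay, --max-delay, and --reads flags, their defaults, and the helper names are assumptions made for illustration:

```python
import argparse


def non_negative_int(value: str) -> int:
    """argparse type: accept integers >= 0."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{value!r} is not an integer")
    if number < 0:
        raise argparse.ArgumentTypeError(f"{value!r} must be non-negative")
    return number


def positive_int(value: str) -> int:
    """argparse type: accept integers > 0."""
    number = non_negative_int(value)
    if number == 0:
        raise argparse.ArgumentTypeError("value must be greater than zero")
    return number


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Nanopore simulator (sketch)")
    parser.add_argument("--prefix", default="", help="prefix for output filenames")
    # Hypothetical flags, used here only to show the validation pattern.
    parser.add_argument("--min-delay", type=non_negative_int, default=1)
    parser.add_argument("--max-delay", type=non_negative_int, default=5)
    parser.add_argument("--reads", type=positive_int, default=1000)
    args = parser.parse_args(argv)
    # Cross-argument check: parser.error prints a message and exits with status 2.
    if args.max_delay < args.min_delay:
        parser.error("--max-delay must be >= --min-delay")
    return args
```

Invalid input such as --reads 0 or --min-delay -1 then ends with a short usage message instead of a traceback.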
Code Refactoring:
Modular Functions: Refactored the script to improve modularity and readability, making it easier to maintain and extend.
Logging Enhancements: Updated logging configuration to include StreamHandler for real-time console output and added debug logs for better traceability.
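A logging setup along these lines might resemble the sketch below. The logger name, the format string, and the configure_logging function with its log_file and debug parameters are assumptions; the PR only states that a StreamHandler and debug logs were added:

```python
import logging
import sys


def configure_logging(log_file=None, debug=False) -> logging.Logger:
    """Attach a console handler, plus an optional file handler."""
    logger = logging.getLogger("nanopore_simulator")
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    logger.handlers.clear()  # avoid duplicate output on repeated calls

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    # StreamHandler writes each record to the console as it is emitted.
    console = logging.StreamHandler(sys.stdout)
    console.setFormatter(formatter)
    logger.addHandler(console)

    if log_file:
        file_handler = logging.FileHandler(log_file)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger
```

Calling configure_logging(debug=True) once at startup makes logger.debug messages visible on the console while the simulation runs.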
Key Changes
Added a helper function get_base_name to accurately extract file base names without suffixes.
Updated copy_files_simulation and subsample_reads_simulation functions to incorporate prefix handling and prevent filename duplication.
Changed delay generation from floating-point to integer seconds using random.randint.
Enhanced command-line argument parsing with additional validations and the new --prefix option.
Improved logging setup for better visibility and debugging capabilities.