The Amissense Tool analyzes and visualizes AlphaMissense pathogenicity scores, integrating AlphaFold structures and ClinVar data. It offers automated pipelines, visualizations, and versatile command-line utilities.
MIT License
Automate UniProt ID Retrieval, Improve Error Handling, Add Retry Mechanism, and Refactor pipeline.py #34
This pull request addresses multiple issues and introduces several improvements to the codebase:
Automate UniProt ID Retrieval: The UniProt ID is now optional in the CLI. The get_uniprot_id function retrieves it automatically from the gene name and organism ID, which makes the gene name the primary identifier. The -u/--uniprot-id option is deprecated, and a fallback option allows the UniProt ID to be specified manually if retrieval fails.
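The resolution order described above (automatic lookup first, manual ID as a fallback) could look roughly like the sketch below. It is an illustration only, not the code in this pull request: `resolve_uniprot_id`, `UniProtLookupError`, and the `lookup` callable are hypothetical names, and `lookup` merely stands in for `get_uniprot_id`, whose real signature is not shown in this description.

```python
from typing import Callable, Optional


class UniProtLookupError(RuntimeError):
    """Raised when no UniProt ID can be determined (hypothetical name)."""


def resolve_uniprot_id(
    gene_name: str,
    organism_id: int,
    lookup: Callable[[str, int], Optional[str]],
    manual_id: Optional[str] = None,
) -> str:
    """Return a UniProt ID, preferring automatic retrieval.

    `lookup` stands in for get_uniprot_id; its signature is an assumption.
    """
    try:
        found = lookup(gene_name, organism_id)
    except Exception:
        # In this sketch any lookup failure (network, parsing) counts as "not found".
        found = None
    if found:
        return found
    if manual_id:
        # Fallback: use the ID the user supplied by hand.
        return manual_id
    raise UniProtLookupError(
        f"No UniProt ID found for gene {gene_name!r} (organism {organism_id}); "
        "specify one manually."
    )
```

Passing the lookup in as a callable keeps the fallback logic testable without network access.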
Improve Error Handling and Configurability: Error handling has been added for missing or malformed config.json files, ensuring the program exits gracefully with a user-friendly message. The CLI now supports a custom config path via the --config-path argument, allowing users to specify the configuration file location. The UniProt ID retrieval process has been improved to handle errors more robustly.
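A minimal sketch of graceful config loading combined with a `--config-path` argument is shown below. The function names, the error messages, and the default of `config.json` in the working directory are assumptions made for illustration; only the `--config-path` flag and the `config.json` file name come from this description.

```python
import argparse
import json
import sys
from pathlib import Path


def load_config(path: Path) -> dict:
    """Load a JSON config file, exiting with a readable message on failure."""
    try:
        with path.open(encoding="utf-8") as handle:
            config = json.load(handle)
    except OSError as exc:
        # Covers a missing file, a directory given as the path, permission errors.
        sys.exit(f"Error: cannot read configuration file {path}: {exc.strerror}")
    except json.JSONDecodeError as exc:
        sys.exit(f"Error: {path} is not valid JSON ({exc.msg}, line {exc.lineno})")
    if not isinstance(config, dict):
        sys.exit(f"Error: {path} must contain a JSON object at the top level")
    return config


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing only the config-path option (sketch)."""
    parser = argparse.ArgumentParser(description="Config-path sketch")
    # The flag name comes from the PR text; the default value is an assumption.
    parser.add_argument("--config-path", type=Path, default=Path("config.json"))
    return parser
```

A caller would typically run `args = build_parser().parse_args()` followed by `config = load_config(args.config_path)`. Because `sys.exit` receives a string, Python prints the message to stderr and exits with status 1 instead of showing a traceback.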
Add Retry Mechanism for API Calls: A retry mechanism with exponential backoff has been implemented for all external API calls. The number of retries and the backoff strategy can be configured in config.json. This makes the tool more reliable over unstable network connections.
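Retry with exponential backoff can be sketched as follows. This is a generic illustration rather than the implementation in this pull request: the function name, the parameter names (`retries`, `base_delay`, `backoff_factor`), and the choice of retryable exceptions are assumptions, and the actual keys used in config.json are not given in this description.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    backoff_factor: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to `retries` times after the first failure.

    The wait before each retry grows geometrically:
    base_delay, base_delay * backoff_factor, and so on.
    """
    if retries < 0:
        raise ValueError("retries must be non-negative")
    delay = base_delay
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(delay)
            delay *= backoff_factor
    raise AssertionError("unreachable")
```

With the defaults, a call that keeps failing is attempted four times in total, waiting 1, 2, and 4 seconds between attempts. The `sleep` parameter exists so the delays can be replaced in tests.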
Refactor pipeline.py: The logic in pipeline.py has been refactored to separate the orchestration of the pipeline from the actual function implementations. Specific processing functions have been moved to dedicated modules, improving maintainability and modularity.
This pull request also includes a version bump to 0.3.0 to reflect the new changes.
Closes #25, #15, #6, #26, #27.