Enhancement: Improve the stability of the annotation process by using genomic coordinates instead of HGVS. This will involve several steps, including using the region endpoint, converting VCF notation, and adding VCF detection. #14
Current Behavior
Currently, the annotation process relies on the HGVS endpoint, which can be less stable and may not handle all variants effectively.
Proposed Solution
Use Genomic Coordinates for VEP Annotation:
Deprecate the current vepAnnotation function and rename it to vepHgvsAnnotation.
Implement a new vepRegionsAnnotation function that uses the region endpoint (https://rest.ensembl.org/vep/homo_sapiens/region/3:319781-319781:1/-) for VEP annotation.
Create a function to convert VCF notation (e.g., 1-65568-A-C) to the Ensembl default format (1 65568 65568 A/C 1).
Feed the computed Ensembl default format into the new vepRegionsAnnotation function.
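The conversion step above could look roughly like the following Python sketch. It is illustrative only: the project's own language and function names may differ, and `vcf_to_ensembl_default`, `region_endpoint_url`, and the `VCF_PATTERN` regular expression are hypothetical names introduced here. The indel handling assumes the VCF convention of a shared leading anchor base, and the URL layout simply follows the region endpoint example quoted in this issue.

```python
import re

# Hypothetical pattern for the dash-separated notation used in this issue,
# e.g. "1-65568-A-C" (chromosome-position-ref-alt).
VCF_PATTERN = re.compile(
    r"^(?:chr)?([0-9]{1,2}|X|Y|MT?)-([0-9]+)-([ACGT]+)-([ACGT]+)$",
    re.IGNORECASE,
)


def vcf_to_ensembl_default(variant: str) -> str:
    """Convert 'chrom-pos-ref-alt' to 'chrom start end ref/alt strand'."""
    match = VCF_PATTERN.match(variant.strip())
    if match is None:
        raise ValueError(f"not in chrom-pos-ref-alt notation: {variant!r}")
    chrom, pos, ref, alt = match.groups()
    start = int(pos)
    ref, alt = ref.upper(), alt.upper()
    # Assumption: indels carry a shared leading anchor base (VCF convention).
    # Drop it and shift the start so only the changed bases remain.
    if len(ref) != len(alt) and ref[0] == alt[0]:
        start += 1
        ref, alt = ref[1:], alt[1:]
    # For insertions the reference is empty, so end becomes start - 1.
    end = start + len(ref) - 1
    return f"{chrom.upper()} {start} {end} {ref or '-'}/{alt or '-'} 1"


def region_endpoint_url(ensembl_default: str, species: str = "homo_sapiens") -> str:
    """Build a region endpoint URL shaped like the example in this issue."""
    chrom, start, end, alleles, strand = ensembl_default.split()
    alt = alleles.split("/")[1]
    return (
        f"https://rest.ensembl.org/vep/{species}/region/"
        f"{chrom}:{start}-{end}:{strand}/{alt}"
    )


if __name__ == "__main__":
    converted = vcf_to_ensembl_default("1-65568-A-C")
    print(converted)
    print(region_endpoint_url(converted))
```

Only the string conversion is shown; the HTTP request itself and any batching are left to the actual vepRegionsAnnotation implementation.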
Add VCF Detection Option:
Implement a function to detect the input format.
If VCF format is detected (e.g., 1-65568-A-C), skip the Variant Recoder step.
Directly transform the VCF notation into the Ensembl default format and use it in the vepRegionsAnnotation function.
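A minimal sketch of the detection step, again in Python for illustration. The names `detect_input_format` and `needs_variant_recoder` and the regular expression are assumptions rather than the project's actual API; the pattern only recognises the dash-separated notation shown in this issue and treats everything else (HGVS strings, rsIDs) as input that still needs the Variant Recoder.

```python
import re

# Hypothetical pattern: chromosome-position-ref-alt, e.g. "1-65568-A-C".
VCF_LIKE = re.compile(
    r"^(?:chr)?(?:[0-9]{1,2}|X|Y|MT?)-[0-9]+-[ACGT]+-[ACGT]+$",
    re.IGNORECASE,
)


def detect_input_format(variant: str) -> str:
    """Return 'vcf' for chrom-pos-ref-alt input, otherwise 'other'."""
    return "vcf" if VCF_LIKE.match(variant.strip()) else "other"


def needs_variant_recoder(variant: str) -> bool:
    """VCF-style input can skip the Variant Recoder step."""
    return detect_input_format(variant) != "vcf"


if __name__ == "__main__":
    for example in ["1-65568-A-C", "NM_000546.6:c.215C>G", "rs699"]:
        print(example, detect_input_format(example), needs_variant_recoder(example))
```

Keeping detection separate from conversion means the skip decision can be tested on its own, independently of any network call.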
Acceptance Criteria
[x] Rename the current vepAnnotation function to vepHgvsAnnotation.
[x] Implement the vepRegionsAnnotation function using the region endpoint.
[x] Create a function to convert VCF notation to the Ensembl default format.
[x] Implement VCF format detection.
[x] Skip the Variant Recoder step if VCF format is detected.
[x] Ensure the transformed VCF notation is fed into the vepRegionsAnnotation function.
[x] Update the documentation to describe the new functionality and options.
[ ] Add tests to verify the stability and correctness of the new annotation process.