Update transcript version warnings to include genome build info for a…

Peter-J-Freeman commented 1 month ago

…lignments available

gitguardian[bot] commented 1 month ago

⚠️ GitGuardian has uncovered 5 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request

| GitGuardian id | GitGuardian status | Secret | Commit | Filename | | | -------------- | ------------------ | ------------------------------ | ---------------- | --------------- | -------------------- | | [1081033](https://dashboard.gitguardian.com/workspace/140797/incidents/1081033?occurrence=133061721) | Triggered | Generic Password | d08f44762df3298b612b7b2010093dd1d8d2e289 | db_dockerfiles/vvta/Dockerfile | [View secret](https://github.com/openvar/variantValidator/commit/d08f44762df3298b612b7b2010093dd1d8d2e289#diff-f0cb160aab1a49a47ea0c65805f7c423e09908acecf2f671b31a7479bad01070R5) | | [1081033](https://dashboard.gitguardian.com/workspace/140797/incidents/1081033?occurrence=133061722) | Triggered | Generic Password | d08f44762df3298b612b7b2010093dd1d8d2e289 | configuration/docker.ini | [View secret](https://github.com/openvar/variantValidator/commit/d08f44762df3298b612b7b2010093dd1d8d2e289#diff-77f1f9e28c04e4b0c2e3af9f7228ad17387dc0348365643d29ff66811a344459L19) | | [1081033](https://dashboard.gitguardian.com/workspace/140797/incidents/1081033?occurrence=133061723) | Triggered | Generic Password | d08f44762df3298b612b7b2010093dd1d8d2e289 | Jenkinsfile | [View secret](https://github.com/openvar/variantValidator/commit/d08f44762df3298b612b7b2010093dd1d8d2e289#diff-e6ffa5dc854b843b3ee3c3c28f8eae2f436c2df2b1ca299cca1fa5982e390cf8R83) | | [1081033](https://dashboard.gitguardian.com/workspace/140797/incidents/1081033?occurrence=133061724) | Triggered | Generic Password | d08f44762df3298b612b7b2010093dd1d8d2e289 | vvta_docker.df | [View secret](https://github.com/openvar/variantValidator/commit/d08f44762df3298b612b7b2010093dd1d8d2e289#diff-d4bb6b2803fe414f2e35b5a9b14c8a3746fa12ea3bc981725cd6e9a41f0b2a2aL7) | | [1081033](https://dashboard.gitguardian.com/workspace/140797/incidents/1081033?occurrence=160453644) | Triggered | Generic Password | be1add814618a39b3ad9ae36d0f6796d1e0ffa8d | configuration/docker.ini | [View secret](https://github.com/openvar/variantValidator/commit/be1add814618a39b3ad9ae36d0f6796d1e0ffa8d#diff-77f1f9e28c04e4b0c2e3af9f7228ad17387dc0348365643d29ff66811a344459L20) |

🛠 Guidelines to remediate hardcoded secrets

1. Understand the implications of revoking this secret by investigating where it is used in your code. 2. Replace and store your secrets safely. [Learn here](https://blog.gitguardian.com/secrets-api-management?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment) the best practices. 3. Revoke and [rotate these secrets](https://docs.gitguardian.com/secrets-detection/secrets-detection-engine/detectors/generics/generic_password#revoke-the-secret?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment). 4. If possible, [rewrite git history](https://blog.gitguardian.com/rewriting-git-history-cheatsheet?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment). Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data. To avoid such incidents in the future consider - following these [best practices](https://blog.gitguardian.com/secrets-api-management/?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment) for managing and storing secrets including API keys and other credentials - install [secret detection on pre-commit](https://docs.gitguardian.com/ggshield-docs/integrations/git-hooks/pre-commit?utm_source=product&utm_medium=GitHub_checks&utm_campaign=check_run_comment) to catch secret before it leaves your machine and ease remediation.

^{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}

ifokkema commented 1 month ago

Can we set up some kind of system that notifies me when warnings or error messages change? Or, is there some kind of index or dictionary where I can see what possible warnings or error messages can be created by VV? Since there are no error codes, LOVD parses the strings sent by VV to try and figure out what is information, a warning, or an error. When the strings change, LOVD doesn't know what the message means anymore since it doesn't match any of the pre-configured strings. I noticed yesterday during the VIJ meeting (after we discussed providing feedback about the availability of new transcript versions) that my tool now considers using an older transcript version as a fatal error because the string changed somewhere along the way. I sometimes add new strings when I notice a new message I don't match yet, although this probably also means some other message can be removed. However, I don't have an overview of when things change or what possible errors exist. Of course, having permanent error codes would be even better, but as an alternative, can we set up some system that notifies me so I can adapt the LOVD-VV library in time?

Peter-J-Freeman commented 1 month ago

We definately need some discussion around this. Some serious issues have arisen since we migrated to the Ensembl data.

I think ErrorCodes are the way to go. We did start including these. In this case we could add the prefix TranscriptVersionError: (or warning perhaps TranscriptVersionWarning: )

We need to go through the code base and add error prefixes to all error messages and document them. This is a big job, but I should get on with it over the summer!!!!

Peter-J-Freeman commented 1 month ago

Note, this error message will stay. Do you want me to add the prefix and push a new version @ifokkema. If so, what prefix?

Peter-J-Freeman commented 1 month ago

Have added in the flag TranscriptVersionWarning: as the error code here

ifokkema commented 1 month ago

I think ErrorCodes are the way to go. We did start including these.

That would be awesome!

In this case we could add the prefix TranscriptVersionError: (or warning perhaps TranscriptVersionWarning: )

Sounds good! FYI: Within LOVD, there are three levels of provided output, prefixed with I, W, or E, respectively:

Information: Simple messages, freely ignored without problems, but they may still be relevant. E.g, IREFSEQUPDATED: Reference sequence A can be updated to B.
Warnings: I actually have two types of these, I see.
- A more urgent form of a message, e.g., WALIGNMENTGAPS: Given alignments may contain artefacts; there is a gapped alignment between transcript and genome build.
- Recoverable/fixable problems that need to be addressed or that we have fixed already for the user, e.g., WPOSITIONFORMAT: The positions are not given in the correct order. Please verify your description and try again. These are considered recoverable because the LOVD HGVS libraries will suggest a fix for issues like these. Suffixes given, e.g. delA instead of del, and 3' shifting also fit in this category.
Errors: Can't be recovered from; fatal errors, e.g., EBUILD: Invalid genome build has been specified. or ERANGE: The given coordinate is outside the bounds of the reference sequence.

I copied the design of the codes, words in capitals without spaces prefixed with a single letter indicating the type of message from Mutalyzer2. There is, obviously, no need to copy this design, as long as LOVD can understand what's an information message, what's a warning, and what's an error. Also, LOVD can perhaps consider some things just information what you consider a warning, etc. That's all cool, as long as LOVD knows what you mean, so I can create a mapping. I will try to support all your codes, which will be easier if we'll create a document somewhere containing all possible messages. We could even document, if you'd like, what LOVD will make of it and possibly additional messages that LOVD generates (e.g., LOVD generates WROLLFORWARD: Variant positions have been corrected. when the VV output indicates the variant has been shifted 3'. This may also help the implementation of LOVD's checkHGVS API into VV.

Note also that, within LOVD, the information, warning, and error lists are objects, and the codes are the keys. So the users don't see the codes, they are for internal use. Also, somebody could do a count of warnings to quickly check how many warnings are returned, or do a check for a non-zero count of the errors key. LOVD returns, e.g.,:

{
    "NM_002225.3:c.157C>T": {
        "messages": {
            "IOK": "This variant description is HGVS-compliant."
        },
        "warnings": [],
        "errors": [],
        "data": {
            "position_start": 157,
            "position_end": 157,
            "type": "subst",
            "range": false,
            "suggested_correction": []
        }
    }
}

This is just FYI, in case you like the idea, you could use it in later versions of APIs.

We need to go through the code base and add error prefixes to all error messages and document them. This is a big job, but I should get on with it over the summer!!!!

I can imagine! Something that I should have done in the past and might be useful, is to document a variant description with each message/warning/error so that you can easily trigger it to have a look. I failed to do that, and now I don't know how to purposefully trigger certain errors in VV.

Note, this error message will stay. Do you want me to add the prefix and push a new version @ifokkema. If so, what prefix?

No rush, no worries! I'll just wait for the full update that will include error codes for everything. I'll fix the issue with the transcript version being considered fatal independently.

Peter-J-Freeman commented 1 month ago

Opened an issue @ifokkema

openvar / variantValidator

Update transcript version warnings to include genome build info for a… #635

⚠️ GitGuardian has uncovered 5 secrets following the scan of your pull request.