CDCgov / RecordLinker

Apache License 2.0

Feat/algorithm configuration schema #31

Closed cbrinson-rise8 closed 1 day ago

cbrinson-rise8 commented 2 days ago

Description

In models.py add new classes for storing data related to the available algorithms to run.

erDiagram
    Algorithm {
        int id
        bool is_default "a check should be added to guarantee that only 1 row in the table is marked as the default"
        string label "should be unique"
        string description
    }

    AlgorithmPass {
        int id
        int algorithm_id
        int[] blockingkeys "a list of values from the BlockingKey table"
        string[] evaluators "a list of matching functions and values to use"
        string rule "the evaluation rule function"
        float cluster_ratio
        json kwargs "extra parameters to pass to the evaluator functions"
    }

    Algorithm ||--o{ AlgorithmPass: "has"
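
The "only 1 row is marked as the default" note in the diagram can also be enforced by the database itself. The following is a minimal sketch of that idea using a partial unique index and the standard-library `sqlite3` module. It is not the PR's SQLAlchemy models: the table names, the index name, the JSON-as-text columns, and the sample labels are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical DDL mirroring the diagram above. Table, column, and index
# names are assumptions for illustration, not taken from models.py.
SCHEMA = """
CREATE TABLE algorithm (
    id INTEGER PRIMARY KEY,
    is_default INTEGER NOT NULL DEFAULT 0 CHECK (is_default IN (0, 1)),
    label TEXT NOT NULL UNIQUE,
    description TEXT
);
-- Partial unique index: at most one row may have is_default = 1.
CREATE UNIQUE INDEX one_default_algorithm
    ON algorithm (is_default) WHERE is_default = 1;
CREATE TABLE algorithm_pass (
    id INTEGER PRIMARY KEY,
    algorithm_id INTEGER NOT NULL REFERENCES algorithm (id),
    blockingkeys TEXT NOT NULL,   -- JSON-encoded list of BlockingKey values
    evaluators TEXT NOT NULL,     -- JSON-encoded list of matching functions
    rule TEXT NOT NULL,
    cluster_ratio REAL NOT NULL,
    kwargs TEXT NOT NULL DEFAULT '{}'
);
"""


def make_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_algorithm(conn, label, is_default=False, description=""):
    """Insert one algorithm row and return its id."""
    cur = conn.execute(
        "INSERT INTO algorithm (label, is_default, description) VALUES (?, ?, ?)",
        (label, int(is_default), description),
    )
    return cur.lastrowid


def second_default_rejected():
    """Return True if the database refuses a second default algorithm."""
    conn = make_db()
    add_algorithm(conn, "basic", is_default=True)
    add_algorithm(conn, "enhanced")  # non-default rows are unrestricted
    try:
        add_algorithm(conn, "another-default", is_default=True)
    except sqlite3.IntegrityError:
        return True
    return False
```

With this approach the constraint holds even if a row is written outside the application code, at the cost of relying on partial-index support in the target database.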

Related Issues

closes #13

cbrinson-rise8 commented 1 day ago

if target.is_default:
    existing = session.query(Algorithm).filter(Algorithm.is_default == True).first()    # noqa

[Link to the line of code](https://github.com/CDCgov/RecordLinker/pull/31/files#:~:text=existing%20%3D%20session.query(Algorithm).filter(Algorithm.is_default%20%3D%3D%20True).first()%20%20%20%20%23%20noqa)

Just a note for anyone wondering what # noqa is: it suppresses the ruff linting error on this line. Ruff wants the comparison written with the is operator instead of ==, but is doesn't seem to work when comparing SQL boolean values.
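
A likely reason `is` doesn't work here: Python lets a class overload `==` through `__eq__`, so a column object can return a SQL expression, but `is` always tests object identity and cannot be overloaded. The sketch below shows this with toy classes. `Expr` and `Column` are hypothetical stand-ins, not SQLAlchemy's real classes, and the `E712` rule code in the comment is an assumption about which ruff rule fired.

```python
class Expr:
    """Stand-in for a SQL expression object (hypothetical, not SQLAlchemy)."""

    def __init__(self, sql):
        self.sql = sql


class Column:
    """Toy column whose == builds an expression instead of returning a bool."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Overloaded equality: produce SQL text rather than comparing values.
        return Expr(f"{self.name} = {str(other).lower()}")

    # Defining __eq__ removes the default hash, so restore identity hashing.
    __hash__ = object.__hash__


is_default = Column("is_default")

# == goes through Column.__eq__ and yields an expression usable in a filter.
eq_result = is_default == True  # noqa: E712

# `is` compares object identity only; a Column is never the True singleton,
# so this is a plain Python False and no SQL is generated.
is_result = is_default is True
```

SQLAlchemy's column operators also include an `is_()` method, so `Algorithm.is_default.is_(True)` may be a way to satisfy the linter without `# noqa`. That variant has not been checked against this PR.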

ericbuckley commented 1 day ago

@cbrinson-rise8 one more small thing: can you please add your name to the authors list?