biocommons / hgvs

Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`
https://hgvs.readthedocs.io/
Apache License 2.0

feat: update default data version config #734

Closed jsstevenson closed 2 months ago

jsstevenson commented 3 months ago

close #733

Update default UTA version to 20210129b

I searched for other likely places where version values were hardcoded. defaults.ini appears to be copy-pasted into the config section of the API reference, so I replaced that copy with a literalinclude directive to reduce work in the future.

(I'm guessing I need to update some cassettes or something, so making this a draft for now)

github-actions[bot] commented 2 months ago

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

reece commented 2 months ago

@jsstevenson reported that tests failed with this change. Investigate because I think the tests should still work.

andreasprlic commented 2 months ago

Some of the unit tests fail because we are jumping several UTA versions (from 2018 to 2021) and new transcripts were loaded in the meantime.

There is a unit test for the get_tx_for_gene method for the VHL gene that expects a certain number of transcripts in the response. The new database now contains more transcripts; I checked, and these were loaded into UTA in 2020 and 2021. As such, we should just fix that test: rather than testing for an expected number of transcripts, it should test that a few specific tx_ac and alt_ac combinations are in the response from the query.
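
To illustrate the kind of check I mean, here is a rough sketch, not the actual test in the repo. The helper name `missing_tx_pairs` is made up, the rows are stand-ins for what get_tx_for_gene returns (I'm assuming each row can be indexed by "tx_ac" and "alt_ac"), and the accessions are only illustrative:

```python
def missing_tx_pairs(rows, expected_pairs):
    """Return the expected (tx_ac, alt_ac) pairs that are absent from rows.

    `rows` is assumed to be an iterable of mappings with "tx_ac" and
    "alt_ac" keys (hypothetical stand-in for a get_tx_for_gene response).
    """
    found = {(row["tx_ac"], row["alt_ac"]) for row in rows}
    return sorted(set(expected_pairs) - found)


# Stand-in for the response from an older database (illustrative values).
older_rows = [
    {"tx_ac": "NM_000551.3", "alt_ac": "NC_000003.11"},
    {"tx_ac": "NM_198156.2", "alt_ac": "NC_000003.11"},
]

# A newer database returns the same rows plus transcripts loaded later.
newer_rows = older_rows + [
    {"tx_ac": "NM_000551.4", "alt_ac": "NC_000003.12"},
]

expected = [
    ("NM_000551.3", "NC_000003.11"),
    ("NM_198156.2", "NC_000003.11"),
]

# A count-based assertion (len(rows) == 2) would break on newer_rows;
# the subset check holds for both.
assert missing_tx_pairs(older_rows, expected) == []
assert missing_tx_pairs(newer_rows, expected) == []
```

The subset check keeps passing when later UTA releases add transcripts, and it still fails if one of the expected combinations goes missing.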

andreasprlic commented 2 months ago

While I was at it, I fixed the problematic unit test...

jsstevenson commented 2 months ago

@andreasprlic @korikuzma thanks for the assist! looks like we're good to merge now?