biocommons / hackathon-2023

Hackathon 2023 projects and planning.

Remove materialized views from UTA release #11

Closed korikuzma closed 1 year ago

korikuzma commented 1 year ago

Submitter Name

Reece Hart (@reece)

Submitter Affiliation

MyOme

Requested By

uta community

Additional Submitter Details

No response

Lead(s)

@reece

biocommons Repo

uta

Project Details

Materialized views are essential for speed, but the current releases overuse them, which results in very long build times for new users. Original issue here. In practice, the materialized views prevent users from effectively using the Docker images.

Skill Level

Intermediate

Required Skills

Python, PostgreSQL, Docker

jsstevenson commented 1 year ago

@reece, could you say more about why this is labeled "advanced"? Are you thinking that this issue would be more "careful pruning" (which would certainly require some thought and discussion) rather than "removing entirely" (i.e., a few DROP MATERIALIZED VIEW statements)? Am I just underthinking this?

reece commented 1 year ago

@jsstevenson : You're right: this isn't hard.

I typically write conventional views with a _v suffix. Then, if I decide that one is too slow, I rename that view to _dv ("defining view"), construct a materialized view as _mv, and recreate the _v as a simple select from the _mv in order to preserve the interface for clients.
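A minimal sketch of the _v / _dv / _mv convention described above, written as a small Python helper that emits the PostgreSQL statements in order. The helper name `promote_to_materialized` and the base name `my_report` are hypothetical and only illustrate the naming scheme; they are not taken from the uta schema.

```python
from __future__ import annotations

import re


def promote_to_materialized(base: str) -> list[str]:
    """Return DDL that backs <base>_v with a materialized view.

    Assumes a plain view named <base>_v already exists. The name is
    interpolated directly into SQL, so only simple lowercase identifiers
    are accepted here.
    """
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", base):
        raise ValueError(f"unexpected view base name: {base!r}")
    return [
        # 1. The original view keeps its definition but becomes the "defining view".
        f"ALTER VIEW {base}_v RENAME TO {base}_dv;",
        # 2. Materialize the defining view's result set.
        f"CREATE MATERIALIZED VIEW {base}_mv AS SELECT * FROM {base}_dv;",
        # 3. Recreate <base>_v as a thin wrapper so clients see the same name.
        f"CREATE VIEW {base}_v AS SELECT * FROM {base}_mv;",
    ]


if __name__ == "__main__":
    # "my_report" is a hypothetical view base name used only for illustration.
    for stmt in promote_to_materialized("my_report"):
        print(stmt)
```

Because clients only ever query the _v name, the materialization can be introduced or removed without changing client code.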

So, I recommend that we start with undoing the last change: remove tx_similarity_mv, then rename the _dv to _v, and redump uta.
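The undo step could look roughly like the following sketch, again as a Python helper that emits PostgreSQL statements. It assumes that tx_similarity_v is currently a thin wrapper over tx_similarity_mv, that tx_similarity_dv holds the real definition, and that nothing else depends on the wrapper or the materialized view; none of this has been checked against the actual uta schema, and the helper name `demote_to_plain_view` is hypothetical.

```python
from __future__ import annotations

import re


def demote_to_plain_view(base: str) -> list[str]:
    """Return DDL that removes <base>_mv and restores <base>_dv as <base>_v.

    Order matters: the wrapper view depends on the materialized view, so it
    is dropped first, which also frees the <base>_v name for the rename.
    """
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", base):
        raise ValueError(f"unexpected view base name: {base!r}")
    return [
        # 1. Drop the thin wrapper that selects from the materialized view.
        f"DROP VIEW {base}_v;",
        # 2. Drop the materialized view itself.
        f"DROP MATERIALIZED VIEW {base}_mv;",
        # 3. The defining view becomes the client-facing view again.
        f"ALTER VIEW {base}_dv RENAME TO {base}_v;",
    ]


if __name__ == "__main__":
    for stmt in demote_to_plain_view("tx_similarity"):
        print(stmt)
```

PostgreSQL DDL is transactional, so the three statements can be wrapped in a single BEGIN/COMMIT to keep clients from ever seeing the _v name missing.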

reece commented 1 year ago

Closing as a duplicate of https://github.com/biocommons/uta/issues/228 .