NYCPlanning / data-engineering-qaqc

streamlit app for data engineering
https://edm-data-engineering.nycplanningdigital.com

Use Digital Ocean for caching #206

Closed abrieff closed 2 years ago

abrieff commented 2 years ago

This is just the code to implement caching using Digital Ocean. I will write up more in the description, but the basic idea is that Streamlit looks up a cache record by the arguments you pass to the cached function. In this case, we grab the last modified date from Digital Ocean and use the combination of that date and the URL to decide whether to load a report from the cache or redownload it.

If the URL is the same (the same file and branch) and the file hasn't been updated since it was last pulled (the last modified date is unchanged), Streamlit will find the record in the cache. If either value differs, the lookup misses and the report is downloaded again.
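To make the cache-key idea concrete, here is a minimal sketch, not the code in this PR. It uses `functools.lru_cache` from the standard library as a stand-in for Streamlit's cache decorator, since both key the cache on the function's arguments. The in-memory `FAKE_BUCKET`, the example URL, and the function names `get_last_modified` and `load_report` are all hypothetical; in the real app the last modified date would come from Digital Ocean.

```python
from functools import lru_cache

# Hypothetical stand-in for the Digital Ocean bucket:
# URL -> (last modified timestamp, file contents).
FAKE_BUCKET = {
    "https://example.com/reports/main/qaqc.csv": ("2022-01-01T00:00:00Z", "v1"),
}

# Records every real "download" so cache hits and misses are visible.
DOWNLOADS = []


def get_last_modified(url: str) -> str:
    """Cheap metadata lookup (assumed to be a HEAD-style request in practice)."""
    return FAKE_BUCKET[url][0]


@lru_cache(maxsize=None)
def _load_report_cached(url: str, last_modified: str) -> str:
    # last_modified is not used in the body; it only forms part of the
    # cache key, so a new timestamp forces a fresh download.
    DOWNLOADS.append(url)
    return FAKE_BUCKET[url][1]


def load_report(url: str) -> str:
    """Return the report, downloading only if the URL or timestamp changed."""
    return _load_report_cached(url, get_last_modified(url))
```

Calling `load_report` twice with an unchanged file performs one download. Once the stored timestamp changes, the next call misses the cache and downloads again. One trade-off of this design is that every call still pays for the metadata lookup, and old entries stay in the cache until they are evicted.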

Bigger writeup in #173

abrieff commented 2 years ago

Comment from Sasha: Overall this refactor looks great and will DRY out a ton of code. My only skepticism is with the caching. I really like that you took the initiative to implement this complex, useful feature, but I'm not convinced we are good enough programmers to maintain it. I'm not sure it's worth the effort for DE to learn and maintain a feature that we probably won't use anywhere else and that doesn't really tie in with our other programming. Some of the code Baiyue wrote that none of us really understand has worked fine for months and months, so maybe it will be like that. But I predict that this app's maintenance will already be a lot of work without any caching, seeing that it was written when we had about twice as many engineering hours per week as we will have going forward. I think we should all get on a call in the next couple of days so we can talk about this more. Does that seem like a good next step to you?

abrieff commented 2 years ago

Closing this