simonw / big-local-datasette

Publishing a Datasette of open projects from biglocalnews.org
https://biglocal.datasettes.com/

Use actions cache to avoid downloading all of the databases every time #18

Closed · simonw closed this 4 years ago

simonw commented 4 years ago

https://github.com/actions/cache says I can use up to 5GB of cache

This code downloads all of the database files each time:

https://github.com/simonw/big-local-datasette/blob/ea5b77f851c81bccfbb81741784a2a400707c73d/.github/workflows/deploy.yml#L29-L35

It would be better to download only the files that have changed. I could do this by storing a copy of the https://biglocal.datasettes.com/-/databases.json file and comparing the hashes in it to the hashes from the live version.
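As a rough sketch (not the actual deploy.yml change), an actions/cache step along these lines would restore the previously downloaded files at the start of each run; the `dbs` directory name and the key scheme here are assumptions:

```yaml
    - name: Cache downloaded database files
      uses: actions/cache@v1
      with:
        # Directory the workflow downloads the .db files into (assumed name)
        path: dbs
        # A new key per commit so the cache is re-saved after each run, while
        # restore-keys means the most recent cache is always restored first
        key: dbs-${{ github.sha }}
        restore-keys: |
          dbs-
```

With the previous run's files restored, the download step would only need to fetch the databases whose hashes have changed.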

simonw commented 4 years ago

Probably easier to write a Python script that does this rather than trying to figure it out using a crazy sequence of curl and jq calls.

simonw commented 4 years ago

I built and released a tool for this: https://github.com/simonw/datasette-clone - https://pypi.org/project/datasette-clone/
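A sketch of how it could slot into the workflow next to the cache step, hedged rather than the actual change (the `dbs` directory and the URL-plus-directory invocation are assumptions; check the datasette-clone README for the exact CLI):

```yaml
    - name: Fetch only the databases that have changed
      run: |
        pip install datasette-clone
        mkdir -p dbs
        # Compares the hashes in the live /-/databases.json against the local
        # copies and only downloads the .db files whose hashes have changed
        datasette-clone https://biglocal.datasettes.com/ dbs
```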

simonw commented 4 years ago

Rats, this was a waste of time: https://github.com/simonw/big-local-datasette/runs/584220152?check_suite_focus=true

[warning]Event Validation Error: The event type schedule is not supported. Only push, pull_request events are supported at this time.

That's this issue: https://github.com/actions/cache/issues/63

I'll leave the implementation there optimistically hoping that they fix that issue at some point.

simonw commented 4 years ago

I just removed the implementation - I'll bring it back again once (if?) https://github.com/actions/cache/issues/63 is fixed.