Closed magnuswahlstrand closed 2 years ago
I agree that running SQLite & Litestream on Cloud Run probably isn't going to be a good idea but I'd like to figure out the issue! :)
How large is your database file and each of the WAL files on S3? Can you unzip them and check the size too? Also, how much memory does your Cloud Run instance have? Does the issue happen if you use an instance with more memory?
FYI, I'm using Google Cloud Storage for storage. Though I guess it isn't the problem here, since litestream is able to find, download and start the replication process, just not finish it.
File size is very small. I'm using your test application (plus logging and some minor modifications for troubleshooting). It is just one table and < 100 rows.
DB file
> ls -lah pageviews.db
-rw------- 1 test staff 16K May 10 08:34 pageviews.db
Litestream files (downloaded from gcs)
> du -h .
124K ./generations/7787158ff4c84919/wal
4.0K ./generations/7787158ff4c84919/snapshots
128K ./generations/7787158ff4c84919
128K ./generations
128K .
I had 512 MB RAM, increased it to 1GB to test. Would be surprised if that is the problem here!
Hi!
I've tried to use the s6 setup to run litestream on DigitalOcean App Platform and ran into the same issue.
After digging into it for a while, I think the core reason is that sqlite3's WAL mode doesn't work with the filesystems that services like Cloud Run and App Platform use.
https://github.com/CGATOxford/CGATPipelines/issues/39 https://www.sqlite.org/faq.html#q5
SQLite uses reader/writer locks to control access to the database. (Under Win95/98/ME which lacks support for reader/writer locks, a probabilistic simulation is used instead.) But use caution: this locking mechanism might not work correctly if the database file is kept on an NFS filesystem. This is because fcntl() file locking is broken on many NFS implementations. You should avoid putting SQLite database files on NFS if multiple processes might try to access the file at the same time.
My guess is that embedding litestream into the Go application and using PRAGMA locking_mode=EXCLUSIVE is a way to make it run on such filesystems, but I have yet to try that.
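To make the suggestion above concrete, here is a minimal sketch of opening a database with exclusive locking before switching to WAL. It uses Python's standard sqlite3 module rather than Go purely for illustration; the helper name open_exclusive_wal and the pageviews table layout are assumptions, not taken from the Litestream example app. With exclusive locking set before the first WAL access, SQLite keeps the WAL index in process memory instead of a shared-memory file. The trade-off is that no other process can open the database, which is presumably why this would only work with replication embedded in the application. As with the comment above, this is untested on Cloud Run and App Platform.

```python
import os
import sqlite3
import tempfile


def open_exclusive_wal(path):
    """Open a SQLite database with exclusive locking, then enable WAL.

    Hypothetical helper for illustration. The locking mode is set before
    the journal mode so that it is in effect on the first WAL access.
    """
    # isolation_level=None puts the connection in autocommit mode, so the
    # PRAGMA statements run outside of any implicit transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    locking = conn.execute("PRAGMA locking_mode=EXCLUSIVE").fetchone()[0]
    journal = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, locking, journal


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "pageviews.db")
        conn, locking, journal = open_exclusive_wal(db_path)
        # Assumed table layout, only here to force a real write.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pageviews (id INTEGER PRIMARY KEY, ts TEXT)"
        )
        conn.execute("INSERT INTO pageviews (ts) VALUES (datetime('now'))")
        print("locking_mode:", locking, "journal_mode:", journal)
        conn.close()
```

If journal_mode comes back as something other than wal, SQLite could not enable WAL on that filesystem, which is a useful signal in itself.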
Adding a bit of context with regard to DigitalOcean App Platform support. App Platform runs apps on top of gVisor with a virtual filesystem (VFS). Version 1 of the gVisor VFS does not support the F_GETLK fcntl command, which SQLite uses. The next version of the VFS (vfs2) adds support for it, and App Platform hopes to upgrade soon, once some additional functionality and bugs are resolved.
reference gVisor issue: fcntl errors when trying to use F_GETLK #5113
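For anyone who wants to check a platform before deploying, a small probe along these lines might help. It is a sketch, not something from the gVisor issue. It issues an F_GETLK query against a scratch file and reports whether the call succeeded. The function name supports_getlk is hypothetical, and the struct flock layout used here is an assumption that holds for 64-bit Linux only.

```python
import fcntl
import os
import struct
import tempfile


def supports_getlk(directory):
    """Probe whether fcntl(F_GETLK) works for files in `directory`.

    Returns (True, None) on success, or (False, errno) if the call fails.
    Hypothetical helper for illustration.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Assumed struct flock layout (64-bit Linux):
        # short l_type, short l_whence, off_t l_start, off_t l_len, pid_t l_pid
        query = struct.pack("hhqqi", fcntl.F_WRLCK, os.SEEK_SET, 0, 0, 0)
        try:
            # Asks whether a write lock on the whole file would conflict
            # with an existing lock; nothing is actually locked.
            fcntl.fcntl(fd, fcntl.F_GETLK, query)
        except OSError as exc:
            return False, exc.errno
        return True, None
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    ok, err = supports_getlk(tempfile.gettempdir())
    print("F_GETLK supported:", ok, "errno:", err)
```

Run it against the directory where the database would live. A failure with errno EINVAL would match the "invalid argument" errors described in the gVisor issue.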
For what it's worth I ran into this problem while trying to set up litestream + GCS + Cloud Run here: https://github.com/tmc/moderncrud/tree/litestream
@jonfriesen cool! any way I can know that update has happened?
Hi @ngalaiko
I'm keeping an eye on this change, so once it's available I'll make a post in this thread, but in case I get hit by a bus, you can also get updates here:
I'm really excited to get litestream running, I'd love to get automatic cloud native buildpack support for it on App Platform.
same, google cloud run errors
update: using the second gen execution environment (in preview) helped and it works in google cloud run!
@matti does indeed work with the gen2 environment for Cloud Run! Thanks for the heads up.
Cold starts are 5-6s, which is a bit nasty, but Google has promised it will be better by the end of the pre-GA period :-) 50ms for warm starts is pretty awesome though. My little app seems to be chugging along just fine https://litestream-demo-quays3hgzq-ew.a.run.app !
I used the following command to deploy my Cloud Run service (--execution-environment gen2 is the new addition).
PROJECT=$(gcloud config get-value project)
NAME=litestream-demo
TAG=gcr.io/$PROJECT/$NAME
gcloud builds submit --tag $TAG
gcloud beta run deploy $NAME --image $TAG \
--platform=managed \
--region=europe-west1 \
--execution-environment gen2
@benbjohnson should I close this issue?
Thanks to everyone for digging into this issue. Sounds like the second gen environment is working for folks so I'll close this out.
@jonfriesen Do you know if this is fixed in DigitalOcean's Apps now, or where to track progress on that? I'm seeing the same disk I/O error: invalid argument when using the WAL as well.
Hi @leighmcculloch, unfortunately this is still not supported. I'm not sure when it will be, though I am pushing for it internally. It could be a while :(
just wanted to push this again @jonfriesen. I ran into this with the DigitalOcean App platform
Edit: I'd recommend commenting on this issue if you still have problems https://www.digitalocean.com/community/questions/can-i-use-litestream-sqlite-replication-with-app-platform
my sincerest apologies @AvidDabbler. I have since left DO and this is one of the pushes I couldn't get into production. Hopefully one day the team will be able to accomplish it.
@AvidDabbler @leighmcculloch I have great news, App Platform introduced a new runtime and Litestream is now supported. I tested it earlier today with the litestream-docker-example repo and it worked wonderfully.
Hi!
First of all, thank you for a great piece of software.
It might be a terrible idea to try to run litestream and SQLite on Cloud Run, so feel free to close this issue.
Problem
I'm trying to get the litestream+s6 example working on Google Cloud Run (from https://github.com/benbjohnson/litestream-s6-example). If I run the container on Cloud Run, I get the error below
If I run the container locally, it works well
Any idea what might be the problem?
I'm guessing this might be due to Cloud Run's in-memory file system, but I don't know how to fix it (https://cloud.google.com/appengine/docs/standard/go/using-temp-files).