Closed simonw closed 2 years ago
Documentation: https://fly.io/docs/reference/volumes/
First create it:
fly volumes create myapp_data --region lhr --size 40
(That size is in GB)
Then mount it using this in fly.toml:
[mounts]
source="myapp_data"
destination="/data"
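The two steps above could be scripted from Python. This is a minimal sketch rather than the plugin's actual code: volume_create_command and mounts_section are hypothetical helper names, and the only flags used are the ones shown in the fly volumes create command above.

```python
def volume_create_command(name, region, size_gb):
    # Builds the argument list for the "fly volumes create" call shown above.
    # Hypothetical helper: pass the result to subprocess.run() to execute it.
    return [
        "fly", "volumes", "create", name,
        "--region", region,
        "--size", str(size_gb),  # size is in GB
    ]


def mounts_section(source, destination="/data"):
    # Renders the [mounts] block for fly.toml as a string.
    return '[mounts]\nsource="{}"\ndestination="{}"\n'.format(source, destination)


if __name__ == "__main__":
    print(" ".join(volume_create_command("myapp_data", "lhr", 40)))
    print(mounts_section("myapp_data"))
```

Keeping the command as a list rather than a shell string avoids quoting problems if the volume name comes from user input.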
Initial thoughts on the design: add a new option that causes a volume to be created and then mounted on the deployed Datasette instance - like this:
datasette publish fly demo.db --app myflyapp \
--region sfo --volume-size 1G --rw test.db
This would create a volume called myflyapp-volume and mount it as /data. It would then create a new empty read-write database file in /data/test.db and start Datasette against it. It would package and deploy demo.db too, as an immutable database baked into the image.
I don't really like that --rw test.db option. Also, it would be nice if an existing database could be copied up to the new /data volume, though that's hard since I can't see an obvious Fly mechanism for copying files to a volume yet.
For the first prototype it would be really fun to get datasette-tiddlywiki working.
Asked about copying files to a volume here: https://twitter.com/simonw/status/1484419391475236864
Idea for a workaround: an authenticated Datasette plugin that supports uploading a SQLite file and stashing it in a directory is something I've wanted for a while already.
Frictionless authentication is going to be a big deal here: most Datasette plugins that write to a database reasonably expect users to be authenticated somehow.
So might need a good mechanism for setting up some kind of auth. Could even generate a password and output it to the console at deploy time (if user doesn't specify one).
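Generating a password when none is supplied could look something like this. It is only a sketch: resolve_password is a hypothetical helper name, and how the password would actually be passed to Datasette is not decided here.

```python
import secrets


def resolve_password(user_supplied=None):
    # Returns (password, was_generated). A password is generated only when
    # the user did not specify one. Hypothetical helper for illustration.
    if user_supplied:
        return user_supplied, False
    return secrets.token_urlsafe(16), True


if __name__ == "__main__":
    password, generated = resolve_password()
    if generated:
        # Output it to the console at deploy time
        print("Generated password: {}".format(password))
```

The secrets module is used instead of random because it is intended for security-sensitive values.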
Maybe --create-volume 1 to create a new 1GB volume, and --volume name-of-volume to mount an existing named volume.
OK, I figured out how this needs to work in:
I'm going to design it like this:
- datasette publish fly myfile.db (without -a) will create an app with the --generate-name flag.
- myfile.db -a name will use the specified name, creating the app if it doesn't exist yet and deploying to the existing app otherwise.
- --create-volume 1 will create a new 1GB volume with a name derived from the app name and attach that as /data, but it will require at least one --rw option to create databases in /data once the deploy has completed, using datasette serve ... /data/blah.db --create
- --volume name-of-volume will attach the existing named volume, and raise an error if no volume of that name exists.
Here's a nasty edge-case: what should happen in the following example:
# Create and deploy a new instance
datasette publish fly fixtures.db --create-volume 1 -a tiddlywiki --install datasette-tiddlywiki --rw tiddlywiki
# App called "tiddlywiki" should now be live with a `/data/tiddlywiki.db` database attached in that volume
# But now we try to deploy again to the same named app, but without the --create-volume flag
datasette publish fly fixtures.db -a tiddlywiki --install datasette-tiddlywiki --install datasette-graphql
Here the first example creates a new volume called tiddlywiki_volume and attaches it to the deployment. But the second one should presumably still configure the existing application to mount that volume even though it wasn't specified on the command line.
Which means the tooling needs to be able to spot when an existing app has volumes attached to it and re-mount that volume for future deployments, by including it in the generated fly.toml file.
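The decision described here can be sketched as a pure function. Everything in it is illustrative: volume_to_mount is a hypothetical name, the volume naming scheme is an assumption, and picking the first existing volume when an app has several is an arbitrary choice.

```python
def volume_to_mount(app, create_volume_gb=None, volume=None, rw=(), existing_volumes=()):
    """Return the volume name to write into fly.toml, or None for no mount.

    existing_volumes is the list of volume names the app already has.
    """
    if volume is not None:
        # --volume: must refer to a volume that already exists
        if volume not in existing_volumes:
            raise ValueError("No volume named {!r} exists".format(volume))
        return volume
    if create_volume_gb is not None:
        # --create-volume: needs at least one --rw database to create
        if not rw:
            raise ValueError("--create-volume requires at least one --rw option")
        # Assumed naming scheme, derived from the app name
        return "{}_volume".format(app.replace("-", "_"))
    if existing_volumes:
        # Neither option was passed, but the app already has a volume:
        # re-mount it so that a redeploy does not detach the data
        return existing_volumes[0]
    return None


if __name__ == "__main__":
    # Second deploy from the example: no volume flags, volume already exists
    print(volume_to_mount("tiddlywiki", existing_volumes=["tiddlywiki_volume"]))
```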
I was hoping the output from fly apps list --json might help here, but it doesn't list my app as having any volumes even though it does:
{
"ID": "simon-tiddlywiki-3",
"Name": "simon-tiddlywiki-3",
"State": "",
"Status": "running",
"Deployed": true,
"Hostname": "simon-tiddlywiki-3.fly.dev",
"AppURL": "",
"Version": 0,
"Release": null,
"Organization": {
"ID": "",
"InternalNumericID": "",
"Name": "",
"Slug": "personal",
"Type": "",
"Domains": {
"Nodes": null,
"Edges": null
},
"WireGuardPeers": {
"Nodes": null,
"Edges": null
},
"DelegatedWireGuardTokens": {
"Nodes": null,
"Edges": null
},
"HealthCheckHandlers": null,
"HealthChecks": null,
"LoggedCertificates": null
},
"Secrets": null,
"CurrentRelease": {
"ID": "",
"Version": 0,
"Stable": false,
"InProgress": false,
"Reason": "",
"Description": "",
"Status": "",
"DeploymentStrategy": "",
"User": {
"ID": "",
"Name": "",
"Email": ""
},
"CreatedAt": "2022-01-21T23:05:53Z"
},
"Releases": {
"Nodes": null
},
"IPAddresses": {
"Nodes": null
},
"IPAddress": null,
"Builds": {
"Nodes": null
},
"SourceBuilds": {
"Nodes": null
},
"Changes": {
"Nodes": null
},
"Certificates": {
"Nodes": null
},
"Certificate": {
"ID": "",
"AcmeDNSConfigured": false,
"AcmeALPNConfigured": false,
"Configured": false,
"CertificateAuthority": "",
"CreatedAt": "0001-01-01T00:00:00Z",
"DNSProvider": "",
"DNSValidationInstructions": "",
"DNSValidationHostname": "",
"DNSValidationTarget": "",
"Hostname": "",
"Source": "",
"ClientStatus": "",
"IsApex": false,
"IsWildcard": false,
"Issued": {
"Nodes": null
}
},
"Config": {
"Definition": null,
"Services": null,
"Valid": false,
"Errors": null
},
"ParseConfig": {
"Definition": null,
"Services": null,
"Valid": false,
"Errors": null
},
"Allocations": null,
"Allocation": null,
"DeploymentStatus": null,
"Autoscaling": null,
"VMSize": {
"Name": "",
"CPUCores": 0,
"MemoryGB": 0,
"MemoryMB": 0,
"PriceMonth": 0,
"PriceSecond": 0
},
"Regions": null,
"BackupRegions": null,
"Volumes": {
"Nodes": null
},
"TaskGroupCounts": null,
"ProcessGroups": null,
"HealthChecks": null,
"PostgresAppRole": null,
"Image": null,
"ImageUpgradeAvailable": false,
"ImageVersionTrackingEnabled": false,
"ImageDetails": {
"Registry": "",
"Repository": "",
"Tag": "",
"Version": "",
"Digest": ""
},
"LatestImageDetails": {
"Registry": "",
"Repository": "",
"Tag": "",
"Version": "",
"Digest": ""
}
}
Thankfully it looks like fly volumes list -a simon-tiddlywiki-3 --json does show me what I need to know for this:
[
{
"id": "vol_wod56vj56dm4ny30",
"App": {
"Name": ""
},
"Name": "simon_tiddlywiki_volume_3",
"SizeGb": 1,
"Snapshots": {
"Nodes": null
},
"Region": "sjc",
"Encrypted": true,
"CreatedAt": "2022-01-21T21:11:21Z",
"AttachedAllocation": {
"ID": "",
"IDShort": "057f0672",
"Version": 0,
"TaskName": "app",
"Region": "",
"Status": "",
"DesiredStatus": "",
"Healthy": false,
"Canary": false,
"Failed": false,
"Restarts": 0,
"CreatedAt": "0001-01-01T00:00:00Z",
"UpdatedAt": "0001-01-01T00:00:00Z",
"Checks": null,
"Events": null,
"LatestVersion": false,
"PassingCheckCount": 0,
"WarningCheckCount": 0,
"CriticalCheckCount": 0,
"Transitioning": false,
"PrivateIP": "",
"RecentLogs": null,
"AttachedVolumes": {
"Nodes": null
}
},
"Host": {
"ID": "c0a5"
}
}
]
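Pulling the volume names out of that output could look like the sketch below. The helper names volume_names and list_volume_names are hypothetical. The "or []" guard is a defensive assumption, since I have not checked what the command prints for an app with no volumes.

```python
import json
import subprocess


def volume_names(volumes_json):
    # Extracts the "Name" of each volume from "volumes list --json" output.
    # Treats a JSON null the same as an empty list (defensive assumption).
    return [volume["Name"] for volume in (json.loads(volumes_json) or [])]


def list_volume_names(app):
    # Runs flyctl for real: requires flyctl to be installed and authenticated.
    proc = subprocess.run(
        ["flyctl", "volumes", "list", "-a", app, "--json"],
        capture_output=True,
        check=True,
    )
    return volume_names(proc.stdout)


if __name__ == "__main__":
    # Trimmed-down version of the output shown above
    sample = '[{"id": "vol_wod56vj56dm4ny30", "Name": "simon_tiddlywiki_volume_3", "SizeGb": 1}]'
    print(volume_names(sample))
```

Separating the parsing from the subprocess call means the parsing can be unit tested without a Fly account.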
I'm also going to add an integration test suite, similar to the one in s3-credentials (here), that exercises Fly directly so I can spot breaking changes better in the future.
Basic setup of the integration suite is to add this to conftest.py:
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--integration",
        action="store_true",
        default=False,
        help="run integration tests",
    )


def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "integration: mark test as integration test, only run with --integration",
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--integration"):
        # Also run integration tests
        return
    skip_integration = pytest.mark.skip(reason="use --integration option to run")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip_integration)
And then this in test_integration.py:
# These integration tests only run with "pytest --integration" -
# they execute live calls against Fly and clean up after themselves
from click.testing import CliRunner
import pytest

# Mark all tests in this module with "integration":
pytestmark = pytest.mark.integration


@pytest.fixture(autouse=True)
def cleanup():
    cleanup_any_resources()
    yield
    cleanup_any_resources()


def test_basic():
    pass


def cleanup_any_resources():
    pass
import json
import subprocess

def cleanup_any_resources():
    # Replaces the stub above: destroy any leftover temporary test apps
    proc = subprocess.run(["flyctl", "apps", "list", "--json"], capture_output=True)
    apps = json.loads(proc.stdout)
    app_names = [app["Name"] for app in apps]
    # Delete any starting with publish-fly-temp-
    to_delete = [
        app_name for app_name in app_names if app_name.startswith("publish-fly-temp-")
    ]
    for app_name in to_delete:
        subprocess.run(["flyctl", "apps", "destroy", app_name, "--yes", "--json"])
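One possible refinement, sketched here as an option and not as what the test suite actually does: split the name filter in cleanup_any_resources() into a pure function, so the selection logic can be checked without calling Fly. The names apps_to_delete and TEMP_PREFIX are hypothetical.

```python
TEMP_PREFIX = "publish-fly-temp-"


def apps_to_delete(apps, prefix=TEMP_PREFIX):
    # apps is the parsed "flyctl apps list --json" output: a list of dicts,
    # each with a "Name" key. Only temporary test apps are selected.
    return [app["Name"] for app in apps if app["Name"].startswith(prefix)]


if __name__ == "__main__":
    sample = [
        {"Name": "simon-tiddlywiki-3"},
        {"Name": "publish-fly-temp-abc123"},
    ]
    print(apps_to_delete(sample))
```

Because the prefix check is the only thing protecting real apps from flyctl apps destroy, it is worth having a fast test around it.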
Moving this to a PR.
Wrote about it here: https://simonwillison.net/2022/Feb/15/fly-volumes/
Especially interesting given the 3GB available now in the free tier: https://fly.io/blog/free-postgres/