DB data is lost after workspace restart

kaloyan-raev commented 7 years ago

The workspace snapshot does not include MySQL DB data.

Reproduction Steps:

Create and start a new Java-MySQL workspace
Open a Terminal to the DB machine
Create some file, e.g. touch /myfile
Create a new database, e.g. mysql -e "create schema test;"
Confirm that the new database is created: mysql -e "show databases";
Stop the workspace
Start the workspace using the latest snapshot
Confirm that the /myfile is still there: ls -l /myfile;
Confirm that the test database is still there: mysql -e "show databases";
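The two checks after the restart can be sketched as a small script. This is only an illustration: `db_exists` and `check_after_restart` are hypothetical helper names, and the sketch assumes the `mysql` client connects without extra credentials, as in the commands above.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical, not part of Che or MySQL.

# Succeeds (exit 0) if the named schema appears in "show databases".
db_exists() {
    # -N suppresses the column header so only schema names are printed
    mysql -N -e "show databases" | grep -qxF -- "$1"
}

# Report which of the two pieces of state survived the restart.
# The marker path defaults to /myfile, the file created in the steps above.
check_after_restart() {
    marker="${1:-/myfile}"
    if [ -e "$marker" ]; then echo "file: present"; else echo "file: missing"; fi
    if db_exists test; then echo "db: present"; else echo "db: missing"; fi
}
```

Running `check_after_restart` in the DB machine's terminal after starting from the snapshot would print one line for the file and one for the database.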

Expected behavior:

Both /myfile and the test database are available after the restart.

Observed behavior:

/myfile is available, but not the test database.

Che version: 5.0.0-M8-SNAPSHOT
OS and version: Fedora 24
Docker version: 1.10.3
Che install: local build -> che.sh run

Additional information:

Problem started happening recently, didn't happen in an older version of Che: Don't know
Problem can be reliably reproduced, doesn't happen randomly: Yes

kaloyan-raev commented 7 years ago

I found that the DB data is not included in the snapshot because the /var/lib/mysql directory, where the DB data is stored, is declared as a volume in the MySQL image.

The issue is not observed in the PHP stack, which includes the MySQL server, but does not declare the /var/lib/mysql directory as a volume.

So, the issue has a broader scope. It affects any data written to a volume. Such data would be lost on workspace restart, regardless of having snapshots.

ghost commented 7 years ago

@kaloyan-raev Yes, volumes are volumes and are not persisted. We are committing to an image and tagging it. Nothing fancy.

kaloyan-raev commented 7 years ago

So what should I do to keep the DB data between restarts?

TylerJewell commented 7 years ago

I think that anything that is user-mounted should not be saved. We should investigate how to have the actual database be saved within the image (not volume mounted) to avoid this particular issue. Why was the database volume mounted?

kaloyan-raev commented 7 years ago

I didn't mount any volume. I use the Java-MySQL stack as is. I don't even know how to mount a volume in the Che stack definition.

Just the fact that the MySQL docker image declares VOLUME /var/lib/mysql is enough for the /var/lib/mysql directory to be excluded from the snapshot.
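One way to check whether an image declares such a volume is to inspect its configuration. The following is a minimal sketch; `image_volumes` and `declares_volume` are hypothetical helper names, and only `docker inspect` with its `--type` and `--format` options is assumed.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# Print the volumes an image declares, as JSON ("null" if none are declared).
image_volumes() {
    docker inspect --type=image --format '{{json .Config.Volumes}}' "$1"
}

# Succeeds if the image declares the given path as a volume.
declares_volume() {
    image_volumes "$1" | grep -qF "\"$2\""
}
```

For example, `declares_volume mysql /var/lib/mysql && echo "excluded from snapshot"` would flag the image used by the DB machine, assuming that image is available locally.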

ghost commented 7 years ago

@kaloyan-raev maybe do a backup and restore of the DB? Or copy the entire /var/lib/mysql to a different location. An ugly hack though.
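The backup-and-restore idea could look roughly like this. It is a sketch under assumptions: the function names and the dump path `/db-backup.sql` are made up for illustration, and the dump is placed outside `/var/lib/mysql` because files on the regular file system (like `/myfile` above) do survive the snapshot.

```shell
#!/bin/sh
# Sketch only: function names and the dump path are hypothetical.
# The dump must live outside any directory declared as a VOLUME.

DUMP_FILE="${DUMP_FILE:-/db-backup.sql}"

# Run before stopping the workspace: dump every schema to one SQL file.
backup_db() {
    mysqldump --all-databases > "$DUMP_FILE"
}

# Run after the workspace starts again: replay the dump into the fresh server.
restore_db() {
    [ -f "$DUMP_FILE" ] || { echo "no dump at $DUMP_FILE" >&2; return 1; }
    mysql < "$DUMP_FILE"
}
```

Both steps are manual, so forgetting `backup_db` before a stop still loses the data; that is part of why this is a hack rather than a fix.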

kaloyan-raev commented 7 years ago

@karlsson82 For now I just fork the official mysql image Dockerfile, remove the VOLUME /var/lib/mysql and build it on my own: https://hub.docker.com/r/kaloyanraev/mysql-no-volume/

Note that it requires PR #3049 to run successfully.
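For anyone repeating this workaround, removing the instruction from a local copy of a Dockerfile can be sketched as below. `strip_volume` is a hypothetical helper, and the sketch assumes the instruction is written in upper case at the start of a line, as is conventional.

```shell
#!/bin/sh
# Sketch only: strip_volume is a hypothetical helper name.

# Print a Dockerfile with every VOLUME instruction removed.
strip_volume() {
    sed '/^[[:space:]]*VOLUME[[:space:]]/d' "$1"
}
```

A possible use is `strip_volume Dockerfile.orig > Dockerfile` followed by `docker build -t mysql-no-volume .`, where the file names and the tag are placeholders; the build context would also need any files the original Dockerfile copies in.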

dev-gbassanini commented 7 years ago

Hi @kaloyan-raev, did you ever manage to solve your problem (persisting the data, not necessarily in the snapshot)? I just ran a test with the pet-clinic sample project that comes with Che:

Connected to the DB via external client on host (dbeaver)
Created the DB tables and inserted some data
Stopped the workspace via the dashboard (I also have the flag set so it always takes snapshots)
Restarted the workspace
Connected to the db with the same client (dbeaver)
All data and DB Objects are gone

Just used the standard petclinic sample but can't get it to persist data.

I have the same problem with a project of mine that uses PostgreSQL instead. I even configured CHE_WORKSPACE_VOLUME to map a directory on my host to the standard volume defined by the Docker Hub image (/var/lib/postgresql/data), but no luck. I inspected the containers for the dev-machine and the db-machine: the db container had no mount for the database volume, but the dev-machine container did have the mount defined. Isn't it supposed to be on the db-machine? Either way, the data is lost as well.

I'm using che latest version 5.14.

kaloyan-raev commented 7 years ago

@dev-gbassanini I haven't gone beyond what I have already described in my previous comment, which allowed me to save the DB data in the workspace snapshot.

TylerJewell commented 7 years ago

@garagatyi @eivantsov - would the proper fix to this type of issue require us analyzing Dockerfile or Composefile to remove any fixed VOLUME statements? That would work for raw recipes, but would not work for any off-the-shelf images that are downloaded.

ghost commented 7 years ago

@TylerJewell Building your own image, as @kaloyan-raev has suggested, is one option.

TylerJewell commented 7 years ago

That is a solution, but not really a reliable solution at large. I'm brainstorming whether there are any system implementations we could explore to work around this limitation?

It's possible that this is elegantly handled with Che on k8s - as image VOLUME statements are probably mapped to persistent volumes. That could be an acceptable answer - that we would not support this type of images on Docker, but only on k8s.

garagatyi commented 7 years ago

What if a user added a volume precisely to exclude a heavyweight folder from the snapshot and get faster snapshotting/restoring?

TylerJewell commented 7 years ago

I'm not sure I follow how adding a volume excludes a heavyweight folder from the snapshot process. But, thinking this through, the move to k8s may solve these issues regardless, in a couple of ways:

a) We should have a way to map VOLUME statements in images to k8s persistent volumes.
b) If PVs are as fast as we expect, end users can keep more folders, such as their NPM or Maven repository, in their PVs and out of the snapshot.

If persistent volumes live up to the performance expectations, the need for snapshots goes down, and any snapshots that remain are faster because what is added is a smaller file system layer.

garagatyi commented 7 years ago

Volumes are not included in a snapshot. What you describe is an interesting approach, but it is specific to k8s. Are you considering implementing a similar approach in Docker, abandoning Docker and using only k8s, or having a different policy for the Docker environment implementation?

TylerJewell commented 7 years ago

I think we are starting to have a clear understanding of the difference between an enterprise-class version of Che and a version of Che that is not enterprise grade. The differences relate to:

Scalability
High availability
Storage and networking management (ie, single ports)
Resource management

I think we are comfortable saying that Che on Docker is a reasonable workgroup solution, but as a result of that deployment there are a couple limitations that it imposes:

No multi-server deployment, so no HA or scalability unless you implement something yourself
Direct file system storage mounts, which are slow
Limited mounting of file systems that are not used for the project
Limited image caching to whatever is tied to your server

However, there is an approach for overcoming all of these limitations with a k8s-based infrastructure. And if your project needs one of these limitations lifted, then you need an enterprise solution. And I am OK with saying that k8s is the only package that we are providing as an enterprise solution unless the community contributes an update for Swarm or Cloud Foundry.

So with that, I would not implement a similar approach for the Docker solution. Instead, I would just test this to work and verify on k8s.

WDYT?

dev-gbassanini commented 7 years ago

I don't think persisting data within the snapshot would be the best solution. I've read that when you have big volumes in the docker world you should go with a separate data container. Anyway I'm understanding the only viable solution right now for data persistence would be to remove any VOLUME statements from the image and just let the snapshot take care of everything?

TylerJewell commented 7 years ago

Yes, that is correct.

Sorry I hijacked your thread to talk about the larger issue and possible design solutions for it in the future.

dev-gbassanini commented 7 years ago

Got it! Thanks @TylerJewell. Just one final clarification...the solution you're thinking about will be only for enterprise-che?

TylerJewell commented 7 years ago

It would only be for Che on Kubernetes. We are working to provide two packages of Che: standalone Docker and on Kubernetes. While we will try to leave the IDE / workspace functionality identical between the two deployments, "enterprise" features that are infrastructure-dependent would only be practical with Kubernetes.

So I think we could say that the flexibility in what gets snapshotted is much broader on Kubernetes than what is available on Docker. And that snapshotting flexibility would be a requirement for any sizable enterprise that needs to optimize performance.

ghost commented 7 years ago

I am closing this issue with the suggested workaround: build your own mysql image without VOLUME /var/lib/mysql

Theoretically it is possible to mount a host_path into dev-machine and then make sure db machine inherits this volume through volumes_from. I haven't tried it though.
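In plain Docker terms, and equally untested, the idea might be sketched as follows. This is not a Che stack definition: the container names, the dev image name and the host path are all placeholders, and a real MySQL container would need further options (such as its root password settings) that are omitted here.

```shell
#!/bin/sh
# Sketch only, using plain Docker rather than a Che stack definition.
# Container names, the dev image name and the host path are placeholders.

start_with_shared_volume() {
    host_dir="$1"    # directory on the host that should hold the DB files
    dev_image="$2"   # image used for the dev machine

    # The dev machine bind-mounts the host directory at MySQL's data path.
    docker run -d --name dev-machine -v "$host_dir:/var/lib/mysql" "$dev_image" || return 1

    # The db machine inherits that mount via --volumes-from, so the data
    # directory should be backed by the host directory.
    docker run -d --name db-machine --volumes-from dev-machine mysql
}
```

With this layout the data would live on the host rather than in the snapshot, so it would depend on the host directory being kept, not on the snapshot being restored.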

eclipse-che / che

DB data is lost after workspace restart #3054