OpenRefine / containers

Collection of containerized packages of OpenRefine
BSD 3-Clause "New" or "Revised" License

Dockerized deployment #1

Open kushthedude opened 3 years ago

kushthedude commented 3 years ago

Proposed solution

OpenRefine currently runs directly on a bare-metal machine. Given current trends, a Dockerized deployment is no longer considered a feature but a basic requirement for any widely used project. I propose adding a Dockerfile and a containerized way to run OpenRefine on the user's local system, as well as setting up a Docker Hub repository to push the latest OpenRefine images to.

wetneb commented 3 years ago

I think it could be nice to officially support running with Docker. I suspect it should not be too much work maintaining the Dockerfile. It would probably add a step in the release process though (build the image and push it to the Docker repository).
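The extra release step mentioned here could look roughly like the sketch below. It assumes the Dockerfile sits in the current directory; the image name and version passed in are hypothetical placeholders, and the DOCKER variable exists only so the command can be swapped out for a dry run.

```shell
# Command used to talk to Docker; override (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# release_image IMAGE VERSION
# Build the image from the Dockerfile in the current directory,
# tag it as IMAGE:VERSION, and push it only if the build succeeded.
release_image() {
    image="$1"
    version="$2"
    "$DOCKER" build -t "$image:$version" . &&
    "$DOCKER" push "$image:$version"
}
```

For example, calling release_image example/openrefine 1.0 (both arguments are placeholders) would build and then push example/openrefine:1.0.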

elroykanye commented 2 years ago

@kushthedude, are you still working on this?

frafra commented 1 year ago

I made this: https://github.com/NINAnor/openrefine. The Docker image can be found here: https://github.com/NINAnor/openrefine/pkgs/container/openrefine

It would be nice to build it using Maven instead of fetching the tarball of the latest stable release.

sepastian commented 1 year ago

Any progress on this?

frafra commented 11 months ago

Well, I updated my Dockerfile so that OpenRefine is built from source, and I build multiple versions and branches using GitHub Actions and the GitHub Container Registry: https://github.com/NINAnor/openrefine/pkgs/container/openrefine/versions. It works just fine for me. Feedback is welcome. I can make a PR here if the maintainers like the approach I used.

wetneb commented 11 months ago

If you are building with Maven, then you probably don't need to run mvn package but just ./refine build. The package goal generates the .zip files for Windows, .tar.gz files for Linux and .dmg for macOS, but you are not relying on any of those in the rest of the Docker image, so it's likely just taking more time in your build and inflating the image size unnecessarily.

In terms of providing official support for it in the OpenRefine project, maybe we could have a separate repository for it - perhaps where other packaging configurations (flatpak, appimage…) could live.

frafra commented 11 months ago

Thank you; I am now using mvn -B process-resources compile test-compile, since ./refine build calls npm to build the frontend, which is done in a different stage of the container. I could probably also copy fewer files: https://github.com/NINAnor/openrefine/blob/883c18495a2e84e122a20b0bcc09c49d1a8e1d70/Dockerfile#L21-L24.

I can maintain the Docker repository (we are using these images in production and for testing internal APIs across different versions), and then you can move it to the OpenRefine organization when you consider that ready.

Caching of dependencies needs to be improved, but I am not familiar with Maven, so it would be easier if I could get some help with that. Having an initial goal that only fetches dependencies, so that the rest of the build can avoid connecting to the internet entirely, would be great.
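One possible way to split the build into a fetch step and an offline step is sketched below. It relies on the dependency:go-offline goal of the Maven Dependency Plugin and on Maven's -o (offline) flag. Whether go-offline resolves everything this multi-module build needs is an assumption that has not been checked against the OpenRefine sources, and the MVN variable and function names are illustrative only.

```shell
# Maven executable; override (e.g. MVN=echo) for a dry run.
MVN="${MVN:-mvn}"

# Step 1: download dependencies and plugins into the local repository.
# In a Dockerfile this would ideally run in its own layer, after copying
# only the pom.xml files, so the layer is reused until a pom.xml changes.
fetch_dependencies() {
    "$MVN" -B dependency:go-offline
}

# Step 2: run the goals from the comment above in offline mode (-o),
# so Maven fails instead of reaching out to the network.
build_offline() {
    "$MVN" -B -o process-resources compile test-compile
}
```

Running fetch_dependencies followed by build_offline would then show quickly whether anything is still missing from the cache, because the offline step fails on the first artifact it cannot find locally.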

sepastian commented 11 months ago

> If you are building with Maven, then you probably don't need to run mvn package but just ./refine build. The package goal generates the .zip files for Windows, .tar.gz files for Linux and .dmg for macOS, but you are not relying on any of those in the rest of the Docker image, so it's likely just taking more time in your build and inflating the image size unnecessarily.
>
> In terms of providing official support for it in the OpenRefine project, maybe we could have a separate repository for it - perhaps where other packaging configurations (flatpak, appimage…) could live.

+1 for a separate repository containing build/packaging files.

wetneb commented 10 months ago

The repository has now been created, so I am migrating this issue there: https://github.com/OpenRefine/containers. Feel free to open a PR to add a Dockerfile there.

frafra commented 8 months ago

Added Dockerfile and GitHub CI. Getting there! :)

https://github.com/OpenRefine/containers/pull/3

heryk commented 7 months ago

Hi, I get the following log error message in my container when trying to run the docker compose file. Any idea what is going wrong? Thanks

2024-01-23 11:30:17 openrefine-1  | + template=refine.ini.template
2024-01-23 11:30:17 openrefine-1  | + '[' -f refine.ini.template ']'
2024-01-23 11:30:17 openrefine-1  | + envsubst
2024-01-23 11:30:17 openrefine-1  | + exec /opt/openrefine/refine -i 0.0.0.0 -d /workspace run
2024-01-23 11:30:17 openrefine-1  | Using refine.ini for configuration
2024-01-23 11:30:17 openrefine-1  | No host specified while binding to interface 0.0.0.0, guessing localhost.
2024-01-23 11:30:17 openrefine-1  | -------------------------------------------------------------------------------------------------
2024-01-23 11:30:17 openrefine-1  | You have 31939M of free memory.
2024-01-23 11:30:17 openrefine-1  | Your current configuration is set to use 2000M of memory.
2024-01-23 11:30:17 openrefine-1  | OpenRefine can run better when given more memory. Read our FAQ on how to allocate more memory here:
2024-01-23 11:30:17 openrefine-1  | https://openrefine.org/docs/manual/installing#increasing-memory-allocation
2024-01-23 11:30:17 openrefine-1  | -------------------------------------------------------------------------------------------------
2024-01-23 11:30:17 openrefine-1  | 
2024-01-23 11:30:17 openrefine-1  | Invalid initial heap size: -Xms2000M
2024-01-23 11:30:17 openrefine-1  | Error: Could not create the Java Virtual Machine.
2024-01-23 11:30:17 openrefine-1  | Error: A fatal exception has occurred. Program will exit.

frafra commented 7 months ago

@heryk I cannot reproduce this; are you on a 32-bit system?
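Besides a 32-bit JVM, another possibility worth ruling out (an assumption, not something confirmed in this thread) is an invisible trailing character in the memory setting, for example a carriage return left behind when a configuration file was saved with Windows line endings. That would make a value that looks like 2000M invalid to the JVM. The helper names below are hypothetical; the sketch only checks a file for carriage returns and prints a cleaned copy.

```shell
# has_cr FILE: succeed if FILE contains at least one carriage return.
has_cr() {
    grep -q "$(printf '\r')" "$1"
}

# strip_cr FILE: print the contents of FILE with all carriage returns removed.
strip_cr() {
    tr -d '\r' < "$1"
}
```

For instance, has_cr refine.ini.template could be run on the template named in the log above; if it succeeds, strip_cr refine.ini.template prints a version with Unix line endings that can be saved to a new file and used in its place.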