christianlupus commented 4 years ago

Feature description

In addition to the native Docker image, it would be nice to have a multiarch image to support ARM architectures and the like.

References

Running PartKeepr on a raspberry pi might be simplified a lot by such an image.

Alternatives

Well-documented installation instructions would lead to a similar result, but with more effort and experience needed from the users/admins.

Additional information

Related to #1088

Forceu commented 4 years ago

I created a Dockerfile that is able to cross compile to other platforms: https://github.com/Forceu/partkeepr_docker/

I haven't tried it on a RPi yet, but it should work out of the box. To pull it, docker pull f0rc3/partkeepr:arm64v8-latest or docker pull f0rc3/partkeepr:arm32v7-latest can be used

christianlupus commented 4 years ago

There are a few issues with your code, unfortunately. (I had the same problems in #1088.)

You do not provide a license. We cannot include something proprietary in open source (without a license it's proprietary).
You are not providing a multiarch image but multiple images for different architectures. The difference is that with a multiarch image you do not have to specify the architecture during pull; Docker handles it internally. See the docs on docker.com.
You are implicitly using an Apache web server. I had the issue that some users complained that nginx would be better/faster/easier/.... For a production image (as this one should be), I can understand the complaints and the considerations to use php-fpm over an Apache-centered build (multitasking, memory consumption, freedom of HTTP server).

The last point is the funny part. We have not come up with a really good, working solution for development yet. I went quite a few steps down that route in my work on the development setup. The rough idea is to synchronize the containers (PHP/HTTP) upon container start. For production this might be well suited, for development not so much. I did not remove the branch because I planned to come back to this later when the time has come for it. So consider it mid-development and do question each line.

If you are willing to participate and provide that part of PartKeepr, we would happily accept it. Just fork the repo, make your changes and open a PR. It might also be a good idea to make the others aware of your steps so that no work is done twice.
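
As an illustration of the multiarch point above: with a multiarch image, one tag points to a list of per-platform images, and the client picks the entry matching its own platform. The following is only a simplified Python model of that selection, not Docker's actual code. The digests are placeholders and `resolve` is a hypothetical helper.

```python
from typing import Optional

# Hypothetical manifest list behind a single tag; the digests are placeholders.
MANIFEST_LIST = [
    {"platform": ("linux", "amd64", None), "digest": "sha256:aaa"},
    {"platform": ("linux", "arm64", "v8"), "digest": "sha256:bbb"},
    {"platform": ("linux", "arm", "v7"), "digest": "sha256:ccc"},
]


def resolve(manifests, os_name: str, arch: str, variant: Optional[str] = None) -> Optional[str]:
    """Return the digest of the entry matching the client's platform, or None."""
    for entry in manifests:
        entry_os, entry_arch, entry_variant = entry["platform"]
        if (entry_os, entry_arch) != (os_name, arch):
            continue
        # The variant is only compared when both sides specify one.
        if variant is None or entry_variant is None or entry_variant == variant:
            return entry["digest"]
    return None
```

With separate per-architecture tags, the user has to do this selection by hand by choosing the tag; with a manifest list the same `docker pull` works on every supported platform.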

Forceu commented 4 years ago

Thank you for the quick reply! I didn't realise that you were working on a Docker image as well, as there weren't any links to official images. The comment was more or less a plug, but I would be very happy to contribute to the project and create pull requests!

I added a GPL license to the project
I was unaware that it is possible to create multiarch images with buildx (I am only using podman, which does not have this option). I will look into it however!
Personally I exclusively use nginx as well; however, as the docs mentioned that it was built with Apache in mind, I chose apache2. Changing it to nginx would be no problem, however.
What exactly do you mean by "synchronizing the containers"? Do you want a new image for each git commit or rather an image that pulls the commits when started? (Or do you mean something completely different? :D)

By the way, is it possible to run the setup / upgrade without user interaction? Because at the moment I need to rerun the setup every time I change / update the container image.

christianlupus commented 4 years ago

I am a bit short on time at the moment. Some things might be discussed faster in IRC #partkeepr.

GPL is good. Thank you.

The thing with buildx I stumbled upon a few years back. I was not aware of it either.

The thing with Apache is that it scales comparatively badly, at least when using the prefork variant that allows parallelized PHP usage. The others can only use multiple PHP instances if PHP-FPM is set up. So even with Apache the php-fpm approach might be a good option.

The problem with the synchronization is that some files are served directly by the HTTP server (whichever program that is exactly) and many are piped through PHP. So the request first reaches the HTTP server, which then decides whether to forward it to PHP living in another container. These two containers (HTTP and PHP) need to work on a common file structure (in a common absolute path) in order to resolve the file paths correctly.

Now either the two containers must be perfectly in sync (where the PHP files can be located in the images themselves) or the relevant files must be located in a common volume (be it host mounted folder or dedicated volume). This is the sync issue.

While the first option might be favorable in terms of code localization, the issue is just shifted to the user/admin using the containers. Therefore I suggest going with option 2.

Now there are multiple options.

1. The containers contain only the basic HTTP and PHP services and do not include anything PK related. The user must extract a tarball into a host folder (mounted in both containers) and keep it up to date. In fact, we could also write good documentation on how to set up general Docker containers.
2. Same as 1, but one of the containers checks upon each start (possibly conditionally) whether an update is available (e.g. from GitHub), downloads and installs it, and then starts the main service. One container can be a vanilla container while the other has special startup code.
3. The image itself is versioned. One container has the corresponding version of PK "on board". Upon entrypoint it synchronizes the internal PK installation (excluding parameters and data) with the volume/host folder. This has the benefit that the version can simply be controlled by the Docker image version in use.

The problem with 1 is that the user needs to know how to set up the system (why use Docker at all then?), and with number 2 that the user is forced to update on each restart of the container. That might not be favorable in case of bugs that need some time to fix.
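
A rough sketch of the third option (sync on entrypoint, excluding parameters and data) could look like the following. This is only an illustration under assumptions: the function name `sync_install` and the preserved names `parameters.php` and `data` are placeholders, not necessarily what PartKeepr or any existing image uses.

```python
import shutil
from pathlib import Path

# Assumed names of the entries that hold local configuration and user data.
PRESERVED = {"parameters.php", "data"}


def sync_install(bundled: Path, volume: Path) -> None:
    """Copy the release bundled in the image into the shared volume.

    Entries listed in PRESERVED are never copied, so whatever already
    exists in the volume under those names stays untouched.
    """
    def skip_preserved(_directory, names):
        # copytree calls this per directory; returned names are skipped.
        return [name for name in names if name in PRESERVED]

    shutil.copytree(bundled, volume, ignore=skip_preserved, dirs_exist_ok=True)
```

Note that this sketch only overwrites and adds files. Files that were removed in a newer release would remain in the volume, so a real entrypoint would need some cleanup step as well. `dirs_exist_ok` requires Python 3.8 or newer.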

Forceu commented 4 years ago

Okay, so I tried a docker buildx workflow on the repo and the image builds without any problems: https://hub.docker.com/layers/f0rc3/partkeepr/multiarch-latest/images/sha256-4d068f82faec663066ac1da88dcc53794a79e5f7de084353195cc3ad56c619eb?context=repo

If you want to, I can create a Pull Request, so that on each commit Github builds the image automatically.

> These two containers (HTTP and PHP) need to work on a common file structure (in a common absolute path) in order to resolve the file paths correctly.

So if I understand it correctly, you are using two different containers for the webserver (nginx / apache) and for the PK filesystem? To me that wouldn't really make sense, as it could simply be included in a single container. Also I would always create a new container for each new version (or commit for dev version). This can be automated with a Github workflow.

So if you want to, I can modify the image to use Nginx instead of Apache and then create a pull request, or you could copy/modify my Github workflow to have a multiarch image created for the already existing dockerfiles in your repo: https://github.com/Forceu/partkeepr_docker/blob/master/.github/workflows/image.yml

christianlupus commented 4 years ago

> So if I understand it correctly, you are using two different containers for the webserver (nginx / apache) and the PK filesystem? To me that wouldn't really make sense, as it could simply be included in a single container.

A single container would be nice but raises other issues and violates Docker principles.

Just to make sure we are on the same page: When using another HTTP server than apache with prefork handler enabled, we need to push PHP to another service (typically php-fpm is used). So we have to run (apart from DB and such) two services: HTTP and PHP/php-fpm.

When doing this, the "pure" way is to have two containers. One is the HTTP container and one the php-fpm one. The alternative of running an init service in a container causes quite some trouble and is discouraged. Also, replacing individual parts of the system (PHP or HTTP) with your favorite/desired/.... version is not possible if everything is packed into a single container.

Do you see the issue as well? Or have I been misled for almost a year?

> Also I would always create a new container for each new version (or commit for dev version). This can be automated with a Github workflow.

That would definitely make sense if the PK sources are somehow packed into the containers.

Forceu commented 4 years ago

For a different project I am using a single container with nginx, php-fpm, a custom python service and a custom websocket server together, based on the Linuxserver Nginx Image. According to Dockerhub these images have 1.2M pulls (I doubt it though, as it is quite a niche project) and so far no one has had any problems in that regard. So personally I think it makes a lot more sense to combine both in a single image, otherwise it will cause a lot of inconveniences.

In regards to choosing the HTTP server or PHP version: Is that really required? From a user point of view it "just has to work" and with nginx it would already be quite efficient resource wise. For a dev environment it would make sense I guess, but in that case it would be easier to maintain if you create different dockerfiles / image versions for specific PHP environments.

Otherwise if you do want to run the PHP-FPM service on a different container, it doesn't even need access to the files or be synchronised. You can tell nginx to pass all PHP data to a fast-cgi port, which could be provided by the second container. That way you have a container for nginx/PK and one for PHP without having to synchronise them.

christianlupus commented 4 years ago

Fair enough. If we have one container just publishing port 80 that should be well suited in my understanding.

The thing is we need to understand how this works at least partially to ensure it is running. I do not get how multiple services are run in one container. How are these started? I tried to work through the Dockerfiles but do not directly understand how it works. Until next week my time is quite restricted, so I have to prioritize that above any involved work for PK.

> Otherwise if you do want to run the PHP-FPM service on a different container, it doesn't even need access to the files or be synchronised. You can tell nginx to pass all PHP data to a fast-cgi port, which could be provided by the second container. That way you have a container for nginx/PK and one for PHP without having to synchronise them.

This is something I doubt. I did not find anything related to that on the internet. Would be interesting...

Forceu commented 4 years ago

To be honest, I am not super experienced in Docker - however from my understanding the reason why some people say it is better to only have one process per container is that only that one is monitored. So for example if you are running mysql in background and it crashes, the container would not notify you. For critical infrastructure this would not be ideal, for something like this you could simply restart the container and everything would be fine.

> How are these started?

Basically you run a startscript or an init.d script. In the one linked above I simply added /usr/bin/nohup /usr/bin/php /app/bbuddy/wsserver.php &, which is a non-blocking command to run the websocket server in background.

> I did not find anything related to that on the internet.

PHP-FPM can be used with unix sockets or tcp sockets. Here is an example: https://serversforhackers.com/c/php-fpm-configuration-the-listen-directive

I have some time today, so I will probably try to build the partkeepr image with nginx from scratch. :)

christianlupus commented 4 years ago

> To be honest, I am not super experienced in Docker - however from my understanding the reason why some people say it is better to only have one process per container is that only that one is monitored. So for example if you are running mysql in background and it crashes, the container would not notify you. For critical infrastructure this would not be ideal, for something like this you could simply restart the container and everything would be fine.

Not only. There are other things to consider:

Logging: You cannot directly see the output of a (non-main) service started by your approach without going into the container.
Zombie processes: When many processes are forked, it can happen that (when programmed badly) so-called zombie processes are created. These need to be taken care of. Normally, init does that, so you need a valid zombie reaper.
Containers are not VMs. The idea of containers is small (micro) services that each serve exactly one purpose. This is mostly philosophical, but the idea is to replace a service you would be running on your machine simply with a container running that service.
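
To make the zombie point concrete: a finished child process stays in the process table as a zombie until its parent collects its exit status. A minimal sketch of what an init-like parent has to do, assuming a POSIX system (`reap_children` is a hypothetical helper, not part of any existing image):

```python
import os


def reap_children() -> list:
    """Collect all child processes that have already exited, without blocking.

    Returns the PIDs that were reaped.
    """
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

A start script running as PID 1 would typically call something like this whenever it receives `SIGCHLD`. A plain shell script that only launches services in the background with `&` does not necessarily do so for processes it inherits.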

> PHP-FPM can be used with unix sockets or tcp sockets. Here is an example: https://serversforhackers.com/c/php-fpm-configuration-the-listen-directive

That was not the main point in my statement. Of course, you can use either socket or TCP based access to the php-fpm services. The thing is that the HTTP server will not send the whole file system over the socket/network connection but only a link/filename to PHP for processing. Otherwise, an include statement would not be possible if only the to-be-parsed file were passed directly over the HTTP/PHP connection.

I ran into this issue several times recently when the file paths did not match up between the two sides.
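
The path problem can be modelled in a few lines. The web server only transmits a file path in the FastCGI parameter `SCRIPT_FILENAME`, and PHP then opens that path in its own file system. The sketch below simulates the PHP container's file system with a directory. The helper names and all paths are made up for illustration.

```python
from pathlib import Path


def fastcgi_params(document_root: str, uri: str) -> dict:
    """Build what the web server sends: only the script path, never the file itself."""
    return {"SCRIPT_FILENAME": document_root.rstrip("/") + uri}


def php_can_serve(params: dict, php_container_root: Path) -> bool:
    """Check whether the transmitted path exists inside the simulated PHP container."""
    relative = params["SCRIPT_FILENAME"].lstrip("/")
    return (php_container_root / relative).is_file()
```

If the web server's document root is, say, `/usr/share/nginx/html` while the PHP container holds the sources under `/var/www/pk/web`, the lookup fails even though both containers contain the same files. This is why both sides need the same absolute path or a shared volume.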

Forceu commented 4 years ago

Oh I see, now it makes sense what you meant by synchronising.

All your objections are very valid (especially for a dev environment), and I do see your point. In the end I think for a user version, it should be only one container, to make it easier to maintain for the user, but with the dev environment you would probably need multiple containers to have all the logging etc.

In the most recent commit to my repo I built Partkeepr with nginx. It seems a bit faster, and by using nginx I was able to reduce the image size from 800MB to around 500MB.

If I can support you in any way in regards to this issue, let me know!

Forceu commented 3 years ago

https://github.com/just-containers/s6-overlay

This would solve the issue of zombie processes and would enable docker to run everything in one container. With that you can choose if a service shall be restarted automatically or if it should stop the whole container.

dromer commented 3 years ago

I had suggested using the s6-overlay as well, but this complicates the setup somewhat and makes it depend on very specific external tools.

Anyway I think the focus should currently be on updating all our dependencies, not to try and make things run on legacy packages forever.

christianlupus commented 3 years ago

OK, my suggestion: Make one all-in-one image that should suffice for many users out of the box. A second image is mainly a PHP-FPM extension with some additional scripts. This allows more flexible setups when a database and/or HTTP server of any supported kind is already running.

Forceu commented 3 years ago

Yes, that would be a good idea. I definitely agree with @dromer as well that it would be a good idea to concentrate on updating the framework (is there anything in particular I can help with?).

For my projects I mostly moved from PHP to Go, which offers a lot of advantages compared to PHP. Maybe it would make sense to rewrite parts of Partkeepr at some point in the future to get rid of some dependencies?

partkeepr / PartKeepr

Docker multiarch support #1132
