Closed unkcpz closed 2 years ago
@unkcpz do you plan to finish this PR?
I think we can leave this open or change it to an open issue. If I understand correctly, @yakutovicha mentioned in the aiidalab meeting that he plans to use a more widely used base image instead of phusion.
@yakutovicha could you elaborate on the plan and write down more details about why we needed phusion in the first place and why it is now possible to replace it?
The original reason for using the phusion baseimage was that it provides an init service which is designed for the use case we have (running multiple processes inside a docker container) and can in principle prevent containers from becoming filled with zombie processes. However, zombie processes arise from bugs in application code, and to my knowledge we have never actually encountered the problem ourselves before switching (correct me if I'm wrong).
It also runs the cron daemon by default (not sure whether we're taking advantage of this).
I personally don't have a strong opinion in either direction - we should just take a decision.
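Whether zombie processes actually accumulate in a given container can be checked directly. The following is a minimal sketch, assuming a procps-style `ps` is available inside the container; the helper name `count_zombies` is hypothetical and not part of any of the images discussed here.

```shell
# Count zombie processes from a list of process states, one per line
# (the format printed by `ps -eo stat=`). A state starting with "Z"
# marks a defunct (zombie) process.
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Usage inside a running container (assumes a procps-style `ps`):
#   ps -eo stat= | count_zombies
```

A count that stays at 0 over a long-running session would support the impression that the init service is not strictly needed for this use case.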
Just mentioning that we may soon be preparing a small update of aiida-prerequisites in order to fix the rabbitmq configuration.
It would be good if you could get this sorted and merged before then, so that the update has arm support as well.
Hi @ltalirz - I've been thinking for quite a while about a better approach to handle containers. I came up with this: https://github.com/aiidalab/aiidalab-docker-stack/issues/243.
In principle, I am fully committed to it now. If I get positive feedback and a green light from the others, I can replace the thing pretty quickly. As a result, there won't be a prerequisites container anymore. Let me know if you like the idea.
@unkcpz the Ubuntu 22.04 based image was released 20 days ago, so let's try to finalise this PR. Would you have time to make the remaining changes?
@yakutovicha Thanks for the heads-up. No problem, I think there are no blockers or issues with this implementation. I will test it on my local machine again, rebase the commits and let the CI build test run once more.
I rebased the PR and updated the miniconda version. The image builds, and so do the aiida-core v1.6.8 container and then aiidalab-docker-stack on top of it. But when I launch it I get the errors below, and I have no idea how to fix this. The CI build test also failed, see https://github.com/aiidateam/aiida-prerequisites/runs/6789039113?check_suite_focus=true. I removed the line RUN touch /opt/conda/pkgs/urls.txt. Not sure if this causes the issue? The rabbitmq inside the aiida-prerequisites container is not installed by conda.
*** Running /etc/my_init.d/10_syslog-ng.init...
[2022-06-08T09:05:50.693268] WARNING: Configuration file format is too old, syslog-ng is running in compatibility mode. Please update it to use the syslog-ng 3.35 format at your time of convenience. To upgrade the configuration, please review the warnings about incompatible changes printed by syslog-ng, and once completed change the @version header at the top of the configuration file; config-version='3.25'
[2022-06-08T09:05:50.787845] WARNING: The internal_queue_length stat counter has been renamed to internal_source.queued. The old name will be removed in future versions; config-version='3.25'
Jun 8 09:05:50 07d54204d21c syslog-ng[476]: syslog-ng starting up; version='3.35.1'
*** Running /etc/my_init.d/20_start-rabbitmq.sh...
* Starting RabbitMQ Messaging Server rabbitmq-server
* FAILED - check /var/log/rabbitmq/startup_\{log, _err\}
...fail!
*** /etc/my_init.d/20_start-rabbitmq.sh failed with status 1
*** Killing all processes...
Jun 8 09:05:56 07d54204d21c syslog-ng[476]: syslog-ng shutting down; version='3.35.1'
Hi @unkcpz , did you have a look inside the rabbitmq log files pointed to by the error message?
See e.g. https://stackoverflow.com/a/65954148/1069467 on how to do this.
It is a segmentation fault error in /var/log/rabbitmq/startup_err.
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
I suspect this might be the issue with the new phusion base image.
EDIT: I messed up the architecture (a typo between arm/amd in buildx), will check it again.
Just to provide some context: qemu is the component docker uses to run intel/amd64 containers on arm chips
To me this error message would seem to indicate you ran the amd64 image on the M1 Mac.
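A quick way to rule out this kind of mismatch is to compare the host architecture with the architecture recorded in the image. The sketch below is only an illustration: the helper name `docker_arch` is hypothetical, and `IMAGE` is a placeholder for whichever image is being tested.

```shell
# Map the machine name reported by `uname -m` onto Docker's
# architecture naming (amd64, arm64); unknown names pass through.
docker_arch() {
    case "$1" in
        x86_64|amd64)   echo amd64 ;;
        aarch64|arm64)  echo arm64 ;;
        *)              echo "$1" ;;
    esac
}

# Usage (IMAGE is a placeholder for the image to check):
#   host=$(docker_arch "$(uname -m)")
#   image=$(docker image inspect --format '{{.Architecture}}' IMAGE)
#   [ "$host" = "$image" ] || echo "mismatch: host=$host image=$image"
```

If the two values differ, the container runs under emulation, which is where qemu comes into play.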
@ltalirz thanks! you are absolutely right about it.
I corrected the arch and rebuilt, but it failed with a new issue with the GCC compiler on aarch64 for a couple of libraries that need to be compiled (ruamel.yaml, pymatgen ...): gcc: error: unrecognized command-line option '-n1'; did you mean '-n'? I replaced GCC with a different version and also tried the GCC installed by conda, but neither works. There is also not much about this issue online.
Hi @unkcpz, to my knowledge there is no gcc option -n1, i.e. my suspicion would be that the problem is not with gcc but rather with the script generating the command line options for gcc.
Is there a minimum example to reproduce this? Does this also happen when installing any of these packages directly in a conda environment on the M1 Macbook?
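One way to test the suspicion that the flag is generated upstream of gcc is to look at the compiler settings that Python itself hands to extension builds. This is a rough sketch, assuming that `python3` is the interpreter pip uses and that the build takes its compiler command from `sysconfig`; the helper `has_flag` is hypothetical.

```shell
# Return success if the exact word $1 appears in the
# whitespace-separated flag string $2.
has_flag() {
    for word in $2; do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}

# Usage (assumes python3 is the interpreter that pip uses):
#   flags=$(python3 -c 'import sysconfig; print(sysconfig.get_config_var("CC"), sysconfig.get_config_var("CFLAGS"))')
#   has_flag -n1 "$flags" && echo "-n1 comes from Python's build configuration"
```

If the flag shows up there, the Python build shipped in the environment would be the place to look, rather than gcc.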
Is there a minimum example to reproduce this?
Yes, simply start the container I prepared
docker run -it jusong/aiida-prerequisites:arm64-02 /bin/bash
and run
pip install ruamel.yaml
Does this also happen when installing any of these packages directly in a conda environment on the M1 Macbook?
On the Macbook it is all fine. The container's architecture is linux/arm64. I suspect this is a problem with the base image, since I could previously launch aiidalab based on this without any problem. The only change from where I paused last time is the base image (and some libraries installed by apt, which may also cause the issue). I also updated the miniconda version; I need to double-check that.
I can reproduce the issue, thanks. I'll think a bit about how to figure this one out.
By the way, do these packages have to be installed via pip? conda install -c conda-forge ruamel.yaml works fine.
One way to fix the problem (without figuring out where it came from): conda install python=3.9.13.
Now, pip install ruamel.yaml works fine.
I guess you can take it from here
Thanks a lot! Yes, I rolled back to the old version of Miniconda and it all works fine. I think I will just keep it, and we can adapt to the new miniconda version in another PR, one small step at a time.
By the way, do these packages have to be installed via pip? conda install -c conda-forge ruamel.yaml works fine.
I tried doing so, but then there are still a lot of packages that need to be compiled with GCC.
I tested it as a base image for aiidalab-docker-stack and it works fine, except that openbabel has no aarch64 build on conda-forge. I opened an issue for it at https://github.com/conda-forge/openbabel-feedstock/issues/27.
@yakutovicha if we want to use this for aiidalab, we also need to update aiidalab install, since the pip version has been updated, which leads to the error option --use-feature: invalid choice: 'in-tree-build' (choose from '2020-resolver', 'fast-deps').
Moreover, I changed how the permissions of the /opt/conda folder are handled. Previously, RUN touch /opt/conda/pkgs/urls.txt was used to allow the aiida user to install into this folder, but I think it makes more sense to make aiida the owner, so that the aiida user takes over this folder.
touch /opt/conda/pkgs/urls.txt can be one way to solve the issue https://github.com/conda/conda/issues/7267. But @mbercx and I have encountered the problem that we cannot pip install in editable mode because write permission to this folder is denied.
I also tested pip install -e with aiidalab-qe; the read-only exception (https://github.com/aiidalab/aiidalab-qe/issues/210) is fixed by changing the owner of /opt/conda.
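Whether the permission problem is present in a given image can be checked from inside the container. The following is a small sketch; the helper name `check_writable` is hypothetical, and the `aiida:aiida` user and group pair in the comment is an assumption about how the image names them.

```shell
# Report whether the current user can write to directory $1.
check_writable() {
    if [ -w "$1" ]; then
        echo "writable: $1"
    else
        echo "read-only: $1"
    fi
}

# Usage inside the container, as the aiida user:
#   check_writable /opt/conda
# An ownership change in the Dockerfile could look like this
# (the aiida:aiida user and group names are an assumption):
#   RUN chown -R aiida:aiida /opt/conda
```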
Hi @yakutovicha, is there anything more to change? Please feel free to approve and merge this.
Hi @yakutovicha, are you going to make a release with this change soon?
yes, making it in #40