31z4 / zookeeper-docker

Docker image packaging for Apache Zookeeper
MIT License

Update to eclipse-temurin broke the images #139

Closed gjhommersom closed 1 year ago

gjhommersom commented 2 years ago

We are using zookeeper:3.8.0, and for the past few days the containers have been failing on startup with the following error:

/opt/java/openjdk/bin/java
ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
[0.004s][warning][os,thread] Failed to start thread "GC Thread#0" - pthread_create failed (EPERM) for attributes: stacksize: 1024k, guardsize: 4k, detached.
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create worker GC thread. Out of system resources.
# An error report file with more information is saved as:
# /apache-zookeeper-3.8.0-bin/hs_err_pid1.log

Even running docker exec -it zookeeper sh and then executing java without any other arguments causes this error.

This behavior is also present in version 3.7.1 but not in version 3.7.0. This makes it very likely that the issue was introduced by this commit: https://github.com/31z4/zookeeper-docker/commit/5cf119d9c5d61024fdba66f7be707413513a8b0d.

jonashartwig commented 2 years ago

Hi, I have been seeing the same problem for a few days! Maybe images with version tags should not be replaced? Maybe use a new tag, or latest, etc.?

31z4 commented 2 years ago

Hi guys. I'm sorry to hear that the update of the base image caused memory issues. Unfortunately, I could not reproduce it locally. All image tags are running fine on aarch64. Also, a trivial test had passed for all tags before the upgrade was merged. See this action for example.

That said, could you share more details about your environment, such as system architecture, environment variables, and Zookeeper config? Options related to Java runtime memory, if any (e.g., -Xmx, -XX:PermSize and others), would be especially useful.

alex-xage commented 2 years ago

@gjhommersom @jonashartwig it is most likely this issue (https://github.com/adoptium/containers/issues/215#issuecomment-1142046045). To resolve it you need to update your docker-engine version to be >= 20.10.10. For additional references see:

@31z4 I understand there's not an easy way to version Dockerfiles, but generally when people pin a specific version tag (such as 3.8.0) they expect that the image will not be changed unless absolutely necessary. Unfortunately, because docker image versions are just tags, the owner can change the underlying image for a tag whenever they choose to. To prevent issues like this, it might help to keep changes under the latest tag for a while so they can be thoroughly tested before moving them to less specific version tags such as 3 or 3.8, and finally, once there is a very high level of confidence, pushing them to specific versions like 3.8.0.
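The staged rollout suggested above can be sketched in a few lines of Python. This is only an illustration of the ordering (latest first, then 3, then 3.8, then 3.8.0); the function name `promotion_order` is hypothetical and is not part of this repository or of any Docker tooling.

```python
def promotion_order(version):
    """Return tags from least to most specific for a release such as '3.8.0'.

    Hypothetical helper: a change would be published under each tag in turn,
    moving to the next one only after the previous stage has proven stable.
    """
    parts = version.split(".")
    if not parts or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected a dotted numeric version, got {version!r}")
    # Build '3', '3.8', '3.8.0' by joining progressively longer prefixes.
    prefixes = [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return ["latest"] + prefixes


print(promotion_order("3.8.0"))  # ['latest', '3', '3.8', '3.8.0']
```

The fully pinned tag comes last, so users who pinned 3.8.0 would be the last to see a change to the base image.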

gjhommersom commented 2 years ago

Updating the docker installation resolves the issue. Thanks for finding and reporting it @alex-xage.
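For anyone checking whether a host is affected, the 20.10.10 threshold mentioned above can be tested with a short Python sketch. The helper names are hypothetical. The sketch assumes the `docker` CLI is on the PATH and that `docker version --format '{{.Server.Version}}'` prints a dotted version string, possibly followed by a vendor suffix.

```python
import re
import subprocess

MIN_ENGINE = (20, 10, 10)  # threshold quoted in this thread


def parse_version(text):
    """Turn '20.10.7' or '20.10.7+vendor' into a tuple of ints (hypothetical helper)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def engine_is_new_enough(version_text, minimum=MIN_ENGINE):
    """Return True when the engine version is at least the given minimum."""
    return parse_version(version_text) >= minimum


def installed_engine_version():
    """Ask the local docker CLI for the server (engine) version."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Server.Version}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

On a machine with Docker installed, `engine_is_new_enough(installed_engine_version())` returning `False` would suggest the host needs the upgrade described above.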

My two cents: Perhaps a solution would have been to introduce new tags such as 3.8.0-eclipse-temurin. This would have made it possible to start using it without changing existing tags. I do like that approach for images that provide multiple options for base image such as debian or alpine.

P.S. I totally understand that this kind of issue is easily missed when testing, and that republishing the tag therefore wasn't considered harmful.

Edit: Just as I was writing this, you already did it. Thanks 👍.

31z4 commented 2 years ago

@alex-xage thanks a lot for your help with resolving the issue! I completely agree with your and @gjhommersom's point about a better way of tagging the image. It seems I underestimated the chance of breaking the image relative to the need to migrate from openjdk to eclipse-temurin.

I decided to roll back all version-specific tags to openjdk and introduce separate tags for eclipse-temurin. See https://github.com/docker-library/official-images/pull/12949.

Thanks again for your feedback and help, guys.

jonashartwig commented 2 years ago

For completeness: we use the defaults, with no changes to zookeeper or its confs, on redhat 8.4.

Regards

alex-xage commented 2 years ago

@31z4 No worries, we appreciate that you maintain these images for the community.

sean-hernon commented 1 year ago

I've noticed that all of the Dockerfiles in the project now extend from the eclipse-temurin one and that digests on docker hub show that they are the same. I'm getting the above error. Will there be no openjdk-based images going forward?

I've noticed that -openjdk tagged images are on dockerhub, but there are no such Dockerfiles in the repo, so presumably I should not switch to these, as they are deprecated?

Thank you.

31z4 commented 1 year ago

Yes, openjdk based images are completely deprecated and will not receive any updates moving forward. Also see https://github.com/docker-library/official-images/pull/12949#issuecomment-1383656252

sean-hernon commented 1 year ago

> Yes, openjdk based images are completely deprecated and will not receive any updates moving forward. Also see docker-library/official-images#12949 (comment)

Thank you, @31z4