awslabs / amazon-kinesis-producer

Amazon Kinesis Producer Library
Apache License 2.0

IrrecoverableError: Error starting child process [alpine-docker issue] #86

Open charany1 opened 7 years ago

charany1 commented 7 years ago

Version : amazon-kinesis-producer : 0.12.1

I'm running a Kinesis Producer inside a docker container and getting this error:

com.amazonaws.services.kinesis.producer.IrrecoverableError: Error starting child process
        at com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:520) [application.jar:0.1]
        at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:453) [application.jar:0.1]
        at com.amazonaws.services.kinesis.producer.Daemon.access$100(Daemon.java:62) [application.jar:0.1]
        at com.amazonaws.services.kinesis.producer.Daemon$1.run(Daemon.java:132) [application.jar:0.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_92-internal]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_92-internal]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_92-internal]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_92-internal]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92-internal]
Caused by: java.io.IOException: Cannot run program "/tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_f40412a63c9816d7b4c06e0b1f597c4f3280d36e": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) ~[na:1.8.0_92-internal]

It says that it can't find the file in "/tmp/amazon..", however I can see the file there. What might be the issue?

samuelgmartinez commented 7 years ago

I had a similar problem running a Kinesis producer daemon within a docker container built from an alpine image.

We fixed it using ubuntu as a base image for that service.

EDIT: grammar (oh god, my english is awful)

charany1 commented 7 years ago

@samuelgmartinez aha... same here, thanks a lot, will go with ubuntu.

pfifer commented 7 years ago

It appears Alpine Linux is based on musl libc and BusyBox. The kinesis_producer binary is linked against glibc, so when the loader kicks off it fails to find the correct libc.so, which causes this error message. The only way to support Alpine Linux would be to compile kinesis_producer against musl libc.
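
For anyone who wants to confirm this on their own image, here is a minimal sketch (not part of the KPL) that reads the PT_INTERP entry of a 64-bit ELF file and reports whether the loader it names exists. The class name ElfInterpreterCheck is made up for illustration, and it assumes the extracted kinesis_producer file is a 64-bit ELF executable. When the kernel cannot find the requested loader, exec fails with ENOENT, which Java reports as error=2 even though the binary itself is present.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper for illustration; not part of amazon-kinesis-producer.
public class ElfInterpreterCheck {

    private static final int PT_INTERP = 3; // program header type that holds the loader path

    /** Returns the loader path requested by a 64-bit ELF image, if it has one. */
    public static Optional<String> requestedInterpreter(byte[] elf) {
        if (elf.length < 64 || elf[0] != 0x7f || elf[1] != 'E' || elf[2] != 'L' || elf[3] != 'F') {
            return Optional.empty(); // not an ELF file
        }
        if (elf[4] != 2) {
            return Optional.empty(); // this sketch only handles 64-bit ELF
        }
        ByteBuffer buf = ByteBuffer.wrap(elf)
                .order(elf[5] == 1 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        long phoff = buf.getLong(32);              // e_phoff: start of program headers
        int phentsize = buf.getShort(54) & 0xffff; // e_phentsize: size of one entry
        int phnum = buf.getShort(56) & 0xffff;     // e_phnum: number of entries
        for (int i = 0; i < phnum; i++) {
            long ph = phoff + (long) i * phentsize;
            if (ph < 0 || ph + 56 > elf.length) {
                break; // truncated or malformed header table
            }
            if (buf.getInt((int) ph) != PT_INTERP) {
                continue;
            }
            long offset = buf.getLong((int) ph + 8); // p_offset: where the path is stored
            long size = buf.getLong((int) ph + 32);  // p_filesz: length including trailing NUL
            if (offset < 0 || size <= 0 || offset + size > elf.length) {
                break;
            }
            // Drop the trailing NUL byte when building the string.
            return Optional.of(new String(elf, (int) offset, (int) size - 1, StandardCharsets.US_ASCII));
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ElfInterpreterCheck <path-to-binary>");
            return;
        }
        Optional<String> interp = requestedInterpreter(Files.readAllBytes(Paths.get(args[0])));
        if (!interp.isPresent()) {
            System.out.println("No PT_INTERP entry found (static binary, 32-bit, or not ELF)");
        } else {
            boolean present = Files.exists(Paths.get(interp.get()));
            System.out.println(interp.get() + (present ? " exists" : " is MISSING"));
        }
    }
}
```

Run it inside the container against the file under /tmp/amazon-kinesis-producer-native-binaries. If it prints a loader path followed by "is MISSING", the image lacks the loader the binary was built for.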

BradErz commented 7 years ago

@pfifer is there any possibility of getting this fixed? alpine linux is much more docker friendly compared to any ubuntu images.

L3O commented 7 years ago

Any updates on this?

alphafoobar commented 6 years ago

I'm seeing something that looks very similar to this issue using amazon-kinesis-producer : 0.12.8 on the openjdk-alpine docker image

com.amazonaws.services.kinesis.producer.IrrecoverableError: Error starting child process
        at com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:525) [amazon-kinesis-producer-0.12.8.jar:na]
        at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:456) [amazon-kinesis-producer-0.12.8.jar:na]
        at com.amazonaws.services.kinesis.producer.Daemon.access$100(Daemon.java:63) [amazon-kinesis-producer-0.12.8.jar:na]
        at com.amazonaws.services.kinesis.producer.Daemon$1.run(Daemon.java:133) [amazon-kinesis-producer-0.12.8.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_151]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_151]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
Caused by: java.io.IOException: Cannot run program "/tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_0529847a6647765630c02beeaf6a22c24858f873": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) ~[na:1.8.0_151]
        at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:454) [amazon-kinesis-producer-0.12.8.jar:na]
        ... 5 common frames omitted
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method) ~[na:1.8.0_151]
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:247) ~[na:1.8.0_151]
        at java.lang.ProcessImpl.start(ProcessImpl.java:134) ~[na:1.8.0_151]
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ~[na:1.8.0_151]
        ... 6 common frames omitted

pfifer commented 6 years ago

The stack trace you provided is what you would expect to see when a linked library is missing. The "No such file or directory" error comes from the inability to load glibc.

jongsy commented 6 years ago

I changed the base docker container to frolvlad/alpine-oraclejdk8 which has glibc installed.

anirudh1800 commented 6 years ago

Facing the same issue with tomcat:8.0.52-jre8-alpine image.

alphafoobar commented 6 years ago

You have a few options...

  1. Continue to use an Alpine Linux based openjdk image and add glibc. We built our own, which is available on Docker Hub (https://hub.docker.com/r/bncprojects/openjdk/) and on GitHub (https://github.com/bnc-projects/base-openjdk).
  2. Use a different openjdk Linux Docker image, such as slim, that already has glibc.

simon-katz commented 6 years ago

I got around this by switching from alpine to frolvlad/alpine-glibc.

cgpassante commented 5 years ago

I tried a bunch of images and selected amazoncorretto:8u212. All exhibited the same behavior: Create a producer, send a record and then DaemonException, recreate the producer and send a record and it works. May not be related to image. Could be some sort of race condition or startup error.

matthewpick commented 4 years ago

I ended up using frolvlad/alpine-java:jre8-slim to get around this issue.

sankarnadendla commented 4 years ago

I'm seeing something that looks very similar to this issue using amazon-kinesis-producer : 0.12.8 on the openjdk-alpine docker image

I have got the exact same error. Could you please tell me how to resolve this? Thanks in advance.

[kpl-daemon-0000] com.amazonaws.services.kinesis.producer.KinesisProducer - Error in child process
com.amazonaws.services.kinesis.producer.IrrecoverableError: Error starting child process
        at com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:525) [amazon-kinesis-producer-0.12.7.jar!/:?]
        at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:456) [amazon-kinesis-producer-0.12.7.jar!/:?]
        at com.amazonaws.services.kinesis.producer.Daemon.access$100(Daemon.java:63) [amazon-kinesis-producer-0.12.7.jar!/:?]
        at com.amazonaws.services.kinesis.producer.Daemon$1.run(Daemon.java:133) [amazon-kinesis-producer-0.12.7.jar!/:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: java.io.IOException: Cannot run program "/tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_3c24635e636b1b743e5193137a9682a793727d6b": error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) ~[?:1.8.0_242]
        at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:454) ~[amazon-kinesis-producer-0.12.7.jar!/:?]
        ... 5 more
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method) ~[?:1.8.0_242]
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:247) ~[?:1.8.0_242]
        at java.lang.ProcessImpl.start(ProcessImpl.java:134) ~[?:1.8.0_242]
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ~[?:1.8.0_242]
        at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:454) ~[amazon-kinesis-producer-0.12.7.jar!/:?]
        ... 5 more

bcmedeiros commented 3 years ago

Same issue here with an image derived from azul/zulu-openjdk-alpine:11.0.9:

2021-02-23 03:05:17.302000 +0000 [kpl-daemon-0000] ERROR com.amazonaws.services.kinesis.producer.KinesisProducer - Error in child process
com.amazonaws.services.kinesis.producer.IrrecoverableError: Error starting child process
        at com.amazonaws.services.kinesis.producer.Daemon.fatalError(Daemon.java:536)
        at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:467)
        at com.amazonaws.services.kinesis.producer.Daemon.access$100(Daemon.java:61)
        at com.amazonaws.services.kinesis.producer.Daemon$1.run(Daemon.java:130)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Cannot run program "/tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_3F8F88A118FB71C891E47599DBE35C90AEB271FF": error=2, No such file or directory
        at java.base/java.lang.ProcessBuilder.start(Unknown Source)
        at java.base/java.lang.ProcessBuilder.start(Unknown Source)
        at com.amazonaws.services.kinesis.producer.Daemon.startChildProcess(Daemon.java:465)
        ... 5 more
Caused by: java.io.IOException: error=2, No such file or directory
        at java.base/java.lang.ProcessImpl.forkAndExec(Native Method)
        at java.base/java.lang.ProcessImpl.<init>(Unknown Source)
        at java.base/java.lang.ProcessImpl.start(Unknown Source)
        ... 8 more

Funnily enough, adoptopenjdk/openjdk11:jre-11.0.9_11.1-alpine works fine; I'm not sure why.

KidCrippler commented 3 years ago

Any way to get around this with java 11 and amazoncorretto? It's pretty ironic that of all implementations, the amazon one doesn't gel with kinesis. I'm building the docker from amazoncorretto:11 and running from amazoncorretto:11-alpine (maybe there's a different jre better suited for this?)

alphafoobar commented 3 years ago

The problem isn't Corretto, the problem is Alpine. It doesn't have glibc, which isn't a dependency for Java but is a dependency for the Amazon Kinesis native library. You'd be better off using an Ubuntu slim Docker image.
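
If you want the application to fail with a clearer message than error=2, a startup check along these lines could help. This is only a sketch: the class name GlibcPreflight and the loader paths are assumptions for x86_64, and a compatibility shim such as gcompat may also provide the loader path, so a positive result does not prove that real glibc is installed.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical startup check for illustration; not part of amazon-kinesis-producer.
public class GlibcPreflight {

    // Usual glibc dynamic loader locations on x86_64 (assumption: other
    // architectures use different file names and are not covered here).
    private static final String[] GLIBC_LOADERS = {
        "lib64/ld-linux-x86-64.so.2",
        "lib/ld-linux-x86-64.so.2"
    };

    /** True if any of the assumed glibc loader paths exists under the given root. */
    public static boolean hasGlibcLoader(Path root) {
        for (String loader : GLIBC_LOADERS) {
            if (Files.exists(root.resolve(loader))) {
                return true;
            }
        }
        return false;
    }

    /** Alpine images ship an /etc/alpine-release file. */
    public static boolean looksLikeAlpine(Path root) {
        return Files.exists(root.resolve("etc/alpine-release"));
    }

    /** Returns null when nothing looks wrong, otherwise a human-readable hint. */
    public static String diagnose(Path root) {
        if (hasGlibcLoader(root)) {
            return null;
        }
        return looksLikeAlpine(root)
                ? "Alpine (musl) image without a glibc loader; the KPL native binary will likely fail to start"
                : "No glibc loader found; the KPL native binary will likely fail to start";
    }

    public static void main(String[] args) {
        String problem = diagnose(Paths.get("/"));
        if (problem != null) {
            throw new IllegalStateException(problem);
        }
        System.out.println("glibc loader found");
    }
}
```

Calling diagnose(Paths.get("/")) before creating the producer turns the confusing "No such file or directory" into a message that names the likely cause.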

jaysooo commented 2 years ago

In my case, I used the azul/zulu-openjdk-centos base image to solve the problem. I think I spent two days on this problem... OTL

Nishant-Pathak commented 7 months ago

Is this issue fixed? We have a hard dependency on the alpine image.

Nishant-Pathak commented 7 months ago

We fixed it by adding the lines below to our docker build of the alpine base image:

# Adds a third-party apk repository (x86_64 only) and installs glibc next to musl.
RUN echo 'https://storage.sev.monster/alpine/edge/testing' | tee -a /etc/apk/repositories \
    && wget https://storage.sev.monster/alpine/edge/testing/x86_64/sevmonster-keys-1-r0.apk \
    && apk add --allow-untrusted ./sevmonster-keys-1-r0.apk \
    && apk update \
    && apk add gcompat \
    && rm /lib/ld-linux-x86-64.so.2 \
    && apk add --force-overwrite glibc \
    && apk add glibc-bin

lhotari commented 1 month ago

Mixing real glibc in Alpine will result in an unstable environment.

pratikdandavate commented 4 days ago

I was using amazoncorretto:17-alpine earlier and used to get a Java out-of-memory error for heap space. I solved it by changing the image to openjdk:17-jdk-slim, and it worked for me.