cloudcaptainsh / cloudcaptain

Issue Tracker for CloudCaptain

Boxfuse is not starting a new AMI image in time #266

Closed metait closed 2 years ago

metait commented 2 years ago

Hi,

We have two AWS accounts. In the first, which is QA, the build comes up within the health check time limit. In the second AWS account, the Boxfuse build fails with the following error message. We usually run the Boxfuse command from Jenkins, but I am posting the CLI command that fails.

CLI run command:

/boxfuse/boxfuse run target/some-2.0.0.war -app=some -env=some-prod -healthcheck.timeout=3600 -envvars.SPRING_PROFILES_ACTIVE=prod -image=some2.0.0.15

The failure is the same with or without the healthcheck timeout set.

02636170d565b2236 -> i-02636170d565b2236 => 2022/02/25 00:46:04 Boxfuse CloudWatch Logs Agent 1.0.2 redirecting INFO logs for something/something:2.0.0.15 to CloudWatch Logs (group: boxfuse/something, stream: something/something) ...
i-02636170d565b2236 -> i-02636170d565b2236 => 2022/02/25 00:46:04 Boxfuse CloudWatch Logs Agent 1.0.2 redirecting ERROR logs for something/something:2.0.0.15 to CloudWatch Logs (group: boxfuse/something, stream: somethging/something) ...
The system is going down NOW!
i-02636170d565b2236 -> i-02636170d565b2236 => 2022/02/25 00:46:24 Exiting...
i-02636170d565b2236 -> i-02636170d565b2236 => 2022/02/25 00:46:24 Exiting...
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system reboot
i-02636170d565b2236 -> i-02636170d565b2236 => [ 24.406805] reboot: Restarting system
Destroying Cloud Watch Alarm i-02636170d565b2236-auto-recovery-alarm ...
Terminating instance i-02636170d565b2236 ...
Destroying Security Group sg-xxxxx0xxxx (boxsg-something-something-prod-something-2.0.0.15) ...
WARNING: Run failed: Time out: Payload of Instance i-02636170d565b2236 failed to come up within 300 seconds at http://someip:8081/
=> ensure your application responds with an HTTP 200 at / on port 8081
=> check the logs for i-02636170d565b2236 in something-prod
=> ensure the healthcheck configuration (healthcheck.port, healthcheck.path, healthcheck.timeout) matches your application
ERROR: Running someit/someit:2.0.0.15 failed!

There is also this message, which I assume is causing the issue:

[ 1.002114] Cannot get hvm parameter CONSOLE_EVTCHN (18): -22!

metait commented 2 years ago

testing

axelfontaine commented 2 years ago

As the error message says:

=> ensure your application responds with an HTTP 200 at / on port 8081
=> check the logs for i-02636170d565b2236 in something-prod
=> ensure the healthcheck configuration (healthcheck.port, healthcheck.path, healthcheck.timeout) matches your application

The logs of your application will hold the key as to why this is happening.

"There is also this message, which I assume is causing the issue: [ 1.002114] Cannot get hvm parameter CONSOLE_EVTCHN (18): -22!"

You can safely ignore this message.
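
For anyone landing here with the same timeout: the first hint in the error message (an HTTP 200 at / on port 8081) can be checked outside Boxfuse by polling the application yourself. Below is a minimal Python sketch of such a poll, not Boxfuse's actual health check. The function names `fetch_status` and `wait_for_http_200` and the 5-second interval are assumptions made for illustration; only the port, the path and the 300-second limit come from the log above. Note that `urlopen` follows redirects, which the real health check may or may not do.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, request_timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=request_timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404 or 500).
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout: nothing usable is listening.
        return None


def wait_for_http_200(url, timeout=300.0, interval=5.0, fetch=fetch_status,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll url until it answers 200 or no further attempt fits in timeout.

    Returns a tuple (ok, last_status); last_status is None if the last
    attempt could not reach the server at all.
    """
    deadline = clock() + timeout
    while True:
        last_status = fetch(url)
        if last_status == 200:
            return True, last_status
        if clock() + interval > deadline:
            # Another wait would overshoot the deadline, so give up here.
            return False, last_status
        sleep(interval)
```

Calling `wait_for_http_200("http://localhost:8081/")` against a locally started copy of the application (the host name is a placeholder) returns `(True, 200)` once the root path answers. A result such as `(False, 404)` or `(False, None)` suggests the path, the port or the startup itself is the problem, which is where the application logs come in.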

metait commented 2 years ago

Found a solution; it works well now.