nanovms / ops

ops - build and run nanos unikernels
https://ops.city
MIT License

Sudden issue with NodeJs Application (AWS) : terminate called after throwing an instance of 'std::bad_alloc' #1483

Closed JonathonJulian closed 1 year ago

JonathonJulian commented 1 year ago

Hello, we have a recent issue where a NodeJS application that was previously working fine is now throwing this error:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
signal 6 (no core generated: limit 0)

I have ruled out the following:

  1. The Ops version. This seems to happen across versions, including the code on the master branch. I am currently testing with Ops version 0.1.37 and Nanos version 0.1.45.

  2. The application. I am getting identical behaviour with this test application, which was also previously working fine.

  3. Memory resources. The test app is lightweight, and I have tried increasing the memory allocation on the actual app; the error remains.

Running the image locally with ops pkg load -c config.ops eyberg/node:v18.12.1 works as expected. I am generating the AMI with ops image create -t aws -c config.ops --package eyberg/node:v18.12.1
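For context, a minimal config.ops for a deployment like this might look as follows; this is a sketch, and the memory size, zone, and bucket name are illustrative assumptions rather than values from this thread:

```json
{
  "RunConfig": {
    "Memory": "2G"
  },
  "CloudConfig": {
    "Zone": "us-west-2a",
    "BucketName": "my-ami-bucket"
  }
}
```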

It fails with NightlyBuild set to both true and false.

Any assistance in isolating this issue would be greatly appreciated.

eyberg commented 1 year ago

ok, so it looks like there are 2 issues here:

1) the NightlyBuild flag is currently being ignored by pkg load (though -n toggles it correctly); you can verify this via -v --show-debug and see which kernel it is building with

2) the issue you are running into I see on 0.1.45 but not on nightly - can you try ops pkg load -c config.ops eyberg/node:v18.12.1 -n (with the CLI flag, not the option inside the config)?

JonathonJulian commented 1 year ago

I will try, but just to be clear: it's the AMI that's failing; pkg load works as expected.

JonathonJulian commented 1 year ago

ok, ops image create -t aws -c config.ops --package eyberg/node:v18.12.1 -n worked for the test app; I'm trying the real app now. Any ideas why this would suddenly occur without any changes being made on our end?

JonathonJulian commented 1 year ago

I also noticed that stdout seems to stop working in this case. Is the serial console off by default with the latest version?

eyberg commented 1 year ago

https://github.com/nanovms/ops/pull/1484/files fixes the first issue I noted - it was unconditionally overriding the nightly variable.

The nightly flag you were using in your config wasn't actually being toggled correctly; that PR fixes it.

As for serial output: no, it's not off by default, although we did discuss disabling it when syslog is enabled: https://github.com/nanovms/ops/compare/master...disableSerialVGA

On AWS it depends on the instance type. On Nitro systems like your c6a's, you actually have to be connected to the console to see the output - that is, if the instance sends a bunch of output and you connect afterwards, you won't see it. For this particular target/instance type it's best to ship logs to a syslog server or something like that.
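One way to ship logs off the instance with nanos is a syslog klib. The fragment below is a sketch, assuming the Klibs and ManifestPassthrough config keys and a syslog server reachable at 10.0.0.1 (that address is a placeholder, not from this thread):

```json
{
  "Klibs": ["syslog"],
  "ManifestPassthrough": {
    "syslog": {
      "server": "10.0.0.1"
    }
  }
}
```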

JonathonJulian commented 1 year ago

@eyberg is it possible to set a specific nanos version in the ops config? Nailing down both versions would, I think, prevent this from happening again.

francescolavra commented 1 year ago

Setting the nanos version in the ops configuration file is currently not supported, but you can set it on the ops command line with the --nanos-version option. Example: ops image create -t aws -c config.ops --package eyberg/node:v18.12.1 --nanos-version 0.1.44
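Since the version can't live in config.ops, one way to keep builds reproducible is a small wrapper script that always passes the pinned version. This is an illustrative sketch, not part of ops itself; the version value and echo-before-run pattern are assumptions:

```shell
#!/bin/sh
# Hypothetical build wrapper: pin the Nanos kernel version so every AMI
# build uses the same kernel. Version and package below are examples.
NANOS_VERSION="0.1.44"

# Assemble the command from the thread, with the pinned kernel version.
set -- ops image create -t aws -c config.ops \
    --package eyberg/node:v18.12.1 \
    --nanos-version "$NANOS_VERSION"

echo "building with: $*"
# Uncomment to actually build (requires ops installed and AWS credentials):
# "$@"
```

Checking the pinned version into source control alongside config.ops means a kernel upgrade becomes an explicit, reviewable change rather than a surprise.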

JonathonJulian commented 1 year ago

that works, thanks!