aws-greengrass / aws-greengrass-nucleus

The Greengrass nucleus component provides functionality for device side orchestration of deployments and lifecycle management for execution of Greengrass components and applications. This includes features such as starting, stopping, and monitoring execution of components and apps, interprocess communication server for communication between components, component installation and configuration management.
Apache License 2.0
109 stars 45 forks

(Greengrass): Greengrass keeps on crashing java fatal error #1622

Closed Zedstron closed 5 months ago

Zedstron commented 5 months ago

Description After installing Greengrass, my root folder / contains several Java hs_err_pid*.log files, indicating that it is crashing again and again. Running the command "sudo journalctl -u greengrass" shows the following output, which repeats for every Greengrass retry.

-- Logs begin at Sun 2024-05-19 04:34:55 EDT. --
May 19 05:57:51 sdpl-adamg sh[20951]: # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to //core.20981)
May 19 05:57:51 sdpl-adamg sh[20951]: #
May 19 05:57:51 sdpl-adamg sh[20951]: # An error report file with more information is saved as:
May 19 05:57:51 sdpl-adamg sh[20951]: # //hs_err_pid20981.log
May 19 05:57:51 sdpl-adamg sh[20951]: #
May 19 05:57:51 sdpl-adamg sh[20951]: # If you would like to submit a bug report, please visit:
May 19 05:57:51 sdpl-adamg sh[20951]: #   https://github.com/corretto/corretto-11/issues/
May 19 05:57:51 sdpl-adamg sh[20951]: # The crash happened outside the Java Virtual Machine in native code.
May 19 05:57:51 sdpl-adamg sh[20951]: # See problematic frame for where to report the bug.
May 19 05:57:51 sdpl-adamg sh[20951]: #
May 19 05:58:46 sdpl-adamg sh[20951]: Nucleus exit at code: 134
May 19 05:58:46 sdpl-adamg sh[20951]: Nucleus exited 134. Retrying 1 times
May 19 05:58:46 sdpl-adamg sh[20951]: Java executable: java
May 19 05:58:46 sdpl-adamg sh[20951]: JVM options: -Dlog.store=FILE -Droot=/greengrass/v2
May 19 05:58:46 sdpl-adamg sh[20951]: Nucleus options: --setup-system-service false
May 19 05:58:49 sdpl-adamg sh[20951]: Launching Nucleus...
May 19 05:58:49 sdpl-adamg sh[20951]: #
May 19 05:58:49 sdpl-adamg sh[20951]: # A fatal error has been detected by the Java Runtime Environment:
May 19 05:58:49 sdpl-adamg sh[20951]: #
May 19 05:58:49 sdpl-adamg sh[20951]: #  SIGILL (0x4) at pc=0x0000007eefb7f2a0, pid=21169, tid=21170
May 19 05:58:49 sdpl-adamg sh[20951]: #
May 19 05:58:49 sdpl-adamg sh[20951]: # JRE version: OpenJDK Runtime Environment Corretto-11.0.23.9.1 (11.0.23+9) (build 11.0.23+9-LTS)
May 19 05:58:49 sdpl-adamg sh[20951]: # Java VM: OpenJDK 64-Bit Server VM Corretto-11.0.23.9.1 (11.0.23+9-LTS, mixed mode, tiered, compressed oops, g1 gc, linux-aarch64)
May 19 05:58:49 sdpl-adamg sh[20951]: # Problematic frame:
May 19 05:58:49 sdpl-adamg sh[20951]: # C  [AWSCRT_4209959580065836287libaws-crt-jni.so+0x2352a0]
May 19 05:58:49 sdpl-adamg sh[20951]: #
May 19 05:58:49 sdpl-adamg sh[20951]: # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to //core.21169)
May 19 05:58:49 sdpl-adamg sh[20951]: #
May 19 05:58:49 sdpl-adamg sh[20951]: # An error report file with more information is saved as:
May 19 05:58:49 sdpl-adamg sh[20951]: # //hs_err_pid21169.log
May 19 05:58:49 sdpl-adamg sh[20951]: #
May 19 05:58:49 sdpl-adamg sh[20951]: # If you would like to submit a bug report, please visit:
May 19 05:58:49 sdpl-adamg sh[20951]: #   https://github.com/corretto/corretto-11/issues/
May 19 05:58:49 sdpl-adamg sh[20951]: # The crash happened outside the Java Virtual Machine in native code.
May 19 05:58:49 sdpl-adamg sh[20951]: # See problematic frame for where to report the bug.
May 19 05:58:49 sdpl-adamg sh[20951]: #
May 19 05:58:50 sdpl-adamg sh[20951]: Nucleus exit at code: 134
May 19 05:58:50 sdpl-adamg sh[20951]: Nucleus exited 134. Retrying 2 times
May 19 05:58:50 sdpl-adamg sh[20951]: Java executable: java
May 19 05:58:50 sdpl-adamg sh[20951]: JVM options: -Dlog.store=FILE -Droot=/greengrass/v2
May 19 05:58:50 sdpl-adamg sh[20951]: Nucleus options: --setup-system-service false
May 19 05:58:53 sdpl-adamg sh[20951]: Launching Nucleus...
May 19 05:58:53 sdpl-adamg sh[20951]: #
May 19 05:58:53 sdpl-adamg sh[20951]: # A fatal error has been detected by the Java Runtime Environment:
May 19 05:58:53 sdpl-adamg sh[20951]: #
May 19 05:58:53 sdpl-adamg sh[20951]: #  SIGILL (0x4) at pc=0x0000007f25d7c2a0, pid=21220, tid=21221
May 19 05:58:53 sdpl-adamg sh[20951]: #
May 19 05:58:53 sdpl-adamg sh[20951]: # JRE version: OpenJDK Runtime Environment Corretto-11.0.23.9.1 (11.0.23+9) (build 11.0.23+9-LTS)
May 19 05:58:53 sdpl-adamg sh[20951]: # Java VM: OpenJDK 64-Bit Server VM Corretto-11.0.23.9.1 (11.0.23+9-LTS, mixed mode, tiered, compressed oops, g1 gc, linux-aarch64)
May 19 05:58:53 sdpl-adamg sh[20951]: # Problematic frame:
May 19 05:58:53 sdpl-adamg sh[20951]: # C  [AWSCRT_11700558766883439853libaws-crt-jni.so+0x2352a0]
May 19 05:58:53 sdpl-adamg sh[20951]: #
May 19 05:58:53 sdpl-adamg sh[20951]: # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to //core.21220)
May 19 05:58:53 sdpl-adamg sh[20951]: #
May 19 05:58:53 sdpl-adamg sh[20951]: # An error report file with more information is saved as:
May 19 05:58:53 sdpl-adamg sh[20951]: # //hs_err_pid21220.log
May 19 05:58:53 sdpl-adamg sh[20951]: #
May 19 05:58:53 sdpl-adamg sh[20951]: # If you would like to submit a bug report, please visit:
May 19 05:58:53 sdpl-adamg sh[20951]: #   https://github.com/corretto/corretto-11/issues/
May 19 05:58:53 sdpl-adamg sh[20951]: # The crash happened outside the Java Virtual Machine in native code.
May 19 05:58:53 sdpl-adamg sh[20951]: # See problematic frame for where to report the bug.
May 19 05:58:53 sdpl-adamg sh[20951]: #
May 19 05:58:53 sdpl-adamg sh[20951]: Nucleus exit at code: 134
May 19 05:58:53 sdpl-adamg sh[20951]: Nucleus exited 134. Retrying 3 times
May 19 05:58:53 sdpl-adamg systemd[1]: greengrass.service: Main process exited, code=exited, status=134/n/a
May 19 05:58:53 sdpl-adamg systemd[1]: greengrass.service: Failed with result 'exit-code'.
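When a log like the one above repeats, it can help to confirm that every retry faults at the same place. The following is a minimal sketch, not part of Greengrass or Corretto tooling: the function names `summarize_crashes` and `same_fault_site` are hypothetical, and the regular expressions assume the JVM fatal-error banner looks exactly like the lines in this log.

```python
import re

# Matches e.g. "#  SIGILL (0x4) at pc=0x0000007eefb7f2a0, pid=21169, tid=21170"
SIGNAL_RE = re.compile(
    r"#\s+(SIG[A-Z]+) \((0x[0-9a-f]+)\) at pc=(0x[0-9a-f]+), pid=(\d+), tid=(\d+)"
)
# Matches e.g. "# C  [AWSCRT_4209959580065836287libaws-crt-jni.so+0x2352a0]"
FRAME_RE = re.compile(r"#\s+([A-Za-z])\s+\[(.+?)\+(0x[0-9a-f]+)\]")


def summarize_crashes(journal_text):
    """Return one dict per JVM fatal-error banner found in the journal text."""
    crashes = []
    current = None
    for line in journal_text.splitlines():
        match = SIGNAL_RE.search(line)
        if match:
            current = {
                "signal": match.group(1),
                "pc": match.group(3),
                "pid": int(match.group(4)),
                "library": None,
                "offset": None,
            }
            crashes.append(current)
            continue
        match = FRAME_RE.search(line)
        if match and current is not None and current["library"] is None:
            # Assumption: the "AWSCRT_<digits>" prefix is a per-run temp-file
            # name, so it is stripped to compare the library across retries.
            current["library"] = re.sub(r"^AWSCRT_\d+", "", match.group(2))
            current["offset"] = match.group(3)
    return crashes


def same_fault_site(crashes):
    """True if every crash hit the same library at the same offset."""
    sites = {(c["signal"], c["library"], c["offset"]) for c in crashes}
    return len(crashes) > 0 and len(sites) == 1
```

Applied to the output of "sudo journalctl -u greengrass", both complete banners in this log report SIGILL in libaws-crt-jni.so at offset 0x2352a0. That suggests a deterministic illegal instruction in the native library on this CPU rather than random memory corruption.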

To Reproduce Possibly specific to ARMv8 NVIDIA-based devices: it works on "Processor rev 1" but not on "Processor rev 0", though I am still not sure.

Expected behavior Greengrass should run without exceptions and appear in the AWS management console (Greengrass) as a core device.

Actual behavior After running setup, the thing, the thing group, and the thing deployment are created and can be seen in the AWS IoT section, but the device does not appear under Greengrass core devices. No deployment is detected; it keeps displaying "no active deployment found".

Environment OS: Ubuntu 18.04 (Tegra by nvidia) (aarch64)

JDK version I tried both 11 and 21, using the Amazon version (Corretto) as well as default-jdk.
openjdk 11.0.23 2024-04-16 LTS
OpenJDK Runtime Environment Corretto-11.0.23.9.1 (build 11.0.23+9-LTS)
OpenJDK 64-Bit Server VM Corretto-11.0.23.9.1 (build 11.0.23+9-LTS, mixed mode)

CPU Info (e.g. cat /proc/cpuinfo)
processor : 0
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 54811859

processor : 1
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 54811859

processor : 2
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 54811859

processor : 3
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 54811859

processor : 4
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 54811859

processor : 5
model name : ARMv8 Processor rev 0 (v8l)
BogoMIPS : 62.50
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp
CPU implementer : 0x4e
CPU architecture: 8
CPU variant : 0x0
CPU part : 0x004
CPU revision : 0
MTS version : 54811859
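Since the crash seems to depend on the processor revision, comparing the CPU identification fields of a working board and a failing one may narrow things down. This is a rough sketch that assumes the usual /proc/cpuinfo layout (one "key : value" field per line, a blank line between cores); the helper names `parse_cpuinfo` and `distinct_cpu_ids` are made up for illustration.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict of fields per core."""
    cores = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current core's block.
            if current:
                cores.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cores.append(current)
    return cores


def distinct_cpu_ids(cores):
    """Return the set of (implementer, part, variant, revision) tuples seen."""
    return {
        (
            core.get("CPU implementer"),
            core.get("CPU part"),
            core.get("CPU variant"),
            core.get("CPU revision"),
        )
        for core in cores
    }
```

Running `distinct_cpu_ids(parse_cpuinfo(open("/proc/cpuinfo").read()))` on both devices and comparing the results, together with the "Features" field, shows whether the two boards really differ in CPU revision or supported instruction extensions.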

Looking forward to a solution to this problem.

junfuchen99 commented 5 months ago

Hello, could you try installing Nucleus version 2.12.4?

We suspect there is an issue with Nucleus v2.12.5 and we are actively working on it.

Zedstron commented 5 months ago

I am very glad to hear from you. I will definitely try downgrading to the older version and will update here soon.

Zedstron commented 5 months ago

Closing this issue: as you suggested, downgrading the Greengrass version works.

aws-kevinrickard commented 5 months ago

Hello, we have released Nucleus v2.12.6, which contains the fix for this specific issue.
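For anyone checking several devices, the version information in this thread can be written as a small check. This is only an illustrative sketch: the names `AFFECTED_VERSIONS` and `is_affected` are hypothetical, and the affected set contains only v2.12.5 because that is the only version this thread names as suspect (v2.12.4 was the suggested workaround and v2.12.6 carries the fix).

```python
# Assumption taken from this thread only: v2.12.5 is the suspect release.
AFFECTED_VERSIONS = {(2, 12, 5)}


def parse_version(version):
    """Turn a string such as 'v2.12.5' or '2.12.5' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().lstrip("vV").split("."))


def is_affected(version):
    """True if the given Nucleus version is the one this thread flags."""
    return parse_version(version) in AFFECTED_VERSIONS
```

For example, `is_affected("v2.12.5")` is true, while `is_affected("2.12.4")` and `is_affected("2.12.6")` are false.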