metanorma / metanorma-docker

Docker container for running the Metanorma toolchain
https://www.metanorma.com

Java crash/hang on x86_64 container running on Apple M1 (was: Crash at Jing when using metanorma/mn image) #126

Closed ronaldtse closed 2 years ago

ronaldtse commented 2 years ago

Is this a Jing problem or a JRE problem?

This document is mn-samples-ogc/sources/style-sample.

$ docker run -v $(pwd):/metanorma --platform linux/amd64 metanorma/mn metanorma --agree-to-terms document.adoc 
[relaton] Info: detecting backends:
[relaton-ogc] ("OGC 06-121r9") fetching...
[relaton-ogc] ("OGC 08-131r3") fetching...
[relaton-ogc] ("OGC 08-131r3") found 08-131r3
[relaton-ogc] ("OGC 06-121r9") found 06-121r9
AsciiDoc Input: (ID _bibliography): Section not marked up as [bibliography]!
Metanorma XML Style Warning: (XML Line 000284): Hanging paragraph in clause
Metanorma XML Style Warning: (XML Line 000395): Hanging paragraph in clause
Metanorma XML Style Warning: (XML Line 000460): Hanging paragraph in clause
Metanorma XML Style Warning: (XML Line 000666): Hanging paragraph in clause
Metanorma XML Style Warning: (XML Line 000760): Hanging paragraph in clause
Metanorma XML Style Warning: (XML Line 000988): Hanging paragraph in clause
Metanorma XML Style Warning: (XML Line 000121): Table should have title
Metanorma XML Style Warning: (XML Line 000356): Table should have title
Jing failed with error: #
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000004013f2eb01, pid=58, tid=61
#
# JRE version: OpenJDK Runtime Environment (11.0.13+8) (build 11.0.13+8-post-Debian-1deb11u1)
# Java VM: OpenJDK 64-Bit Server VM (11.0.13+8-post-Debian-1deb11u1, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 86 c1 java.lang.String.substring(II)Ljava/lang/String; java.base@11.0.13 (58 bytes) @ 0x0000004013f2eb01 [0x0000004013f2eac0+0x0000000000000041]
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
[thread 73 also had an error]
# An error report file with more information is saved as:
# /metanorma/hs_err_pid58.log
Compiled method (c1)    2486   86       3       java.lang.String::substring (58 bytes)
 total in heap  [0x0000004013f2e890,0x0000004013f2f368] = 2776
 relocation     [0x0000004013f2ea08,0x0000004013f2eaa8] = 160
 main code      [0x0000004013f2eac0,0x0000004013f2f100] = 1600
 stub code      [0x0000004013f2f100,0x0000004013f2f160] = 96
 metadata       [0x0000004013f2f160,0x0000004013f2f198] = 56
 scopes data    [0x0000004013f2f198,0x0000004013f2f270] = 216
 scopes pcs     [0x0000004013f2f270,0x0000004013f2f350] = 224
 dependencies   [0x0000004013f2f350,0x0000004013f2f358] = 8
 nul chk table  [0x0000004013f2f358,0x0000004013f2f368] = 16
Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled
#
# If you would like to submit a bug report, please visit:
#   https://bugs.debian.org/openjdk-11
#
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

ronaldtse commented 2 years ago

This exact issue is also reported here:

This appears to be an OpenJDK problem under Rosetta on Apple Silicon (which is what I'm running).

I re-ran with

$ docker run -v $(pwd):/metanorma --platform linux/amd64 metanorma/metanorma metanorma --agree-to-terms document.adoc 

And it worked.

I'm not sure if it's a one-off or repeatable issue.

ronaldtse commented 2 years ago

"I'm not sure if it's a one-off or repeatable issue."

It is a one-off, and it's somewhat random.

The Java process starts but then hangs, seemingly forever (I obviously did not wait that long).
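Since the hang never resolves on its own, one way to keep it from blocking a terminal or a script is to put a wall-clock limit on the run. The following is only a sketch in plain POSIX shell: `run_with_limit` is a hypothetical helper name, and the 600-second limit in the usage comment is an arbitrary assumption. Whether the container itself stops when the `docker run` client is terminated depends on how signals are forwarded, so treat this as a starting point rather than a fix.

```shell
#!/bin/sh
# Run a command with a wall-clock limit in seconds (hypothetical helper).
# Returns the command's own exit status; a status of 143 (128 + SIGTERM)
# means the watchdog terminated it because the limit was reached.
run_with_limit() {
  limit="$1"; shift
  "$@" &
  cmd_pid=$!
  # Watchdog: sleep for the limit, then ask the command to terminate.
  ( sleep "$limit"; kill -TERM "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
  watchdog_pid=$!
  wait "$cmd_pid"
  status=$?
  # The command is done (or was terminated); stop the watchdog.
  kill "$watchdog_pid" 2>/dev/null
  return "$status"
}

# Usage (assumed limit of 600 seconds):
#   run_with_limit 600 docker run -v "$(pwd)":/metanorma --platform linux/amd64 \
#     metanorma/metanorma metanorma --agree-to-terms document.adoc
```

A crash such as the SIGSEGV shown earlier still surfaces through the normal exit status; only a run that exceeds the limit ends with 143.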

Apparently this is a known issue with x86-64 Docker images crashing Java when run through Rosetta on M1:

Running the image with no args should print out some help text and then exit, but instead it just hangs, with activity monitor showing 100% CPU usage by qemu-system-aarch64. I'm fairly certain the same image used to work fine with macOS 11 & an older version of Docker 4.x on the same mac, but I don't know the specific versions that worked.

The solution is said to be this: https://github.com/raxetul/alpine-s6-nginx-php/blob/master/.github/workflows/docker-publish.yml

Build your images with multiarch support to get rid of all possible architecture failures in the future. To do this cleanly, avoid using anything related to the platform in your Dockerfile, just old-school Dockerfiles are ok.
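For reference, the multi-arch build that the quoted advice describes can also be expressed as a local `docker buildx` invocation, which is roughly what a CI build action would run. This is a minimal sketch, not the project's actual build: the image tag `example/metanorma:multiarch` and the platform list are placeholder assumptions, and the function only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with the real tag and platforms.
IMAGE="example/metanorma:multiarch"
PLATFORMS="linux/amd64,linux/arm64"

# Print the buildx command for a multi-platform build instead of running it.
buildx_command() {
  printf 'docker buildx build --platform %s --tag %s --push .\n' "$PLATFORMS" "$IMAGE"
}

buildx_command

# To run it for real (requires Docker with buildx and a builder that supports
# both platforms, for example one created with `docker buildx create --use`):
#   eval "$(buildx_command)"
```

With a `linux/arm64` variant published, an Apple M1 host would pull the native image and the `--platform linux/amd64` flag would no longer be needed.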

ronaldtse commented 2 years ago

Now I remember -- I just upgraded Docker this morning. Prior to this update, it worked.

ronaldtse commented 2 years ago

Relates to #123

CAMOBAP commented 2 years ago

Just checked on M1 but wasn't able to reproduce the original issue.

I agree that we should move to the docker/build-push-action@v2 action. Should anything else be done in the scope of this task?

ronaldtse commented 2 years ago

@CAMOBAP on my machine, I am running the latest Docker (4.4.2) and still have the same issue. The container hangs at:

Metanorma XML Syntax: (XML Line 000857:50): element "fn" not allowed here; expected the element end-tag, text or element "annotation", "callout" or "note"
Metanorma XML Syntax: (XML Line 001102:124): element "requirement" not allowed here; expected element "dl", "figure", "formula", "name", "ol", "p", "quote", "sourcecode" or "ul"
Metanorma XML Syntax: (XML Line 001106:75): element "recommendation" not allowed here; expected element "dl", "figure", "formula", "name", "ol", "p", "quote", "sourcecode" or "ul"
Metanorma XML Syntax: (XML Line 001110:72): element "requirement" not allowed here; expected element "dl", "figure", "formula", "name", "ol", "p", "quote", "sourcecode" or "ul"
Metanorma XML Syntax: (XML Line 001113:25): element "example" incomplete; expected element "dl", "figure", "formula", "name", "ol", "p", "quote", "sourcecode" or "ul"
java -Xss5m -Xmx2048m -jar /usr/local/bundle/gems/mn2pdf-1.38.1/lib/../bin/mn2pdf.jar --xml-file "/metanorma/document.presentation.xml" --xsl-file "/usr/local/bundle/gems/metanorma-ogc-2.0.3/lib/isodoc/ogc/ogc.standard.xsl" --pdf-file "/metanorma/document.pdf" --font-manifest "/tmp/fontist_locations20220211-1-9qa28x.yml"

It just hangs and never finishes.

When I "Ctrl-C" the process, this is shown:

^C/usr/local/bundle/gems/metanorma-1.4.4/lib/metanorma/worker_pool.rb:26:in `join': Interrupt
    from /usr/local/bundle/gems/metanorma-1.4.4/lib/metanorma/worker_pool.rb:26:in `map'
    from /usr/local/bundle/gems/metanorma-1.4.4/lib/metanorma/worker_pool.rb:26:in `shutdown'
    from /usr/local/bundle/gems/metanorma-1.4.4/lib/metanorma/compile.rb:158:in `process_exts'
    from /usr/local/bundle/gems/metanorma-1.4.4/lib/metanorma/compile.rb:37:in `compile'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/lib/metanorma/cli/compiler.rb:45:in `compile_file'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/lib/metanorma/cli/compiler.rb:30:in `compile'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/lib/metanorma/cli/compiler.rb:35:in `compile'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/lib/metanorma/cli/command.rb:241:in `compile_document'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/lib/metanorma/cli/command.rb:47:in `block in compile'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/lib/metanorma/cli/command.rb:47:in `each'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/lib/metanorma/cli/command.rb:47:in `compile'
    from /usr/local/bundle/gems/thor-1.0.1/lib/thor/command.rb:27:in `run'
    from /usr/local/bundle/gems/thor-hollaback-0.2.1/lib/thor/hollaback.rb:68:in `run'
    from /usr/local/bundle/gems/thor-1.0.1/lib/thor/invocation.rb:127:in `invoke_command'
    from /usr/local/bundle/gems/thor-1.0.1/lib/thor.rb:392:in `dispatch'
    from /usr/local/bundle/gems/thor-1.0.1/lib/thor/base.rb:485:in `start'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/lib/metanorma/cli.rb:34:in `start'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/exe/metanorma:25:in `block in <top (required)>'
    from /usr/local/bundle/bundler/gems/metanorma-cli-8295884e88cb/exe/metanorma:43:in `<top (required)>'
    from /usr/local/bundle/bin/metanorma:23:in `load'
    from /usr/local/bundle/bin/metanorma:23:in `<main>'

Did you use the same command I showed?

ronaldtse commented 2 years ago

@CAMOBAP in any case using the buildx action will resolve this problem. Can we proceed? Thanks.

ronaldtse commented 2 years ago

When using this command I got the same failure:

$ docker run -v $(pwd):/metanorma --platform linux/amd64 metanorma/metanorma metanorma --agree-to-terms document.adoc
...
java -Xss5m -Xmx2048m -jar /usr/local/bundle/gems/mn2pdf-1.38.1/lib/../bin/mn2pdf.jar --xml-file "/metanorma/document.presentation.xml" --xsl-file "/usr/local/bundle/gems/metanorma-ogc-1.5.5/lib/isodoc/ogc/ogc.standard.xsl" --pdf-file "/metanorma/document.pdf" --font-manifest "/tmp/fontist_locations20220211-1-194yetp.yml"
[mn2pdf] Fatal: mn2pdf 

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00000040144d34e0, pid=118, tid=124
#
# JRE version: OpenJDK Runtime Environment (11.0.13+8) (build 11.0.13+8-post-Debian-1deb11u1)
# Java VM: OpenJDK 64-Bit Server VM (11.0.13+8-post-Debian-1deb11u1, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 927 c1 java.lang.String.compareTo(Ljava/lang/String;)I java.base@11.0.13 (63 bytes) @ 0x00000040144d34e0 [0x00000040144d34a0+0x0000000000000040]
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /metanorma/hs_err_pid118.log
Compiled method (c1)    6547 1411       3       org.apache.xalan.processor.XSLTElementProcessor::setPropertiesFromAttributes (506 bytes)
 total in heap  [0x0000004014614a90,0x000000401461db50] = 37056
 relocation     [0x0000004014614c08,0x00000040146152c0] = 1720
 main code      [0x00000040146152c0,0x000000401461b520] = 25184
 stub code      [0x000000401461b520,0x000000401461b7a0] = 640
 oops           [0x000000401461b7a0,0x000000401461b7c8] = 40
 metadata       [0x000000401461b7c8,0x000000401461b978] = 432
 scopes data    [0x000000401461b978,0x000000401461ca70] = 4344
 scopes pcs     [0x000000401461ca70,0x000000401461d7c0] = 3408
 dependencies   [0x000000401461d7c0,0x000000401461d808] = 72
 handler table  [0x000000401461d808,0x000000401461d988] = 384
 nul chk table  [0x000000401461d988,0x000000401461db50] = 456
Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled
#
# If you would like to submit a bug report, please visit:
#   https://bugs.debian.org/openjdk-11
# Preparing...
Input: XML (/metanorma/document.presentation.xml)
Input: XSL (/usr/local/bundle/gems/metanorma-ogc-1.5.5/lib/isodoc/ogc/ogc.standard.xsl)
Output: PDF (/metanorma/document.pdf)

qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

CAMOBAP commented 2 years ago

@ronaldtse to be on the same page, which document have you tried?

ronaldtse commented 2 years ago

@CAMOBAP the document is mn-samples-ogc/sources/style-sample.

ronaldtse commented 2 years ago

Again, this only happens on Apple M1 when running a linux/amd64 container (because we don't have a linux/arm64 container).
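To check whether a given host and image combination falls into this case, the host architecture can be compared with the image architecture before running. The sketch below assumes the two values are obtained from `uname -m` and `docker image inspect`; the helper names `normalize_arch` and `needs_emulation` are hypothetical.

```shell
#!/bin/sh
# Map common architecture spellings to the names Docker uses.
normalize_arch() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "$1" ;;
  esac
}

# Succeeds (exit 0) when the image architecture differs from the host's,
# meaning the container would run under emulation.
needs_emulation() {
  [ "$(normalize_arch "$1")" != "$(normalize_arch "$2")" ]
}

# Usage (requires Docker and a locally pulled image):
#   host_arch="$(uname -m)"
#   image_arch="$(docker image inspect --format '{{.Architecture}}' metanorma/metanorma)"
#   if needs_emulation "$host_arch" "$image_arch"; then
#     echo "warning: $image_arch image on $host_arch host will be emulated" >&2
#   fi
```

On an M1 Mac `uname -m` reports `arm64`, so an `amd64` image triggers the warning; on an x86_64 Linux host it does not.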

ronaldtse commented 2 years ago

This should be fixed once #123 is completed.

ronaldtse commented 2 years ago

This issue cannot be fixed -- this is a Docker issue with running x86_64 containers on Apple M1. Closing.