Closed by bra1nDump 1 year ago
A fix that should reduce the load on kroki was deployed, but errors are still happening. Possibly kroki is getting too much traffic (likely from another service, not this plugin).
Following this tutorial to deploy the docker-compose stack for kroki: https://dev.to/raphaelmansuy/10-minutes-to-deploy-a-docker-compose-stack-on-aws-illustrated-with-hasura-and-postgres-3f6e
Example diagram that's failing on kroki (and on my EC2 instance), but succeeds locally when run with docker compose
Looks like it's a headless-browser-related issue. Tail of the docker logs for the mermaid kroki container:
[0527/042559.410448:ERROR:gl_surface_egl.cc(852)] EGL Driver message (Critical) eglInitialize: Internal Vulkan error (-7): A requested extension is not supported, in ../../third_party/angle/src/libANGLE/renderer/vulkan/RendererVk.cpp, initialize:1446.
[0527/042559.410729:ERROR:gl_surface_egl.cc(1489)] eglInitialize SwANGLE failed with error EGL_NOT_INITIALIZED
[0527/042559.414129:ERROR:gl_ozone_egl.cc(21)] GLSurfaceEGL::InitializeOneOff failed.
[0527/042559.423874:ERROR:viz_main_impl.cc(186)] Exiting GPU process due to errors during initialization
{"level":"info","time":1685161559550,"pid":1,"hostname":"f49da58099c0","msg":"Chrome accepting connections on endpoint ws://127.0.0.1:42041/devtools/browser/25e0b3f5-eae2-4410-addb-819fb0d7e27d"}
{"level":"error","time":1685161559584,"pid":1,"hostname":"f49da58099c0","stderr":"[0527/042559.541837:ERROR:angle_platform_impl.cc(43)] RendererVk.cpp:127 (VerifyExtensionsPresent): Extension not supported: VK_KHR_surface\n[0527/042559.547160:ERROR:angle_platform_impl.cc(43)] RendererVk.cpp:127 (VerifyExtensionsPresent): Extension not supported: VK_KHR_xcb_surface\n[0527/042559.547261:ERROR:angle_platform_impl.cc(43)] Display.cpp:977 (initialize): ANGLE Display::initialize error 0: Internal Vulkan error (-7): A requested extension is not supported, in ../../third_party/angle/src/libANGLE/renderer/vulkan/RendererVk.cpp, initialize:1446.\n[0527/042559.547342:ERROR:gl_surface_egl.cc(852)] EGL Driver message (Critical) eglInitialize: Internal Vulkan error (-7): A requested extension is not supported, in ../../third_party/angle/src/libANGLE/renderer/vulkan/RendererVk.cpp, initialize:1446.\n[0527/042559.548340:ERROR:gl_surface_egl.cc(1489)] eglInitialize SwANGLE failed with error EGL_NOT_INITIALIZED\n[0527/042559.548416:ERROR:gl_ozone_egl.cc(21)] GLSurfaceEGL::InitializeOneOff failed.\n[0527/042559.564969:ERROR:viz_main_impl.cc(186)] Exiting GPU process due to errors during initialization\n","msg":"chrome process"}
{"level":"error","time":1685161559608,"pid":1,"hostname":"f49da58099c0","stderr":"[0527/042559.601763:ERROR:gpu_init.cc(481)] Passthrough is not supported, GL is disabled, ANGLE is \n","msg":"chrome process"}
{"level":"error","time":1685161559658,"pid":1,"hostname":"f49da58099c0","stderr":"[0527/042559.657546:WARNING:dns_config_service_linux.cc(428)] Failed to read DnsConfig.\n","msg":"chrome process"}
New approach: screw docker compose, just deploy the containers separately to fly.io. Now getting an issue where the kroki server (Java) can't seem to resolve the kroki-mermaid domain (both are hosted on fly.io).
{"timestamp":"1685165590217","level":"ERROR","thread":"vert.x-eventloop-thread-1","mdc":{"error_message":"OK","path":"/mermaid/svg/","method":"POST","action":"error","error_code":"500","failure_class_name":"java.net.UnknownHostException","user_agent":"PostmanRuntime/7.32.2"},"logger":"io.kroki.server.error.ErrorHandler","message":"An error occurred","context":"default","exception":"java.net.UnknownHostException: Failed to resolve 'https://kroki-mermaid.fly.dev' [A(1), AAAA(28)] after 3 queries \n\tat io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1088)\n\tat io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1035)\n\tat io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:422)\n\tat io.netty.resolver.dns.DnsResolveContext.onResponse(DnsResolveContext.java:655)\n\tat
When SSHing into the kroki instance I can successfully curl https://kroki-mermaid.fly.dev. Must be some Java thing?
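One thing worth checking, as a guess and not something confirmed in this thread: the string the resolver fails on in the exception above is 'https://kroki-mermaid.fly.dev', scheme included. curl accepts a full URL, but a DNS lookup needs a bare hostname, so this would fail even when the host itself is reachable. If the mermaid host setting on the kroki server was given the full URL, splitting it into host and port first might help. A minimal Python sketch of that split (the function name is made up for illustration; how the values are then passed to kroki should be checked against the kroki configuration docs):

```python
from urllib.parse import urlsplit

# Default ports used when the value carries a scheme but no explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_host_and_port(value):
    """Return (hostname, port) for a URL or a bare 'host[:port]' string.

    The port is None when it cannot be inferred from the input.
    """
    # urlsplit only recognizes the host part if a '//' prefix is present.
    parts = urlsplit(value if "://" in value else "//" + value)
    host = parts.hostname
    if host is None:
        raise ValueError(f"no hostname found in {value!r}")
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return host, port
```

For example, `split_host_and_port("https://kroki-mermaid.fly.dev")` gives `("kroki-mermaid.fly.dev", 443)`, which is the shape a hostname lookup expects.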
On the bright side, I can just route traffic directly to the mermaid instance.
From the email reply from kroki (Guillaume):
Issue now fixed on kroki side.
Mermaid/Puppeteer is known to be "unstable" (https://github.com/yuzutech/kroki/issues/1319). My guess is that, on some occasions, the headless Chrome crashes or gets killed. Unfortunately, I haven't been able to pinpoint exactly why it occurs...
For reference, the first errors on stderr about GPU are expected and can be safely ignored. That's one of the reasons why it's hard to troubleshoot because Chrome/Puppeteer logs are extremely verbose and confusing.
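Since the GPU errors are expected, hiding them makes the remaining log lines easier to read. Below is a rough Python sketch of such a filter. The marker list is an assumption taken from the Chrome source file names that appear in the log tail earlier in this thread; it is not an official list and may need adjusting.

```python
import json

# Assumption: these Chrome source files only show up in the expected
# GPU-initialization noise seen in the log tail above.
GPU_NOISE_MARKERS = (
    "gl_surface_egl.cc",
    "gl_ozone_egl.cc",
    "viz_main_impl.cc",
    "angle_platform_impl.cc",
    "gpu_init.cc",
)


def is_gpu_noise(text):
    """True if the text mentions one of the GPU-related source files."""
    return any(marker in text for marker in GPU_NOISE_MARKERS)


def filter_log_line(line):
    """Return the line without GPU noise, or None if nothing is left."""
    line = line.rstrip("\n")
    try:
        record = json.loads(line)
    except ValueError:
        # Plain Chrome stderr line, not a JSON record.
        return None if is_gpu_noise(line) else line
    if not isinstance(record, dict) or not isinstance(record.get("stderr"), str):
        return line
    # A JSON record can bundle several stderr lines; keep the non-GPU ones.
    kept = [s for s in record["stderr"].splitlines() if s and not is_gpu_noise(s)]
    if not kept:
        return None
    record["stderr"] = "\n".join(kept) + "\n"
    return json.dumps(record)
```

One way to use it is to pipe the container logs through a loop that calls `filter_log_line` on each line and prints whatever is not `None`. With the log tail above, that would leave the "Chrome accepting connections" line and the DnsConfig warning.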
A cleaner solution would be to deploy kroki on a kubernetes cluster: https://docs.kroki.io/kroki/setup/use-kubernetes/. This way we can route all traffic to our custom kroki server.
Most likely it's related to the kroki service being overloaded. We are working on a fix.
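While the service is overloaded, a client-side retry with exponential backoff might paper over the intermittent failures. This is a generic Python sketch, not what the plugin actually does: `render` stands for whatever hypothetical function sends the diagram to kroki and raises on failure, and the attempt count and delays are arbitrary.

```python
import random
import time


def retry_with_backoff(render, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call render() until it succeeds or the attempts run out.

    Waits base_delay * 2**n plus a little random jitter between tries,
    and re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return render()
        except Exception as error:  # broad on purpose for a sketch
            last_error = error
            if attempt == attempts - 1:
                break
            # Jitter keeps many clients from retrying in lockstep.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise last_error
```

The `sleep` parameter only exists so the waiting can be swapped out in tests. Retrying does add load of its own, so the attempt count should stay small.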