GoogleCloudPlatform / microservices-demo

Sample cloud-first application with 10 microservices showcasing Kubernetes, Istio, and gRPC.
https://cymbal-shops.retail.cymbal.dev
Apache License 2.0
16.94k stars 7.3k forks

Adservice crashing with segfault #2511

Closed by BXJC 6 months ago

BXJC commented 6 months ago

Describe the bug

Unable to build/deploy on Windows 11: skaffold run fails because the adservice container crashes with a segfault (exit code 139).

To Reproduce

git clone https://github.com/GoogleCloudPlatform/microservices-demo
cd microservices-demo/
minikube start --cpus=4 --memory 4096 --disk-size 32g
skaffold run

Logs

Waiting for deployments to stabilize...
 - deployment/checkoutservice is ready. [10/11 deployment(s) still pending]
 - deployment/adservice: container server terminated with exit code 139
    - pod/adservice-7cccc9b6fc-m7dvh: container server terminated with exit code 139
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] # A fatal error has been detected by the Java Runtime Environment:
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] #  SIGSEGV (0xb) at pc=0x00007f0e51906522, pid=1, tid=16
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] # JRE version:  (21.0.2+13) (build )
      > [adservice-7cccc9b6fc-m7dvh server] # Java VM: OpenJDK 64-Bit Server VM (21.0.2+13-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, serial gc, linux-amd64)
      > [adservice-7cccc9b6fc-m7dvh server] # Problematic frame:
      > [adservice-7cccc9b6fc-m7dvh server] # C  [profiler_java_agent.so+0x897522]  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(char const*)+0xc
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] # Core dump will be written. Default location: /mnt/wslg/dumps/core.%e.1
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] # Can not save log file, dump to screen..
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] # A fatal error has been detected by the Java Runtime Environment:
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] #  SIGSEGV (0xb) at pc=0x00007f0e51906522, pid=1, tid=16
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] # JRE version:  (21.0.2+13) (build )
      > [adservice-7cccc9b6fc-m7dvh server] # Java VM: OpenJDK 64-Bit Server VM (21.0.2+13-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, serial gc, linux-amd64)
      > [adservice-7cccc9b6fc-m7dvh server] # Problematic frame:
      > [adservice-7cccc9b6fc-m7dvh server] # C  [profiler_java_agent.so+0x897522]  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(char const*)+0xc
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] # Core dump will be written. Default location: /mnt/wslg/dumps/core.%e.1
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] #
...
      > [adservice-7cccc9b6fc-m7dvh server] /sys/kernel/mm/transparent_hugepage/defrag (defrag/compaction efforts parameter): always defer defer+madvise [madvise] never
      > [adservice-7cccc9b6fc-m7dvh server] Process Memory:
      > [adservice-7cccc9b6fc-m7dvh server] Virtual Size: 146444K (peak: 146444K)
      > [adservice-7cccc9b6fc-m7dvh server] Resident Set Size: 22436K (peak: 22436K) (anon: 13352K, file: 9084K, shmem: 0K)
      > [adservice-7cccc9b6fc-m7dvh server] Swapped out: 0K
      > [adservice-7cccc9b6fc-m7dvh server] /proc/sys/kernel/threads-max (system-wide limit on the number of threads): 126749
      > [adservice-7cccc9b6fc-m7dvh server] /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have): 262144
      > [adservice-7cccc9b6fc-m7dvh server] /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers): 4194304
      > [adservice-7cccc9b6fc-m7dvh server] container (cgroup) information:
      > [adservice-7cccc9b6fc-m7dvh server] container_type: cgroupv1
      > [adservice-7cccc9b6fc-m7dvh server] cpu_cpuset_cpus: 0-7
      > [adservice-7cccc9b6fc-m7dvh server] cpu_memory_nodes: 0
      > [adservice-7cccc9b6fc-m7dvh server] active_processor_count: 1
      > [adservice-7cccc9b6fc-m7dvh server] cpu_quota: 30000
      > [adservice-7cccc9b6fc-m7dvh server] cpu_period: 100000
      > [adservice-7cccc9b6fc-m7dvh server] cpu_shares: 204
      > [adservice-7cccc9b6fc-m7dvh server] memory_limit_in_bytes: 307200 k
      > [adservice-7cccc9b6fc-m7dvh server] memory_and_swap_limit_in_bytes: 307200 k
      > [adservice-7cccc9b6fc-m7dvh server] memory_soft_limit_in_bytes: unlimited
      > [adservice-7cccc9b6fc-m7dvh server] memory_usage_in_bytes: 15000 k
      > [adservice-7cccc9b6fc-m7dvh server] memory_max_usage_in_bytes: 15000 k
      > [adservice-7cccc9b6fc-m7dvh server] kernel_memory_usage_in_bytes: 1352 k
      > [adservice-7cccc9b6fc-m7dvh server] kernel_memory_max_usage_in_bytes: unlimited
      > [adservice-7cccc9b6fc-m7dvh server] kernel_memory_limit_in_bytes: 1436 k
      > [adservice-7cccc9b6fc-m7dvh server] maximum number of tasks: unlimited
      > [adservice-7cccc9b6fc-m7dvh server] current number of tasks: 2
      > [adservice-7cccc9b6fc-m7dvh server] Steal ticks since vm start: 0
      > [adservice-7cccc9b6fc-m7dvh server] Steal ticks percentage since vm start:  0.000
      > [adservice-7cccc9b6fc-m7dvh server] CPU: total 8 (initial active 1)
      > [adservice-7cccc9b6fc-m7dvh server] CPU Model and flags from /proc/cpuinfo:
      > [adservice-7cccc9b6fc-m7dvh server] model name  : 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
      > [adservice-7cccc9b6fc-m7dvh server] flags               : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid movdiri movdir64b fsrm avx512_vp2intersect md_clear flush_l1d arch_capabilities
      > [adservice-7cccc9b6fc-m7dvh server] Online cpus: 0-7
      > [adservice-7cccc9b6fc-m7dvh server] Offline cpus:
      > [adservice-7cccc9b6fc-m7dvh server] BIOS frequency limitation: <Not Available>
      > [adservice-7cccc9b6fc-m7dvh server] Frequency switch latency (ns): <Not Available>
      > [adservice-7cccc9b6fc-m7dvh server] Available cpu frequencies: <Not Available>
      > [adservice-7cccc9b6fc-m7dvh server] Current governor: <Not Available>
      > [adservice-7cccc9b6fc-m7dvh server] Core performance/turbo boost: <Not Available>
      > [adservice-7cccc9b6fc-m7dvh server] Memory: 4k page, physical 307200k(292200k free), swap 4194304k(4194304k free)
      > [adservice-7cccc9b6fc-m7dvh server] Page Sizes: 4k
      > [adservice-7cccc9b6fc-m7dvh server] vm_info: OpenJDK 64-Bit Server VM (21.0.2+13-LTS) for linux-amd64-musl JRE (21.0.2+13-LTS), built on 2024-01-16T00:00:00Z by "admin" with gcc 10.3.1 20211027
      > [adservice-7cccc9b6fc-m7dvh server] END.
      > [adservice-7cccc9b6fc-m7dvh server] #
      > [adservice-7cccc9b6fc-m7dvh server] #
 - deployment/adservice failed. Error: container server terminated with exit code 139.
I0426 14:29:44.134566   28800 request.go:697] Waited for 1.0548337s due to client-side throttling, not priority and fairness, request: GET:https://127.0.0.1:63092/apis/apps/v1/namespaces/default/replicasets?labelSelector=app%3Dshippingservice
1/11 deployment(s) failed

Screenshots

capture.png

Environment

OS: Windows 11 Enterprise 23H2
Kubernetes distribution, version: minikube v1.33.0
Relevant tool versions: Docker Desktop 4.29.0, skaffold v2.11.0

bourgeoisor commented 6 months ago

@BXJC if I understand correctly, all containers are running fine, except for the adservice?

NimJay commented 6 months ago

@BXJC, thanks for reporting this. :) Could you please try these old adservice images:

You would need to:

BXJC commented 6 months ago

Quoting @NimJay:

@BXJC, thanks for reporting this. :) Could you please try these old adservice images:

  • gcr.io/google-samples/microservices-demo/adservice:v0.10.0
  • gcr.io/google-samples/microservices-demo/adservice:v0.9.0
  • gcr.io/google-samples/microservices-demo/adservice:v0.8.1
  • gcr.io/google-samples/microservices-demo/adservice:v0.8.0

You would need to:

  - image: adservice
    context: src/adservice

from https://github.com/GoogleCloudPlatform/microservices-demo/blob/main/skaffold.yaml#L47

Thank you for your reply! I tried the old adservice images, and adservice:v0.9.0 resolved the issue.
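For readers applying the same workaround, here is a minimal sketch of what pinning the older image might look like in the adservice Deployment. It is a fragment, not a complete manifest: the Deployment name `adservice` and container name `server` are taken from the logs above, the image path is the one listed in this thread, and the rest of the file layout is an assumption about how the manifest is organized.

```yaml
# Fragment of the adservice Deployment (assumed layout; only the relevant fields are shown).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adservice        # deployment name as it appears in the skaffold output
spec:
  template:
    spec:
      containers:
        - name: server   # container name as it appears in the crash log
          # Prebuilt image reported to work in this thread, used in place of a local build.
          image: gcr.io/google-samples/microservices-demo/adservice:v0.9.0
```

Because this points at a prebuilt image, the adservice build entry in skaffold.yaml (the `image: adservice` / `context: src/adservice` pair quoted above) would presumably no longer be needed for this service.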

bourgeoisor commented 6 months ago

@NimJay I genuinely think there's an adservice issue affecting some (but not all) Kubernetes environments. It was introduced sometime between the 0.9.0 and 0.10.0 releases, and we are trying to pinpoint it. Let's keep the current release GKE cluster alive, since we can reproduce the issue there.

gothboyclick commented 6 months ago

Hi all, I have the same problem on my setup, and it matches what the reporter described. For me, changing the release version fixed it. I hope this helps others, thanks.

bourgeoisor commented 6 months ago

I re-pushed the tags, so this crash should no longer happen with v0.10.0. You may need to force-pull the image if it had already been pulled on your cluster (imagePullPolicy: Always).
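As a rough illustration of the force-pull setting mentioned here, the fragment below shows where `imagePullPolicy: Always` would sit in the adservice Deployment. The Deployment name `adservice` and container name `server` come from the logs earlier in this issue; everything else about the manifest layout is an assumption, and only the relevant fields are shown.

```yaml
# Fragment of the adservice Deployment (assumed layout; not a complete manifest).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adservice
spec:
  template:
    spec:
      containers:
        - name: server
          image: gcr.io/google-samples/microservices-demo/adservice:v0.10.0
          # Ask the kubelet to check the registry each time the container starts,
          # so a re-pushed v0.10.0 tag replaces the copy already cached on the node.
          imagePullPolicy: Always
```

The pull policy is applied when a pod is created, so pods that are already running keep the old image until they are recreated, for example with `kubectl rollout restart deployment/adservice`.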