DataDog / dd-trace-js

JavaScript APM Tracer
https://docs.datadoghq.com/tracing/

dd-native-metrics-js will not function on Centos 7 systems due to using an image with newer (and incompatible) version of libstdc++ library to build #2111

Closed theJC closed 2 years ago

theJC commented 2 years ago

dd-native-metrics-js does not function on Centos 7 systems because of the libstdc++ symbol version (apparently GLIBCXX 3.4.21 or 3.4.22) that the prebuilt binaries pick up from the image currently used for building.

dd-native-metrics-js's build uses the oldest Node 12.0.0 image and compiles against the glibc/libstdc++ that happens to be on that image (Debian GNU/Linux 9 (stretch)). See: https://github.com/DataDog/dd-native-metrics-js/blame/main/.github/workflows/build.yml#L19-L21

I believe the proper way to resolve this is to base the build image on a Centos7 image and then install the oldest version of Node your product needs to support. The library versions used to build the native binaries would then support a larger set of target environments.

Expected behaviour: dd-native-metrics-js should function on Centos 7 systems, which have maintenance support updates until 2024-06-30.

Actual behaviour: Error: /lib64/libstdc++.so.6: version 'GLIBCXX_3.4.21' not found (required by /mnt/kubernetes/sandbox/example-tracing-app/node_modules/@datadog/native-metrics/prebuilds/linux-x64/node-83.node)

Steps to reproduce: On Centos7, run any sample tracing app with native metrics enabled and see the logged error that occurs when metrics.js does a require('@datadog/native-metrics').
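
For anyone trying to confirm they are hitting the same failure, here is a minimal sketch that loads the addon defensively and pulls the missing symbol version out of the loader error. The helper names `parseMissingVersion` and `tryLoadNativeMetrics` are hypothetical and not part of dd-trace; the regular expression assumes the message follows the format shown in the error above.

```javascript
// Hypothetical helper: extract the missing library symbol version from a
// dynamic loader error message such as the one reported in this issue.
// Returns null when the message does not match that format.
function parseMissingVersion(message) {
  const match = /version [`'](GLIBCXX|GLIBC|CXXABI)_([0-9.]+)' not found/.exec(message)
  return match ? { library: match[1], version: match[2] } : null
}

// Hypothetical wrapper: try to load the native addon and log a readable
// diagnosis instead of crashing when the host libraries are too old.
function tryLoadNativeMetrics() {
  try {
    return require('@datadog/native-metrics')
  } catch (err) {
    const missing = parseMissingVersion(String(err && err.message))
    if (missing) {
      console.warn(`native metrics unavailable: host lacks ${missing.library} ${missing.version}`)
    }
    return null
  }
}
```

Calling `tryLoadNativeMetrics()` on an affected host should return `null` and log which symbol version is missing, assuming the error text matches the format above.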

Environment

rochdev commented 2 years ago

I had done something similar for Node 8 with this image. We could do the same thing for Node 12.

theJC commented 2 years ago

That sounds like it would solve the problem.

Additional context: with Node 12 falling out of LTS, our company supports only Node 14 and Node 16 going forward. The gc-stats npm library we had been using has been abandoned and doesn't work on Node 16, so we want to transition to Datadog native metrics to get similar metrics, along with the benefits of a single vendor, metrics used for drilldowns in the APM UI experience, etc.

We are actively planning a migration to something newer than Centos7 for our build and deploy base image, but it will be quite a while until we complete that migration. If the build could transition to an image like the one you mentioned, we would get significant lift and mileage out of this improvement.

theJC commented 2 years ago

@rochdev is there something I can do to help move this forward, or is this something you want to do, since it looks like you may have fought this battle before? ;)

rochdev commented 2 years ago

@theJC I'll try to take some time to see if I can do something similar for Node 12/14 at the end of the month with the oldest OS I can find.

rochdev commented 2 years ago

@theJC Did some testing yesterday and I think the base image from holy-node-box should work as it comes with GLIBCXX 3.4.17. It does however also come with GLIBC 2.19 when Centos 7 uses 2.17, but since Node is C++ and not C I'm not sure whether this would be a problem.

> is there something I can do to help move this forward

I could use help to convert this Dockerfile to Centos 7. The original holy-build-box project actually uses Centos 7 since it has the oldest versions of everything, but when I created the Node version I went with Ubuntu as that's what I'm most familiar with. I think switching to Centos makes the most sense as you pointed out.

rochdev commented 2 years ago

Looks like there is a Development Tools package that comes with everything, but it installs GCC 4.8.5 when I was using 4.7 for Ubuntu, not sure if that's a problem either.

rochdev commented 2 years ago

I wasn't able to make the build work at all on Centos, but I tried an old version of the tracer built on holy-node-box with Node 8 and it runs and sends runtime metrics correctly. This means that the GLIBC difference doesn't matter and only GLIBCXX does, so using Ubuntu for the image is fine. I'll update the action, native-metrics and dd-trace to use that instead.

theJC commented 2 years ago

It may not be pertinent given your issues getting the build to work on Centos, but FYI, I just double-checked and our Centos7 build and runtime images have the same version you asked about above: gcc version 4.8.5 20150623

theJC commented 2 years ago

Going to do some eyeball tests/checks on your new image just to understand what it provides.

Thanks so much for digging into this, this is much appreciated.

theJC commented 2 years ago

FYI, when I run yum list glibc on the base image we use, it shows: glibc.x86_64 2.17-326.el7_9

theJC commented 2 years ago

On the image built by https://github.com/rochdev/holy-node-box/blob/380ce359a83fc6f22bfea13683bddc139ea32d2b/12/amd64.dockerfile

If I run apt list libc6-dev I get this:

Listing... Done
libc6-dev/trusty-updates,trusty-security,now 2.19-0ubuntu6.15 amd64 [installed,automatic]

If I run dpkg -l libc6-dev I get this:

Name                            Version              Architecture         Description
+++-===============================-====================-====================-====================================================================
ii  libc6-dev:amd64                 2.19-0ubuntu6.15     amd64                Embedded GNU C Library: Development Libraries and Header Files

rochdev commented 2 years ago

It still runs since it seems only GLIBCXX matters, but the problem I'm having now is that Node 15+ won't build on GCC 4.x, so I can't get the newer versions to build. We'll have to split these versions as a separate CI job.

theJC commented 2 years ago

Sorry if the above was noise, but I'm trying to understand what maps to GLIBCXX_3.4.19 in the build image (that is what I get from running strings /lib64/libstdc++.so.6 | grep GLIBCXX on our Centos image).
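
To make that comparison concrete, here is a minimal sketch that takes the text produced by the `strings ... | grep GLIBCXX` command and checks whether the host's highest GLIBCXX version covers what a binary requires. The function names are hypothetical. The version numbers in the usage note come from this thread: 3.4.19 on our Centos image, 3.4.17 on holy-node-box, and 3.4.21 in the original error.

```javascript
// Compare dotted versions numerically, so '3.4.9' sorts before '3.4.19'.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number)
  const pb = b.split('.').map(Number)
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0)
    if (diff !== 0) return diff
  }
  return 0
}

// Return the highest GLIBCXX_x.y.z tag found in the text, or null if none.
// Non-numeric tags such as GLIBCXX_DEBUG_MESSAGE_LENGTH are ignored.
function highestGlibcxx(text) {
  const versions = [...text.matchAll(/GLIBCXX_(\d+(?:\.\d+)*)/g)].map(m => m[1])
  if (versions.length === 0) return null
  return versions.sort(compareVersions).pop()
}

// True when the host's libstdc++ provides at least the required version.
function hostSupports(required, hostText) {
  const highest = highestGlibcxx(hostText)
  return highest !== null && compareVersions(highest, required) >= 0
}
```

With host text whose highest tag is GLIBCXX_3.4.19, `hostSupports('3.4.17', hostText)` is true and `hostSupports('3.4.21', hostText)` is false, which matches the behaviour described in this issue.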

theJC commented 2 years ago

> It still runs since it seems only GLIBCXX matters, but the problem I'm having now is that Node 15+ won't build on GCC 4.x, so I can't get the newer versions to build. We'll have to split these versions as a separate CI job.

Split versions.... making sure I understand, you are saying that for Node 16 and above, we will not be able to have the native metrics library binary built with the lower version libraries? So this effort would only be able to solve the problem for Node 14 but not Node 16 running on older Centos platforms?

rochdev commented 2 years ago

Exactly, because the older distros don't have GCC 6 available which is needed for Node 15+. It's probably possible to get it to install but I'm no apt-get guru so I'll just split it for now.
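
A rough sketch of that split, assuming the only rule is the one stated above (Node 15+ needs GCC 6 or newer, older versions can build on GCC 4.x). The job names `gcc4-legacy` and `gcc6-or-newer` are made up for illustration; the actual CI configuration is not shown in this thread.

```javascript
// Hypothetical job names; only the Node 15 threshold comes from the thread.
function buildJobFor(nodeMajor) {
  if (!Number.isInteger(nodeMajor) || nodeMajor < 0) {
    throw new TypeError(`invalid Node major version: ${nodeMajor}`)
  }
  return nodeMajor >= 15 ? 'gcc6-or-newer' : 'gcc4-legacy'
}

// Group a list of Node major versions by the CI job that would build them.
function splitTargets(nodeMajors) {
  const jobs = {}
  for (const major of nodeMajors) {
    const job = buildJobFor(major)
    if (!jobs[job]) jobs[job] = []
    jobs[job].push(major)
  }
  return jobs
}
```

For example, `splitTargets([12, 14, 16])` puts 12 and 14 in the legacy job and 16 in the newer-GCC job.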

rochdev commented 2 years ago

@theJC Do you know if Node 16 runs on Centos 7?

theJC commented 2 years ago

I'm fairly sure we have it running on our Centos image, let me go find the Dockerfile to see what if any magic was required.

rochdev commented 2 years ago

Basically I want to figure out if it runs at all, because there is no point in making GCC work if Node itself doesn't even work.

theJC commented 2 years ago

I think I got it working with some slight adjustments to your Dockerfile.

docker run --rm -it 6cf2b9de61856088fdce77ac677f103b0b93b45c698e06aaf4f63adc715b9ce9
root@b9dbb56e1cd8:/# node -v
v16.0.0
root@b9dbb56e1cd8:/# uname -a
Linux b9dbb56e1cd8 5.10.104-linuxkit #1 SMP Thu Mar 17 17:08:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
root@b9dbb56e1cd8:/# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.6 LTS"

theJC commented 2 years ago

Could probably clean it up a bit, but this is what I did to get the above to work:

https://github.com/rochdev/holy-node-box/compare/master...theJC:getCentosWithNode16Working

rochdev commented 2 years ago

Ok, I think I'm close to a complete solution for all versions on all architectures. It's not gonna be pretty, because I wasn't able to make Centos compile 32-bit for some reason, so I went with Ubuntu for 32-bit and Centos for 64-bit, but it should result in binaries that work everywhere. If you know how to build 32-bit binaries on Centos, please let me know.

theJC commented 2 years ago

I've (fortunately?) not had to deal with 32-bit in forever... sorry, I don't have any pertinent experience on that front. For what it's worth, my company doesn't use 32-bit, so what drove me to file this issue won't be affected by not having 32-bit binaries.

rochdev commented 2 years ago

https://github.com/DataDog/action-prebuildify/pull/9

theJC commented 2 years ago

Awesome, the Linux Arm64 side benefit will help us as well, as our development fleet of MacBooks slowly transitions over to M1/arm64 and we start getting additional Docker images built as native arm64 Linux images.

rochdev commented 2 years ago

Fixed in 2.10.0.

theJC commented 2 years ago

All signs look good with my initial testing on our systems. Thank you very much @rochdev for getting this adjusted!