turnkeylinux / tracker

TurnKey Linux Tracker
https://www.turnkeylinux.org

MongoDB v5.0 requires CPU AVX instructions #1724

Open JedMeister opened 2 years ago

JedMeister commented 2 years ago

I was just testing our MongoDB v17.0 release and Mongo wouldn't start. In fact, I can't even read its help or check its version:

root@mongodb ~# mongod --help
Illegal instruction
root@mongodb ~# mongod --version
Illegal instruction

After a bit of searching, I discovered a bug that sums up the situation. A comment from one of the devs notes that v5.0+ has specific CPU requirements.

Apparently, it could be recompiled to allow v5.0 to run on older x86_64 CPUs, although I'm not keen on doing that.

Another option is to stick with v4.4 for now?

I guess another option would be to provide 2 appliances, but there are 2 problems: firstly, I don't have a compatible CPU, and secondly, I don't think that MongoDB is popular enough for that. My inclination is to stick with v4 for now. We could perhaps have a script to upgrade?

FWIW on Linux to check if your CPU can run v5:

grep flags -m1 /proc/cpuinfo | grep avx

(FWIW, that will only check the first CPU core, but AFAIK, that's enough).
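A slightly stricter variant of the check above, as a minimal sketch: it matches the flag as a whole word (so avx2 or avx512f on their own would not count as avx) and looks at every core rather than only the first. The function name has_cpu_flag and its optional file argument are illustrative, not part of any existing tool, and the x86 Linux /proc/cpuinfo layout is assumed.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]
# Succeeds only if every "flags" line in the file lists FLAG as a whole word.
# Hypothetical helper; the file argument defaults to /proc/cpuinfo.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    [ -r "$file" ] || return 1
    # Number of cores reporting a flags line at all.
    total=$(grep -c '^flags' "$file" || true)
    # Number of those lines that contain FLAG as a whole word.
    with_flag=$(grep '^flags' "$file" | grep -cw -- "$flag" || true)
    [ "$total" -gt 0 ] && [ "$total" -eq "$with_flag" ]
}
```

Calling has_cpu_flag avx with no second argument checks the running machine; a non-zero exit status suggests staying on MongoDB 4.4 when using the upstream packages.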

qq7 commented 2 years ago

Apparently, I have AVX available with my CPU, so I couldn't notice this while testing.

I don't think that 5.0 is conceptually different from 4.4, although there were code compatibility changes. So, I think that sticking with 4.4 could actually be better for now, especially for those who would upgrade from 16.0.

Maybe there's a non-AVX-compatible official build released later on?

JedMeister commented 2 years ago

> Apparently, I have AVX available with my CPU, so I couldn't notice this while testing.

Totally understandable mate.

Actually, I've just had a closer look and realised that AVX has been around for quite a while and is pretty common (I was under the impression that it was newish). It turns out that my 10 year old laptop supports AVX. Just not my (much newer, but low power Atom based) server...

> I think that sticking with 4.4 could actually be better for now, especially for those who would upgrade from 16.0.

Yep, I think that might be best for v17.0.

> Maybe there's a non-AVX-compatible official build released later on?

Maybe. But even if there isn't, we will move to it at some point (maybe v18?). And I'll either need to just test it on my laptop, or get a new server! :smile:

blastbeng commented 1 year ago

I don't understand why mongo moved to a "model" that requires AVX; this breaks so many installations of mongo.

In my noobish and honest opinion, mongo must have a "cheap, slow and potato" version built without AVX that allows it to run on a potato CPU, like the one in my udoo x86 ultra (a 6-7 year old server).

The udoo x86 ultra is still powerful enough to be used as a home server.

JedMeister commented 1 year ago

We might look at packaging it ourselves (without the AVX requirement). I'm sure it runs a bit better/faster/more efficiently when it uses AVX, but at least then it'd run "everywhere".

OnGle commented 1 year ago

It depends, though, on whether AVX is required because it's being passed via -mavx + optimizations, or because specific assembly/intrinsics are used that require those exact instructions. The former is easy to handle; the latter is not.

JedMeister commented 1 year ago

It's been a while since I researched this, but IIRC the rationale for requiring AVX is that it's more performant, and whilst they're open source, they're very much a business (so they target business customers). I'm not sure if you recall, but the reason why we need to use the upstream package is that Debian dropped support for it once upstream changed the licence (Debian considers the current MongoDB licence "non-free"). We moved to the upstream package because our (looser) position on licensing means it's still open source enough for us (their updated licence discourages 3rd parties from providing MongoDB as SaaS).

I haven't delved any deeper into this recently, but I do hope to look a bit closer sometime soon. I am thinking that building it ourselves (without AVX) could be a workable solution. Although for that to work and remain reliable, we'll need to automate the package building - otherwise it will create undue overhead and risk users being stranded on an insecure version in the future (until I update the package). So that would need to be in place prior to releasing a MongoDB app with a version built by us.

OnGle commented 1 year ago

To expand on the context of my previous comment: if AVX is only required because it's compiled with -mavx + optimizations, we only need to remove -mavx and then it'll "just work", which we could probably automate easily enough.

If it's been done with intrinsics or assembly, then it'd need to be rewritten, either to avoid vectorization entirely or to port the AVX vectorization to the SSE family of extensions. That may not even be possible, depending on which functionality it uses explicitly, and even if it is, it's entirely unrealistic for us to maintain.
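One rough way to tell the two cases apart is to search the source tree for explicit AVX usage. This is only a sketch: the function name find_avx_sources is made up for illustration, and the patterns (the _mm256_ intrinsic prefix, the immintrin.h header, and a literal -mavx flag) are heuristics. They can miss hand-written assembly or other targeting flags, and they can also match comments.

```shell
#!/bin/sh
# find_avx_sources [DIR]
# Lists files under DIR that mention AVX intrinsics (_mm256_*), the
# immintrin.h header, or an explicit -mavx compiler flag.
# Hypothetical helper; a heuristic text search, not a proof either way.
find_avx_sources() {
    dir=${1:-.}
    grep -rlE -- '_mm256_|immintrin\.h|-mavx' "$dir" 2>/dev/null | sort
}
```

If the only hits are build scripts, that points towards the "just a compiler flag" case; hits in C or C++ sources point towards the harder intrinsics case.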

OnGle commented 1 year ago

A bigger issue than AVX is that it requires a newer C compiler & linker than we currently have available. Deferring to the next major release (we'll provide the older MongoDB until then).

GermanAizek commented 1 year ago

@JedMeister the easiest solution is to manually fix SConstruct, changing ['+sandybridge'] to [], or to take my ready-made patch from here and build: https://github.com/GermanAizek/mongodb-without-avx

@blastbeng I agree with you. The strangest part of the mongo maintainers' decision is that the modern Tremont microarchitecture has no AVX instructions, yet it is used in servers and in cellular base stations. https://en.wikipedia.org/wiki/Tremont_(microarchitecture)


JedMeister commented 1 year ago

Thanks for providing a little more info @GermanAizek. Armed with your patch, it looks fairly straightforward to build.

FWIW my (~5yo Supermicro ITX) server has an embedded Atom that doesn't support AVX.

raxetul commented 11 months ago

Which AVX instructions exactly? (group name would be enough, not each instruction) AVX512?

EliasSantiago commented 10 months ago

I used image: bitnami/mongodb:4.4 to solve this problem in my docker-compose.

slonopotamus commented 10 months ago

> Which AVX instructions exactly? (group name would be enough, not each instruction) AVX512?

Definitely not AVX512. MongoDB either needs AVX1 or AVX2. I think it is AVX1 (identified as just avx in /proc/cpuinfo).
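To see which AVX generations a given machine reports, the flags line can be split and filtered. A minimal sketch, assuming the x86 Linux /proc/cpuinfo layout; list_avx_flags is an illustrative name, not an existing command:

```shell
#!/bin/sh
# list_avx_flags [CPUINFO_FILE]
# Prints each AVX-family flag (avx, avx2, avx512f, ...) found on the first
# "flags" line, one per line. Hypothetical helper for illustration.
list_avx_flags() {
    file=${1:-/proc/cpuinfo}
    grep -m1 '^flags' "$file" 2>/dev/null \
        | tr -s ' \t' '\n\n' \
        | grep '^avx' \
        | sort -u
}
```

A line reading just avx corresponds to what is called AVX1 above; no output at all means the CPU reports no AVX support.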

devZer0 commented 6 months ago

why is this still open when it is clear , that AVX is a requirement for mongodb 5.0 ?

https://www.mongodb.com/docs/manual/administration/production-notes/#x86_64

JedMeister commented 6 months ago

@devZer0 - that requirement is only for the package they provide. The code itself has no such requirement and as noted previously in this thread, it's possible to compile from source without AVX support - it's just not clear yet whether that's something we want to take on, or whether we just build the appliance with the upstream package (and inherit the AVX requirement).

FWIW since my last comment I've built a debian mongodb package (without AVX support) of 5.0 from source and whilst it works, it takes hours to build and the binary is pretty big. Poking around I managed to marginally reduce it, but it's still too big to distribute IMO. So at this point, I'm not happy to commit to doing that. But that may change.

OTOH AVX is pretty common, and the upstream mongodb AVX requirement has become more well known (i.e. as you demonstrate most people now think that "it is clear , that AVX is a requirement for mongodb 5.0"). As such, I'm nowhere near as concerned about TurnKey users piling in with bug reports about mongodb not working as I was when I opened this issue.

As also hinted earlier in this thread, part of the issue (why we don't just install from the upstream apt repo already) is that the (low power Supermicro) server I currently use for testing doesn't support AVX - so I can't test mongodb (when installed from upstream) with my current hardware and workflow. I will be adding some higher power hardware soon (not specifically for mongodb - but that will be a bonus), so it's likely that once I do that, we'll just use the upstream package.

Bottom line, we'll probably end up just biting the bullet and including the version (which requires AVX) from the upstream apt repo, but until the issue is resolved, this issue stays open to track it.

Coriolan-Bataille commented 6 months ago

@JedMeister Same issue here. If by any chance you manage to get a binary small enough to distribute, please share it with us! I'm sure there are tons of folks like us around.

blunden commented 4 months ago

@JedMeister Roughly how large did the binary end up being when you compiled it without AVX support? Just curious. 🙂

I'm also running an Atom based server without AVX, so I'm curious about the feasibility of building my own packages, or whether I'm better off staying on mongodb 4.4 for now and moving to newer hardware in the future.

JedMeister commented 4 weeks ago

@blunden - apologies for the slow response. IIRC it was 1GB+. I vaguely recall it being multiple GB, but that may have been my first default build (there are some options you can set to make it skip symbols etc).

If you want to run MongoDB 5.x locally, are happy to compile it, are not limited by disk space and are happy to have it take a day +/- to compile, then it's probably fine. I can't speak for MongoDB versions newer than 5 though.

blunden commented 4 weeks ago

@JedMeister Wow, ok. That's a lot larger than I expected. 🙂

I guess I'll stay on 4.4 for now and eventually replace the current hardware.

GermanAizek commented 4 weeks ago

@JedMeister @blunden it took me less than 2 hours to build, but I have 4 Samsung SSDs in RAID 0, which speeds up compilation of small files (4 KB, 8 KB, 16 KB, etc.) by about 4 times, and a two-socket E5-2699 v4 configuration. I can create a repo with compiled binaries for different MongoDB versions from 5.x to 6.x, but the repository would be too large. I don't remember what the maximum size is for a public repository.

GermanAizek commented 4 weeks ago

As for reducing the size of the binaries: I'm experimenting with compile flags, but I haven't found a way to reduce it yet. Maybe the sources themselves need to be changed.