WebAssembly / wasi-libc

WASI libc implementation for WebAssembly
https://wasi.dev

dlmalloc has license incompatible with Fedora policies #319

Open khardix opened 2 years ago

khardix commented 2 years ago

Hi all, this is an inquiry more than an issue: Is there any way to use this libc without dlmalloc?

Recently, Fedora removed approval for CC0-licensed code, generally due to possible patent trouble with such code. I do not really think that will ever happen here (although better safe than sorry); however, I would still like to comply with the policy when packaging.

So far I have been unsuccessful in trying to build wasi-libc with any of the included musl malloc implementations, and I'm not sure that's possible without having mmap.

What can be done? Is there any other malloc implementation that could be used? Can I use the mmap-emulation code and some of the musl malloc(s) on top of it, or is that a bad idea?

Any help greatly appreciated.

TerrorJack commented 2 years ago

mimalloc also claims to work with wasm32. I suppose it's possible to vendor mimalloc in this tree and have some degree of testing, and there's preliminary work on allowing building without dlmalloc.

sunfishcode commented 2 years ago

The mmap emulation code works by calling malloc, so it unfortunately won't work for implementing a malloc.

As @TerrorJack mentions, there is a build parameter for changing the malloc implementation, though the current options are dlmalloc or none.

mimalloc is a possibility. We could certainly add that to the wasi-libc tree as an option, if someone wanted to port it. It'd also be something we could consider making the default allocator, though we'd want code size and performance comparisons. Also, I am concerned about the fact that mimalloc uses spin-wait loops for some things, though I haven't studied it in depth.

Another option would be to ask Fedora for an exception for dlmalloc. It's used in a lot of places.

kripken commented 2 years ago

Another option is emmalloc. Like dlmalloc it uses sbrk, and it's used in wasm today in emscripten as a drop-in dlmalloc replacement. It should just work in wasi-libc (as long as you don't #define the emscripten verbose logging code). License is MIT (compatible with wasi-libc).

emmalloc is, like wee_alloc, more focused on size than speed, so it's not as fast as dlmalloc on heavy malloc benchmarks. But in general usage it works very well in our experience, and it is significantly smaller than dlmalloc, about 1/3 the size (which is often significant in small programs).

As an additional benefit this would be a nice step towards convergence of wasi-libc and emscripten's libc code.
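To make the sbrk point concrete, here is a minimal sketch of how an allocator can sit on top of `sbrk` alone, with no `mmap`. It is a toy bump allocator that never frees, not emmalloc's or dlmalloc's actual design. The names `toy_alloc`, `TOY_CHUNK`, and `TOY_ALIGN` are hypothetical, and growing in 64 KiB steps is an assumption chosen to match the WebAssembly page size.

```c
#define _DEFAULT_SOURCE /* feature macro so glibc/musl headers declare sbrk() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define TOY_CHUNK ((size_t)65536) /* assumption: grow in wasm-page-sized steps */
#define TOY_ALIGN ((size_t)16)    /* assumption: 16-byte alignment for all blocks */

static char *toy_cur; /* next free byte in the current chunk */
static char *toy_end; /* one past the end of the current chunk */

/* Hypothetical toy allocator: bump-allocates from memory obtained via sbrk().
 * It never frees; it only illustrates that sbrk() is all such a design needs. */
static void *toy_alloc(size_t size) {
    if (size == 0 || size > SIZE_MAX / 4) return NULL;
    size = (size + TOY_ALIGN - 1) & ~(TOY_ALIGN - 1);

    if (toy_cur == NULL || (size_t)(toy_end - toy_cur) < size) {
        /* Ask for whole chunks, with slack for aligning the start address. */
        size_t grow = (size + TOY_ALIGN + TOY_CHUNK - 1) / TOY_CHUNK * TOY_CHUNK;
        void *base = sbrk((intptr_t)grow);
        if (base == (void *)-1) return NULL; /* out of memory */
        uintptr_t start =
            ((uintptr_t)base + TOY_ALIGN - 1) & ~(uintptr_t)(TOY_ALIGN - 1);
        toy_cur = (char *)start;
        toy_end = (char *)base + grow;
    }

    void *p = toy_cur;
    toy_cur += size;
    assert((uintptr_t)p % TOY_ALIGN == 0);
    return p;
}
```

A real allocator adds free lists, coalescing, and reuse of the tail of each chunk; the point here is only that the memory source is `sbrk`.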

fweimer-rh commented 2 years ago

Older versions of dlmalloc also come with an actual PD dedication.

khardix commented 2 years ago

@sunfishcode

> Another option would be to ask Fedora for an exception for dlmalloc. It's used in a lot of places.

I can certainly do that, but that assumes I asked the project their opinions about changing the implementation and got rejected, told it's a WIP etc. It was one of the reasons for opening this issue ;) However, seeing now that there are options, I would like to explore them more.

@kripken

> Another option is emmalloc. Like dlmalloc it uses sbrk, and it's used in wasm today in emscripten as a drop-in dlmalloc replacement.
>
> As an additional benefit this would be a nice step towards convergence of wasi-libc and emscripten's libc code.

Interesting, and I'll try to import this to my fork and play around with it. No experience with benchmarking mallocs etc., so an actual PR may be a bit off, but let's try it.

@fweimer-rh

> Older versions of dlmalloc also come with an actual PD dedication.

Also an option, but I would rather not downgrade… I tried to reach Doug Lea with the problem description in order to see if it can be re-dedicated, but got no answer so far.

bjorn3 commented 2 years ago

> emmalloc is, like wee_alloc, more focused on size than speed, so it's not as fast as dlmalloc on heavy malloc benchmarks. But in general usage it works very well in our experience, and it is significantly smaller than dlmalloc, about 1/3 the size (which is often significant in small programs).

wee_alloc has memory fragmentation issues and has a relatively high lower bound on memory usage as I understand it. See for example https://github.com/rustwasm/wee_alloc/issues/85.

Edit: you were suggesting emmalloc, not wee_alloc. My bad.

kripken commented 2 years ago

@bjorn3 That looks like a specific bug in wee_alloc, yeah.

We're not aware of any significant bugs on emmalloc atm. It is used in production in both large and small applications, using large and small amounts of memory.

But it is true that emmalloc (and other small allocators like wee_alloc) is definitely not as fast as dlmalloc on malloc-intensive benchmarks - dlmalloc has a lot of nice optimizations particularly for allocating huge numbers of small objects of odd sizes. Those situations are somewhat rare, though, in real-world code, in my experience, likely since developers generally try to avoid tons of malloc calls.
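For anyone who wants to check that trade-off on their own workload, the pattern described above (huge numbers of small objects of odd sizes) can be approximated with a micro-benchmark along these lines. This is a rough sketch, not a benchmark used by emscripten or wasi-libc. The function name `churn_small_odd`, the 1 to 127 byte size range, and the fixed-seed generator are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical micro-benchmark body: repeatedly replaces each of `slots` live
 * allocations with a new block of a pseudo-random odd size (1..127 bytes).
 * Returns the number of successful malloc() calls. */
static size_t churn_small_odd(size_t slots, size_t rounds) {
    void **live = calloc(slots, sizeof *live);
    if (live == NULL) return 0;

    size_t ok = 0;
    unsigned state = 12345u; /* fixed seed so runs are comparable */

    for (size_t r = 0; r < rounds; r++) {
        for (size_t i = 0; i < slots; i++) {
            state = state * 1103515245u + 12345u; /* simple LCG step */
            size_t size = ((state >> 16) % 64u) * 2u + 1u;
            assert(size % 2 == 1 && size <= 127);

            free(live[i]); /* free(NULL) is a no-op on the first round */
            live[i] = malloc(size);
            if (live[i] != NULL) {
                memset(live[i], 0xA5, size); /* touch the memory */
                ok++;
            }
        }
    }

    for (size_t i = 0; i < slots; i++) free(live[i]);
    free(live);
    return ok;
}
```

Building the same program once per allocator and timing a call such as `churn_small_odd(100000, 50)` with whatever timer the runtime provides gives a first impression. Comparing the resulting binary sizes covers the other half of the trade-off.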

sunfishcode commented 1 year ago

As an update here, emmalloc is now in wasi-libc (#340). It currently requires a wasi-libc build-time option, MALLOC_IMPL=emmalloc.

jedisct1 commented 1 year ago

We switched to emmalloc as the default allocator in zig cc. The dlmalloc code is not shipped any more.

bjorn3 commented 1 year ago

@khardix IANAL, but dlmalloc is more than 20 years old (the earliest version I could find is v2.5 from 1993, or in other words 29 years old). This means that any patents that could apply to dlmalloc have since expired, and any patents filed after that could be invalidated using dlmalloc as prior art, right? Or am I misunderstanding how patent law works? Now, there have been releases that are less than 20 years old, but using any 20+ year old release should be fine with respect to patents, right?

khardix commented 1 year ago

@bjorn3 Unfortunately, IANAL also :) The patents may very well be expired and even dlmalloc be fine. However, from the Fedora PoV, this is still inclusion of new (IOW not previously included in Fedora) code with a problematic license and would need a review and exception from people that actually are lawyers.

Given the age of the code, I would probably get the exception – note, however, that I would not have known about the expiration possibility if I had not opened this issue :) Now, with emmalloc ported, this discussion is probably just academic – I can (and will) use the malloc with a non-problematic license.

With that, I'm now considering my issue to be addressed, so feel free to mark it resolved. Thanks for taking the time to do the port!

fweimer-rh commented 1 year ago

> IANAL, but dlmalloc is more than 20 years old (the earliest version I could find is v2.5 from 1993, or in other words 29 years old).

There have been algorithm changes in dlmalloc much later than that.

The issue is a formal matter; we do not actually expect any patent enforcement action from this particular upstream. But we don't really want to divide upstreams into those who can use slightly out-of-policy licenses and those who cannot.