Closed thorsteneb closed 9 months ago
I'm testing a fix for this now, but it looks like lmdb is hitting another error after this:
mtest.c:113: mdb_cursor_get(cursor, &key, &data, MDB_LAST): MDB_NOTFOUND: No matching key/data pair found
I'm trying to determine if this is related to my "fix" or another issue.
The major use case for "mmap's problems" is BerkeleyDB. Instead of deep diving lmdb
(or boltdb
#2304 #3162 or foo-db
) you may consider:
sudo apt install tcl tcl-dev # plus the usual suspects
mkdir bdb && cd bdb
wget http://download.oracle.com/berkeley-db/db-5.3.21.tar.gz
tar xf db-5.3.21.tar.gz
cd db-5.3.21/build_unix
../dist/configure --prefix=/usr/local --enable-compat185 --enable-dbm \
--disable-static --enable-test --enable-debug --enable-cxx \
--enable-tcl --with-tcl=/usr/lib
make -j4
tclsh # in tickle shell after this, most unfortunately
source ../test/tcl/test.tcl
# run_std # worthy goal but it's a big test
# r btree # more reasonable but still a big ask
test042 btree # first random fail (segfault) I hit on WSL, fine on Real Linux
There is some information on running the individual tests here. If you get the BerkeleyDB tests to pass it will squash #3149 #2627 #2852 #2871 (and everything else in the BDB universe). The other foo-db
will almost certainly follow suit.
I guess this is related to #902 ? It sure would be nice to see the mmap issues finally fixed...
@drujd Vote on user voice, get your friends to vote. Right now, there's not exactly an outcry to get BerkeleyDB to work, or any other project relying on these mmap features.
The issue has much broader implications, I have voted, but I do not think this is the right thing to vote on. The dysfunct mmap functionality breaks everything that depends on the correct mmap behavior, including bup or building Android from source. Personally I don't really care about BerkeleyDB.
Also, I really think this should be considered a bug, not a feature request.
Could be a combination of bugs and missing features. I don’t understand WSL even nearly well enough to have an informed opinion on that.
I think it’s very likely that supporting Berkeley DB will also “knock down” a range of other related issues. Any DB with the same pedigree will work.
For Android source development - does that still require 32-bit elf support? If so, the uservoice for that is here: https://wpdev.uservoice.com/forums/266908-command-prompt-console-windows-subsystem-for-l/suggestions/13377507-please-enable-wsl-to-run-32-bit-elf-binaries
@yorickdowne - The distinction of bug vs feature request is fairly arbitrary, we call things bugs that are simple fixes and a feature request would take a lot longer to implement.
I've looked into this and #902 and they share the commonality that both rely on functionality that the Windows kernel does not support which makes the scope of this fix a bit larger.
The dysfunct mmap functionality breaks everything that depends on the correct mmap behavior, including bup or building Android from source.
At base the problem is stated in the SetEndOfFile
function....
If CreateFileMapping is called to create a file mapping object for hFile, UnmapViewOfFile must be called first to unmap all views and call CloseHandle to close the file mapping object before you can call SetEndOfFile.
This has been ~dysfunct~ differently abled for 25 years. Repeating the sentiment here, and reiterating that this guy was right (he just framed the ask incorrectly), fixing this In Windows means you don't need patches like this for lmdb
on Windows (and only for Windows).
The reason Lineage builds don't work is because no one does Lineage builds on Windows. Then someone does an upstream patch like this (which note has no #ifdef _WIN32
because it doesn't need one) and you never hear of it again. Academically, if you look closely at the lmdb
work-around, it could be done in a way that doesn't require an #ifdef
too. The Linux side of the #ifdef
could set the file size to sizehi
instead of 0
and close the handle too. Maybe someone will.
Also, I really think this should be considered a bug, not a feature request.
Alright. But by the chosen definition of not-dysfunct, it is a Windows bug contrast lack of Windows feature too. In extremis, ditto ha ha but serious.
Wow. That rabbit hole went deep fast. Thanks, @therealkenc .
fixing this In Windows
That'd be amazing. Your theory that WSL works because 23 years of Cygwin workarounds means not all code triggers https://github.com/Microsoft/WSL/issues/1529 is -- oddly compelling.
You are saying that allowing SetEndOfFile with handles open would then lend itself to also allowing other actions with handles open, such as renaming, and deleting. Do I get that right?
@benhillis , thanks for looking into it! From your perspective, are allowing SetEndOfFile/delete/rename while handles are open and allowing mapping of a zero-length-file things to be tackled in Windows, or through some form of layer in WSL?
allowing SetEndOfFile with handles open would then lend itself to also allowing other actions with handles open, such as renaming, and deleting. Do I get that right?
Nah I didn't mean to imply that in any narrow sense. Better to call the two issues (#1529 and mmap's problems) analogous. No magic bullet here. [ed] To illustrate the bigger picture, see #2915 (message) which doesn't conflate with Windows' "pinned handles" xplat problem. You could fix it in WSL, but that doesn't help win32 NodeJS being broken. How do you eat an elephant? One bite at a time.
I've looked into this and #902 and they share the commonality that both rely on functionality that the Windows kernel does not support which makes the scope of this fix a bit larger.
@benhillis Is it realistic to tackle this and #902 as part of 20H1?
Edit: this statement is not correct. I can not reproduce this with https://github.com/WhitewaterFoundry/WLE. I'm going to try to do more bisecting here to see if we can figure out what the difference is between working RPM and not working RPM distributions are. My apologies (and thanks to @DarthSpock for pointing this out)
Edit 2: we found that different versions of RHEL have problems. What is interesting is that we were originally trying on an older version, but the folks at https://github.com/WhitewaterFoundry/WLE/issues/20 have reproduced it on a higher version. If I get some time i'll see if I can bisect out the OS versions to prove if it really is that (i.e. a change in RPM or one of its dependencies such as bdb),
--
This behavior has the effect of preventing people running most RPM based distributions such as RHEL/Fedora/CentOS/Scientific Linux under WSL.
Have you tried WSLFedoraRemix?
I would be mildly curious to know whether #2852 (rpm ‑‑initdb
) reproduces on WSLFedoraRemix. That issue never had usable reproduction steps for whatever it was the OP was doing on their Fedora (for some value of Fedora), but rpm --initdb
works on Ubuntu natch. Curious, but not curious enough to actually install WSLFedoraRemix and look. Maybe it's been patched like Alex suggests. Maybe not. Dunno.
It affects WLinux Enterprise distributions (RHEL, SL, CentOS) as well. See https://github.com/WhitewaterFoundry/WLE/issues/20.
@benhillis
mtest.c:113: mdb_cursor_get(cursor, &key, &data, MDB_LAST): MDB_NOTFOUND: No matching key/data pair found
I'm trying to determine if this is related to my "fix" or another issue.
Ben, at the time back in August 2018 did you manage to track that fail any? Understandable if you didn't, but whatever you found (if anything) might save Hayden some effort.
Hey ! It’s starting to makes really long for not being able to update Fedora or install any other packages…
Was hoping for some joy on the OP repro in 18890, but 'fraid not.
233 execve("/usr/bin/make", ["make", "test"], 0x7fffcece5ae0 /* 22 vars */) = 0
[blah blah...]
238 openat(AT_FDCWD, "./testdb/data.mdb", O_RDWR|O_CREAT, 0664) = 4
238 fstatfs(4, {f_type=0x53464846, f_bsize=4096, f_blocks=125688063, f_bfree=84551925, f_bavail=84551925, f_files=999, f_ffree=1000000, f_fsid={val=[1, 0]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_NOATIME}) = 0
238 pread64(4, "", 152, 0) = 0
238 mmap(NULL, 10485760, PROT_READ, MAP_SHARED, 4, 0) = -1 ENOEXEC (Exec format error)
238 munmap(0x7ffb5ddd0000, 2101248) = 0
I'm testing a fix for this now, but it looks like lmdb is hitting another error after this: mtest.c:113: mdb_cursor_get(cursor, &key, &data, MDB_LAST): MDB_NOTFOUND: No matching key/data pair found
Didn't make it that far. If that initial Aug 2018 fix attempt is still in the can, and doesn't break anything obvious, it might be worth a git push
to see if it gets any further.
This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.
Thank you!
ver
at a Windows Command Prompt)April Update, 10.0.17134.191 Also tested on 19H1, build 18204
The package "liblmdb" does not work on WSL. This impacts any projects that use lmdb. I've narrowed this down to mmap(NULL,...) on a 0-length file failing with ENOEXEC. This is likely related to #658 , yet distinct enough that the @therealkenc suggested I file a new issue, to avoid confusion.
To reproduce with lmdb:
git clone https://github.com/LMDB/lmdb.git cd lmdb/libraries/liblmdb make test
mtest.c:50: mdb_env_open(env, "./testdb", MDB_FIXEDMAP , 0664): Exec format error Aborted (core dumped) Makefile:61: recipe for target 'test' failed make: *** [test] Error 134
strace shows:
mmap(NULL, 10485760, PROT_READ, MAP_SHARED, 4, 0) = -1 ENOEXEC (Exec format error)
Just before it cores.
Sample code showing just the ENOEXEC, but not going on long enough after that to core:
strace of the sample code: mmap-example.strace.txt
strace of liblmdb itself: liblmdb.strace.txt
User voice for this issue lives here: https://wpdev.uservoice.com/forums/266908-command-prompt-console-windows-subsystem-for-l/suggestions/35053207-support-oracle-berkeley-db-and-lmdb