Open CetinSert opened 3 years ago
To investigate this, the first thing I'd do is find out if this is fragmentation or leaking. Fragmentation may be caused when doing many allocations and frees, after which it cannot find contiguous chunks and so ends up growing the total size of memory. dlmalloc mentions that aligned memory may be more vulnerable to that,
To verify that, I'd instrument malloc
, memalign
, and free
(and also realloc
etc. if used). Just counting the bytes allocated and freed. One way to instrument it is to compile with WASM=0
and add some code in JS. Or, use mallinfo()
, see https://emscripten.org/docs/porting/Debugging.html#memory (the memory profiler mentioned there can also help).
If you find that the codebase you are compiling allocates more than it frees, then it would be a leak in the code. If instead it frees everything it allocates, then I'd guess the problem is fragmentation.
If it is fragmentation in fact, then the codebase might benefit from reusing buffers, using an arena, etc.
@kripken โ can increasing the wasmMemory
page size from 64KB
to something higher also help with fragmentation?
I will answer myself: this is not even possible as 64KB
is a constant โ https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly/Memory/Memory#parameters.
To verify that, I'd instrument
malloc
,memalign
, andfree
(and alsorealloc
etc. if used). Just counting the bytes allocated and freed. One way to instrument it is to compile withWASM=0
...
Instrumentation seemed easy even without WASM=0
.
(() => {
const ins = (f, n) => (...a) => { const v = f(...a); console.warn(n, f, v, '<-', a); return v; };
const { malloc, memalign, memset, free, realloc } = Module.asm;
Module.asm = { ...Module.asm,
malloc: ins(malloc, 'malloc '),
memalign: ins(memalign, 'memalign'),
memset: ins(memset, 'memset '),
free: ins(free, 'free '),
realloc: ins(realloc, 'realloc ')
};
})();
but the instrumented functions only get called when the now infamous memalign
โฏ
trio triggers. So we get limited improved vision.
@kripken โ updated the original issue with more data from the instrumentation (which does trigger repeatedly on every time as it is a new known bad case and bad cases trigger the memalign
โฏ
pattern in a way (at a layer) that also triggers the instrumentation), please take a look at the new comment linked below to see if it reminds you of any pattern you or other developers of emscripten
recognize.
https://github.com/paulo-coutinho/pdfium-lib/issues/33#issuecomment-863587472
Instrumenting Module.asm
like that will just affect calls that go through JS. Direct calls inside the wasm would not be noticed.
If that is not a problem, then the logging does show a leak in the codebase, as there is no free for the third memalign. To investigate that you could get a stack trace for that allocation, then try to figure out where the free should be, etc.
If it is fragmentation in fact, then the codebase might benefit from reusing buffers, using an arena, etc.
I assume there is no way to compact / vacuum the memory.
To investigate that you could get a stack trace for that allocation, then try to figure out where the free should be, etc.
memalign
โฏ
calls to test if codebase still works?memalign(65535, โฏ)
? Should 65535
not be a small number that is more like 4
because WASM is 32-bit
?If that is not a problem, then the logging does show a leak in the codebase, as there is no free for the third memalign. To investigate that you could get a stack trace for that allocation, then try to figure out where the free should be, etc.
The original C++ codebase was checked and found to not suffer from memory leaks:
Here is how em++
is being called: https://github.com/paulo-coutinho/pdfium-lib/blob/c95bc4a72502251bd3ee770d57907812945c6a2d/modules/wasm.py#L674.
Here is how the WASM builds are being made by the repository owner. PDFium
build steps ๐๐ป (dockerized short version)
You can use docker to build and test on local machine before deploy.
Build the image with command:
docker build -t pdfium-wasm -f docker/wasm/Dockerfile docker/wasm
Test with command:
docker run -v ${PWD}:/app -it pdfium-wasm echo "test"
Now you can execute any command with pattern:
docker run -v ${PWD}:/app -it pdfium-wasm [COMMAND]
Obs: This is the recommended way to build and is used on CD server.
You can test the sample using commands:
โฎ
docker run -v ${PWD}:/app -it pdfium-wasm python3 make.py run test-wasm
python3 -m http.server --directory sample-wasm/build
You can test the sample using commands:
โฎ
docker run -v ${PWD}:/app -it pdfium-wasm python3 make.py run test-wasm
docker run -v ${PWD}:/app -it pdfium-wasm node sample-wasm/build/index.js
You can test pdfium on web browser here:
Further improved the instrumentation in https://github.com/paulo-coutinho/pdfium-lib/issues/33#issuecomment-863707842 to keep counts and added screenshots.
(() => {
M = new Map();
M.sum = () => [...M.values()].reduce((a, c) => a + c, 0);
M.cnt = { malloc: 0, free: 0, sum: 0 };
M.fn = { malloc: 0, memalign: 1 };
M.add = (n, a) => { M. set(a, n); M.cnt.malloc++; M.cnt.sum += n; };
M.sub = a => { M.delete(a ); M.cnt.free++; };
const ins = (f, n) => (...a) => {
const v = f(...a), d = a[M.fn[n]] ?? 0;
if (d > 0) M.add(d, v);
if (n === 'free') M.sub(a[0]);
console.warn(n.padEnd(8), f, `${v}`.padStart(20), '<-', (a.length == 1 ? [...a, '-'] : a).map(a => `${a}`.padStart(20)).join(' '), '|', `${wasmMemory.buffer.byteLength}`.padStart(20), '|', `${M.sum()}`.padStart(20), 'M', M.size); return v;
};
const { malloc, memalign, memset, free, realloc } = Module.asm;
Module.asm = { ...Module.asm,
malloc: ins(malloc, 'malloc'),
memalign: ins(memalign, 'memalign'/*.length == 8*/),
//memset: ins(memset, 'memset'),
free: ins(free, 'free'),
realloc: ins(realloc, 'realloc')
};
})();
Bitmap_CreateEx
cause repeated memalign
โฏ
calls?495
ร 495
ร 4
triggers memalign
โฏ
reliably.Bitmap_Destroy(Bitmap_CreateEx(โฏ))
calls were tested in C++ by me and Google as linked above and were found to not leak in that environment at any size. (see https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-863656321)em++
generating/placing the memalign
โฏ
calls into the compiled WASM binary?-O3
on which the manual page of em++
reads -O3 As -O2, plus dangerous optimizations that may break the generated
code! This adds
-s FORCE_ALIGNED_MEMORY=1 -s DOUBLE_MODE=0 -s PRECISE_I64_MATH=0
--closure 1 --llvm-lto 1
This is not recommended at all. A better idea is to try each of
these separately on top of -O2 to see what works. See the wiki and
src/settings.js (for the -s options) for more information.
Testing with -O2
seems to not help. Re-confirming with an independent build soon ...
I assume there is no way to compact / vacuum the memory.
Correct. That is a general issue with languages using linear memory. On native platforms this is also a problem, mostly on 32-bit (as on 64-bit virtual memory can be used to work around it).
You can get a stack trace programmatically by console.log(new Error().stack)
for example. Or you can get a trace running in the debugger.
em++ may call memalign in some cases for internal reasons (like if you call mmap
). The stack trace should help figure that out.
You can get a stack trace programmatically by console.log(new Error().stack) for example. Or you can get a trace running in the debugger.
Here is the stack trace.
Thanks, ok, then the allocation happens here in mmap:
and the free should happen here in munmap:
I don't see anything obviously wrong there, and in our test suite the tests pass (and I verified we check that free is called on the right pointer).
Is it possible the codebase you are compiling does not call munmap on what it mmaps?
Is it possible the codebase you are compiling does not call
munmap
on what itmmap
s?
checking now ... (as a note, other target / compiler pairs have no such issues as tested twice here: https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-863656321)
I have also tried other malloc
s emscripten
ships with ... nothing helped there either. (TODO: add data)
Also tried all sanitizer, debug options and the only leaks that get detected is when we run out of 2GB
WASM memory a mere 48
bytes (TODO: add data).
Is it possible the codebase you are compiling does not call
munmap
on what itmmap
s?
Found the culprit I believe:
munmap
on what it mmap
s โ๏ธ (see below)munmap
sometimes (with a regular pattern) fails to call free
โ (see below)munmap
is getting called with a shorter length than how much was mmap
ed to that address.Please review this link where I have done a lot to get as much ready-to-see info out as possible!
https://pdf.ist/em/14459/ (ready to produce the below screenshot)
1
time
_PDFium_Init(); clear();
3
times
_PDFium_Init(); var w = 496, h = 496; for (let i = 0; i < 1; i++) _FPDFBitmap_Destroy(_FPDFBitmap_CreateEx(w, h, 4)); [ wasmMemory, wasmMemory.buffer.byteLength ] // โ
โ๏ธ munmap
calls free
โ๏ธ munmap
calls free
โ munmap
does not call free
Is it possible the codebase you are compiling does not call
munmap
on what itmmap
s?
No, at all 3
times, munmap
is called with the output from mmap
(visible in the screenshot).
โฎ
if (len === info.len) { // โ this check fails on the 3rd call
โฎ
https://pdf.ist/em/14459/ โ further instrumented for your convenience.
function syscallMmap2(addr, len, prot, flags, fd, off) {
โฎ
console.warn('๐ต๐ปโโ๏ธ๐', 'mmap2 ', { addr, len, prot, fd, off }, "โ", ptr, "โ", SYSCALLS.mappings[ptr]);
return ptr;
}
function syscallMunmap(addr, len) {
if ((addr | 0) === -1 || len === 0) {
return -28;
}
var info = SYSCALLS.mappings[addr];
console.warn('๐ต๐ปโโ๏ธ๐', 'munmap', { addr, len }, "โ", { ...info }, { 'len ===': len === info?.len ? 'โ๏ธ' : 'โ' });
โฎ
}
Is it not safe to change this
if (len === info.len)
to this
if (len <= info.len)
assuming emscripten
bookkeeps the memory block from info.malloc
to info.malloc
+
info.len
as reserved anyway?
References
https://pubs.opengroup.org/onlinepubs/9699919799/ (๐ munmap
, mmap
)
https://pubs.opengroup.org/onlinepubs/9699919799/functions/munmap.html
Is the code in question trying to partial unmapping of anonymous regions?
We don't support that, and I don't think there is any easy way to fake it. We should really be asserting in that case. We also don't support mprotect at all. We should assert there too.
Is the code in question doing anonymous mapping or file-backed mappings?
If its just anonymous mappings then I best thing to do is to find a way to tell the upstream code the we don't really support mmap at all. Is there perhaps a HAS_MMAP macro that you can turn off?
@sbc100 โ thank you for the prompt response. I am only blackbox probing the codebase for a personal project.
@paulo-coutinho โ can you please review these issues and discuss them with the PDFium
group like you have done in the past here, https://github.com/paulo-coutinho/pdfium-lib/issues/22 for a way out?
@paulo-coutinho โ https://groups.google.com/g/pdfium/c/cDkppTy9mEc (posted here with references to related issues)
Ok, i will check it.
@kripken does that mean the latest PDFium
can no longer be compiled with em++
without source code modifications?
It does seem like you were involved with a commercial project compiling PDFium
with emscripten
(https://pspdfkit.com/blog/2019/webassembly-emscripten-chat-alon-zakai/). Would you mind sharing with the open source community here what you already know or could learn from that company by asking them so that their limited commercial interests are not the only thing that benefits from the combined power of PDFium
and emscripten
?
(Whether PDFium
actually does partial munmap
when built with other compilers for other targets is not yet confirmed. I have asked one of their developers if they could answer that question for us. Should that not result in a clear answer, I will try instrumenting the C++ codebase by whatever means necessary to get the answer. (Not sure of LD_LIBRARY_PATH
will be of value for munmap
, although I have already used it to replace malloc
, free
in a Linux system successfully.)
For reference, per https://groups.google.com/g/pdfium/c/cDkppTy9mEc/m/2_4sjSdqAwAJ, PDFium
calls both munmap
and mmap
only in 1
file (re-confirmed with source code search tools).
munmap
: https://pdfium.googlesource.com/pdfium.git/+/refs/heads/chromium/4543/third_party/base/allocator/partition_allocator/page_allocator_internals_posix.h#129mmap
: https://pdfium.googlesource.com/pdfium.git/+/refs/heads/chromium/4543/third_party/base/allocator/partition_allocator/page_allocator_internals_posix.h#84The corresponding Windows implementation in โฏ_win.h
โฏ_posix.h
implementation does actually perform partial munmap
semscripten
void* TrimMappingInternal(โฏ
โฎ
// We cannot resize the allocation run. Free it and retry at the aligned
// address within the freed range.
โฎ
}
@kripken โ do you see anything else in https://pdfium.googlesource.com/pdfium.git/+/refs/heads/chromium/4543/third_party/base/allocator/partition_allocator/page_allocator_internals_posix.h that emscripten
/em++
does not implement that we should be aware of?
(It is the only file with posix
in the name in PDFium
code base (except for 2
other files related to file access which do not apply in WASM builds).)
Reading to code briefly it seems like the current codebase has has issues under emscirpten both before and after #14459.
Please tell my if my understanding not correct here: PDFium is always trying to allocate memory with 2Mb alignment (kSuperPageSize). To do this it is calling mmap
with a suggested address (arg0 which is the hint) which it aligns to 2Mb. This hint is completely ignored by emscripten so the result will pretty much always fail to be 2Mb aligned. When the result is not 2Mb aligned PDFium will then over-allocates by 2Mb and then try to trim the result by munmap
ing the regions at the start and the end. This munmap
'ing will silently fail on emscripten prior to #14459 and abort after #14459.
However, one key observation is that when the munmap would previously silently fail pdfium can then never free the aligned result. See TrimMappingInternal:
if (pre_slack) {
int res = munmap(base, pre_slack);
CHECK(!res);
ret = reinterpret_cast<char*>(base) + pre_slack;
}
When mmunamp silently fails here PDFium adjusts the resulting pointer to form a new pointer that is now un-freeable. This is no longer a pointer that can be passed to munmap in the future since the underlying malloc implementation is not tracking it.
I think the ultimate solution here is to patch PDFium such that under emscripten is simply calls posix_memalign(kSuperPageSize, ...);
to get it aligned memory.
I think the ultimate solution here is to patch PDFium such that under emscripten is simply calls
posix_memalign(kSuperPageSize, ...);
to get it aligned memory.
@sbc100 โ umm, where in which file?
References
kSuperPageSize
โ
https://pdfium.googlesource.com/pdfium.git/+/refs/heads/chromium/4543/third_party/base/allocator/partition_allocator/partition_alloc_constants.h
I am trying replacing
void* TrimMappingInternal(void* base,
size_t base_length,
size_t trim_length,
PageAccessibilityConfiguration accessibility,
bool commit,
size_t pre_slack,
size_t post_slack) {
void* ret = base;
// We can resize the allocation run. Release unneeded memory before and after
// the aligned range.
if (pre_slack) {
int res = munmap(base, pre_slack);
CHECK(!res);
ret = reinterpret_cast<char*>(base) + pre_slack;
}
if (post_slack) {
int res = munmap(reinterpret_cast<char*>(ret) + trim_length, post_slack);
CHECK(!res);
}
return ret;
}
with
void* TrimMappingInternal(void* base,
size_t base_length,
size_t trim_length,
PageAccessibilityConfiguration accessibility,
bool commit,
size_t pre_slack,
size_t post_slack) {
void* ret = base;
if (pre_slack || post_slack) {
// We cannot resize the allocation run. Free it and retry at the aligned
// address within the freed range.
ret = reinterpret_cast<char*>(base) + pre_slack;
FreePages(base, base_length);
ret = SystemAllocPages(ret, trim_length, accessibility, PageTag::kChromium,
commit);
}
return ret;
}
as we speak.
Seems to help in the most basic cases such as;
_PDFium_Init(); var s = 496; for (let i = 0; i < 1; i++) FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(s, s, 4)); [ wasmMemory.buffer.byteLength ]
but fails when more stuff is happening such as when actually rendering pages:
it tends to hang when attempting a new render after the above error.
I think that the entire TrimMappingInternal
and TrimMapping
should not be included in the emscripten build. Trimming an existing mapping is not really possible with a malloc-backed mmap. I think we should instead just use posix_memalign
to get an allocation that is already aligned (as opposed to using the hint
parameter to mmap
which seems like strictly less precise way to express the intent to the underlying allocator).
In other words, on systems which don't have real mmap
support lets avoid mmap
completely.
@sbc100 โ That would suggest we would need a new file like page_allocator_internals_em.h
. As my C experience is not up to speed to create one, can you or anyone else come up with the necessary changes / a complete example? If you want compensation because this is beyond the scope of emscripten
, please write to me at work@pdf.ist
and we can discuss the conditions. (The resulting work can from my point of view be licensed as BSD3
(the same license the rest of PDFium
uses).)
I don't think I will have the time to add this myself, but I suggest we/you reach out to folks who work on PDFium and see if we can persuade them to add (or at least accept) a no-mmap version of that codepath.
@cetinsert
Thanks for the feedback on that PR. Let's iterate there to find the right solution.
It does seem like you were involved with a commercial project compiling PDFium with emscripten
I think that was just a friendly chat about wasm that I did on their blog. I don't know anything about their codebase.
@kripken โ thank you for the friendly response! I have just learned how to compile the WASM builds without missing a step re-using a stale file and am now testing the โฏ_win.h
approach described here https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-866066268 which was considered worth a try by someone from Google too (https://groups.google.com/g/pdfium/c/cDkppTy9mEc/m/EIK5_FGIAwAJ).
@sbc100 โ trying โฏ_win.h
approach as described here https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-866066268
seems to help in the most basic cases such as;
_PDFium_Init(); var s = 496; for (let i = 0; i < 1; i++) FPDF.Bitmap_Destroy(FPDF.Bitmap_CreateEx(s, s, 4)); [ wasmMemory.buffer.byteLength ]
but fails when more stuff is happening such as when actually rendering pages:
it tends to hang when attempting a new render after the above error.
I don't think that will work. The main reason is that the hint
argument which is the first argument to SystemAllocPages
is ignored under emscripten. As I said, I think you are going to want a more specific, higher level, solution that avoids SystemAllocPages
completely and uses something like posix_memalign
instead. Hopefully this can be done upstream: https://groups.google.com/g/pdfium/c/cDkppTy9mEc.
Any chance of this https://github.com/emscripten-core/emscripten/blob/7192647ca6d3add7291338b735a166fd3409edca/src/library_syscall.js#L282
getting done in emscripten
?
Any chance of this
getting done in
emscripten
?
Since WebAssembly doesn't have access the virtual memory subsystem its hard to see a useful/sensible way to implement this. I suppose these is the possibility of writing a separate page allocator that operates on a different chunk of memory to the malloc heap? But we have no plans to do that. Its also something that could just as well be implemented in user space. i.e. we don't any system primitives that make sense here.
Lei Zhang from PDFium
has signaled willingness in accepting a WASM-only memory management patch upstream: https://groups.google.com/g/pdfium/c/cDkppTy9mEc/m/k976It-QAwAJ
@sbc100 @kripken โ the best I can do (with my active knowledge of C/C++
) is extend this approach https://github.com/emscripten-core/emscripten/issues/14459#issuecomment-866066268 by doing a full translation of โฏ_win.h
and use that for WASM. If you look at the โฏwin.h
linked in that comment, do you still see problems remaining because of hint
or is Windows behaving like WASM in SystemAllocPages
too? Should a different approach be needed altogether, how/where can we find a developer willing to deliver a solution here?
I'm talking to some folks who work on PDFium about getting a upstream patch.
@sbc100 โ hello!
Is anyone making any progress on this? Can you provide us (@paulo-coutinho, @cetinsert) links and contacts where / with whom we can follow up?
Sorry I haven't had any time to look into it. We have approval to make an emscripten-specific version of the page allocator in PDFium but I don't think I will have time to do it myself.
@cetinsert
Do you have knowledge to do it?
@paulo-coutinho โ I have limited experience with memory allocators. The fanciest thing I have ever done is LD_LIBRARY_PATH=
a different malloc
library to replace the glibc
one which had been the cause of incessant segfault
s in a Linux installation that was half-apt upgrade
d; no changes in legacy code, fix working like a charm: no segfault
s ever happening again, no memory corruption either.
It would speed things up if we connect with the right people as any output from me will likely take weeks at best. Perhaps @sbc100 or @kripken could briefly sketch out in pseudo-code or plaintext what needs to be replaced and how so we can give it a try!
@paulo-coutinho โ ffmpeg
was recently ported to WebAssembly (a few months ago); perhaps their community might have had to deal with / get past similar issues.
Hi,
Any solution for this?
Thanks.
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.
Any news on this?
In a project compiled with
emscripten
(em++
) to WASM, calls to a function with some image dimensions keep triggeringrepeatedly on every function call. When the function eventually returns a memory reference, this works as expected when probed with image-related functions for height, width, etc. BUT it cannot be deallocated and leak completely.
Other image dimensions both work and deallocate just fine; calling the function with such good dimensions do NOT keep triggering the above
memalign
โฏ
repeatedly.The behavior persists across
91.0.4472.77
, Firefox89.0
4505
,4542
,4543
2.0.20
,2.0.24
https://github.com/paulo-coutinho/pdfium-lib/commit/d4a08e866b4fe6573447548c863d4dae4050cb18#diff-120a286a77ddf1bc450cb5703c08c05422275bc73616e5a359eb245ec9a923bdL68
Please
Investigation Details ๐ต๐ปโโ๏ธ
https://github.com/paulo-coutinho/pdfium-lib/issues/33#issuecomment-861669272 paulo-coutinho/pdfium-lib/blob/master/modules/wasm.py (how the project uses
emscripten
)Single-line Reproduction ๐ฌ
๐๐ป
``` CreateEx memalign memset free memalign memset free memalign memset Destroy CreateEx memalign memset free memalign memset free memalign memset Destroy CreateEx memalign memset free memalign memset free memalign memset Destroy CreateEx memalign memset free memalign memset free memalign memset Destroy CreateEx memalign memset free memalign memset free memalign memset Destroy ```496
ร496
ร4
โmemalign
โฏ
trigger on each iteration ๐๐ป (CLICK/TAP HERE TO REVIEW EXECUTION)๐๐ป
``` CreateEx memalign memset free memalign memset free memalign memset Destroy CreateEx Destroy CreateEx Destroy CreateEx Destroy CreateEx Destroy ```496
ร496
ร4
โ๏ธmemalign
โฏ
trigger only once ๐๐ป (CLICK/TAP HERE TO REVIEW EXECUTION)I kindly ask
emscripten
experts for their help in finding how to compile the said project in a way that resolves this issue.Investigation Details ๐ต๐ปโโ๏ธ
https://github.com/paulo-coutinho/pdfium-lib/issues/33#issuecomment-861669272 paulo-coutinho/pdfium-lib/blob/master/modules/wasm.py (how the project uses
emscripten
)FPDFBitmap_CreateEx
call treeFPDFBitmap_CreateEx
โ https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/fpdfsdk/fpdf_view.cpp#799pdfium::MakeRetain
โ https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/core/fxcrt/retain_ptr.h#174CFX_DIBitmap::Create
โ https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/core/fxge/dib/cfx_dibitmap.cpp#27FX_TryAlloc
โ https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/core/fxcrt/fx_memory.h#48Calloc
โ https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/core/fxcrt/fx_memory.cpp#109PartitionAllocGenericFlags
โ https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/third_party/base/allocator/partition_allocator/partition_alloc.h#394PartitionAllocatorGeneric
โ https://pdfium.googlesource.com/pdfium/+/refs/heads/chromium/4542/third_party/base/allocator/partition_allocator/partition_alloc.h#517