llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.37k stars 12.14k forks source link

__wasm_apply_data_relocs is absolutely enormous #55608

Open k1nder10 opened 2 years ago

k1nder10 commented 2 years ago

We are porting a big project (>1'000'000 lines of code) to wasm using emscripten. Everything had worked fine until we started using dynamic linking (MAIN_MODULE=2 & SIDE_MODULE=2). Since then we are running into the issue of exceeding the function body size limit. This function is __wasm_apply_data_relocs and it's added by wasm-ld I believe. We also noticed that this function is huge even in the targets where it fits the limit. (10-100x times bigger than the functions from our codebase). It feels like it does more work than actually needed.

What does this function actually do? What does the size of this function depend on? Are there any flags/features we can use to reduce its size?

It must be a common problem for every big project. Is it possible to split this function from our/your side somehow?

llvmbot commented 2 years ago

@llvm/issue-subscribers-backend-webassembly

sbc100 commented 2 years ago

This function applies data relocations. Whenever the address of a symbol is stored in linear memory, but that address is not known until runtime (this is true for all side module symbols, as well as undefined main module symbols), we need to update that location once the modules are loaded and locations are fixed.

I'm not sure what could cause the massive explosion of data relocations, but I can imagine it could be the vtables used by C++?

There are a few things we can do:

  1. Figure out if the overall number of relocations can be reduced. I'm not sure how easy it is to inspect the relocations or to have wasm-ld report them, but we should investigate why you have so many in our program.
  2. Move to a more efficient encoding.. Perhaps we can find a way to reduce the size of the function by performing the relocations with fewer instruction.
  3. Split up the function into chunks (as you suggested)
k1nder10 commented 2 years ago

What do you suggest to us to get around this problem from our side? We were even thinking of manually splitting up this function in .wasm file though it's kinda ridiculous.

Are you guys planning on fixing this issue somehow? It seems like no big project can use dynamic linking because of it.

sbc100 commented 2 years ago

I think (1) needs to be investigated. The number of relocations should not be that high. Is it the side module or the main module is has the very large function?

Is your program using a lot C++ virtual functions? Do the main and side modules share a lot of C++ classes over the boundary. Emscripten recently switched to visibility=default which has been causing trouble for some folks. Can you try compiling with -fvisibility=hidden?

edit: recommend visibility=hidden

k1nder10 commented 2 years ago

Sorry, have to tell you that we build with SIDE_MODULE=1 & MAIN_MODULE=1. We figured out that DCE (side_module=2) is not an option for us as for many reasons.

This is the main module where the function exceeds the limit. Yes, our program has a lot of virtual functions. Yes, side modules and the main module share many C++ classes. Does it make sense to test -fisibility=default for SIDE_MODULE=1?

sbc100 commented 2 years ago

Can you try again with the latest emsdk (emsdk install tot) ? I'm curious to see if 198815e18dad helped in your case.

Also it is the side module all the main module that contains the huge relocation function? I would suggest building with -fisibility=hidden, yes (sorry I meant to say -fvisibility=hidden in my previous comment), although any code that you share between the main the side module would then need to be tagged in the source code as __attribute__(((visibility("default")))

k1nder10 commented 2 years ago

Yes, the commit you mentioned does help us. Though the function is anyway too big. Probably we'll hit the limit in the near future when more code is written.

k1nder10 commented 2 years ago

I can see you're discussing it here https://github.com/emscripten-core/emscripten/discussions/17374 Splitting this function up into chunks would be really great!

sbc100 commented 2 years ago

One reason I'm hesitant to invest in splitting up the function is that it doesn't address the overall code size issue.

k1nder10 commented 2 years ago

I see. However, we always hit the function size limit when experimenting with static/dynamic libs/-O -g flags, and many more. We have to keep in mind the limit all the time and go around it. Besides, we have to write code in such a big codebase thinking about data relocations, which is absolutely insane. So, it would at least enable us to do many common things and not think about this stuff. IMO it is absolutely worth splitting this up.

k1nder10 commented 2 years ago

You can leave it as a temporary solution until you find a better way to sort it out

sbc100 commented 2 years ago

I agree, in the absence of better solution, simply splitting it up into chunks seems like a reasonable solution. I'm not sure when I will get time to work on it though.

k1nder10 commented 1 year ago

I uploaded a patch to split this up. Can you review that? https://reviews.llvm.org/D140111 @sbc100

TerrorJack commented 2 months ago

I have a patch for the 19.x branch at https://gitlab.haskell.org/ghc/llvm-project/-/commit/7b3796e6bacdb51e171db21dca2c81da721775ec that splits up __wasm_apply_data_relocs. Don't have any time to go through the PR process, but in case anyone else is troubled by the same issue, feel free to pick that commit and give it a build.