emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

`locateFile`: Support loading a data URL or Uint8Array #20025

Open broccolihighkicks opened 1 year ago

broccolihighkicks commented 1 year ago

locateFile takes a relative file path and returns a URL.
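For context, here is a minimal sketch of a `locateFile` hook as it works today, assuming the glue code calls it with the file name and a directory prefix and then loads whatever string is returned. The CDN base URL is a made-up placeholder.

```javascript
// Hypothetical base URL; replace with wherever the .wasm is actually hosted.
const WASM_BASE_URL = 'https://cdn.example.com/my-app/';

const Module = {
  // Called by the generated glue code for files it needs to load,
  // such as the .wasm binary. The return value is treated as a URL or path.
  locateFile(path, prefix) {
    if (path.endsWith('.wasm')) {
      return WASM_BASE_URL + path;
    }
    // Fall back to the default location next to the glue script.
    return prefix + path;
  },
};
```

The request in this issue is about letting that return value be something other than a fetchable URL or file path.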

I do not think data URLs work when running in Node.js.

esbuild and other bundlers allow bundling .wasm files as Uint8Arrays.

It would be useful to be able to return a Uint8Array or data URL from locateFile to bundle the .wasm with a JS bundle (this would avoid having to ship a .wasm file to a CDN or with a Node.js script).

I know the SINGLE_FILE emcc arg exists, but many Emscripten projects avoid this to make the output files more accessible/debuggable.

sbc100 commented 1 year ago

I'm not sure how this would be any different from using -sSINGLE_FILE. If you are trying to avoid bundling the extra wasm file, where else would you embed the Uint8Array data if not in the single JS file?

sbc100 commented 1 year ago

If there are accessibility/debuggability problems with the existing -sSINGLE_FILE option, perhaps we can address those rather than creating another way to embed wasm data?

sbc100 commented 1 year ago

> I do not think data URLs work when running in Node.js.

I believe we have code to support data URLs in node. Indeed it looks like this is how -sSINGLE_FILE works and we have code to handle the loading of the wasm file from data URLs: https://github.com/emscripten-core/emscripten/blob/ef3fd077475a5124cf3dd8047f1f34bcaa05e333/src/base64Utils.js#L33-L41

broccolihighkicks commented 1 year ago

> I'm not sure how this would be any different to using -sSINGLE_FILE.

Because many projects have SINGLE_FILE=0 hardcoded and distribute the .wasm and .js files.

The Emscripten toolchain is quite complex to install. So having the ability to use my own bundler (esbuild) and bundle any emscripten project's distributed .wasm would save me having to fork and maintain a repo, install the emscripten toolchain and set SINGLE_FILE=1.

I have come across at least 4 emscripten projects with SINGLE_FILE=0 hardcoded.

I am returning

`data:application/octet-stream;base64,${b64}`

from locateFile in Node.js, I think emscripten tries to read it as a file name?
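As a sketch of the attempt described above, this is roughly how such a data URL could be built in Node.js and returned from `locateFile`. The byte array is a placeholder (just the 8-byte wasm header); a bundler would supply the real bytes, and `moduleOptions` is a hypothetical name for the object passed to the Emscripten module.

```javascript
// Placeholder bytes: the wasm magic number "\0asm" followed by version 1.
const wasmBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Encode the bytes as base64 and wrap them in a data URL.
const b64 = Buffer.from(wasmBytes).toString('base64');
const dataUrl = `data:application/octet-stream;base64,${b64}`;

const moduleOptions = {
  // Return the data URL for the .wasm file and the default path otherwise.
  locateFile(path, prefix) {
    return path.endsWith('.wasm') ? dataUrl : prefix + path;
  },
};
```

Whether the glue code decodes this string or treats it as a file name depends on how the project was built, which is the question raised here.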

sbc100 commented 1 year ago

> I'm not sure how this would be any different to using -sSINGLE_FILE.
>
> Because many projects have SINGLE_FILE=0 hardcoded and distribute the .wasm and .js files.
>
> The Emscripten toolchain is quite complex to install. So having the ability to use my own bundler (esbuild) and bundle any emscripten project's distributed .wasm would save me having to fork and maintain a repo, install the emscripten toolchain and set SINGLE_FILE=1.

I see, so you are trying to take existing projects and basically convert them to using SINGLE_FILE after they have already been compiled, i.e. as a bundling phase?

> I have come across at least 4 emscripten projects with SINGLE_FILE=0 hardcoded.

As an aside, that seems rather odd: -sSINGLE_FILE=0 is the default and therefore does nothing.

> I am returning
>
> `data:application/octet-stream;base64,${b64}`
>
> from locateFile in Node.js, I think emscripten tries to read it as a file name?

Ah yes, I think maybe the code for handling data URLs is behind SUPPORT_BASE64_EMBEDDING which is only set in SINGLE_FILE mode: https://github.com/emscripten-core/emscripten/blob/ef3fd077475a5124cf3dd8047f1f34bcaa05e333/src/preamble.js#L643-L652

I guess we could consider supporting Uint8Array here, but it might be nice to remove the existing data URL support in that case to avoid having multiple ways to do the same thing. I guess we should see how much code/complexity this adds. Would you be interested in working on a PR?

broccolihighkicks commented 1 year ago

My preference would be to support all of:

  - Uint8Array
  - data:application/octet-stream;base64
  - data:application/wasm;base64
  - URL.createObjectURL URLs

Some bundlers may expose either base64 or Uint8Array, so having both could prevent having to translate between the two before passing to emscripten.
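To illustrate the translation step mentioned above, here is a rough sketch of a helper that normalizes two of the listed forms into a `Uint8Array`. `toWasmBytes` is a hypothetical name, not an Emscripten API. Blob URLs from `URL.createObjectURL` are left out because reading them requires an asynchronous fetch. The sketch assumes a runtime with a global `atob` (browsers and recent Node.js versions).

```javascript
// Hypothetical helper: accepts a Uint8Array or a base64 data URL with either
// of the two MIME types from the list, and always returns a Uint8Array.
function toWasmBytes(source) {
  if (source instanceof Uint8Array) {
    return source; // Already in the target form.
  }
  if (typeof source === 'string') {
    const match = /^data:application\/(?:octet-stream|wasm);base64,(.+)$/.exec(source);
    if (match) {
      // atob yields a binary string with one character per byte.
      const binary = atob(match[1]);
      const bytes = new Uint8Array(binary.length);
      for (let i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
      }
      return bytes;
    }
  }
  throw new TypeError('Unsupported wasm source');
}
```

Supporting every form inside the glue code would mean shipping logic like this in each build, which is the code-size trade-off at stake.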

I'm sorry I do not have time to work on the PR, it does interest me but I do not have the time at the moment.

Thanks for your input though, this helps as now I know I am using the locateFile API correctly.

sbc100 commented 1 year ago

I'm not saying I would be completely opposed to such a change, but in Emscripten we have historically suffered from trying to support every way of doing things, which adds complexity and code size and can make testing all possible configurations very hard.

broccolihighkicks commented 1 year ago

This is true it would add more possible code branches.

SugarRayLua commented 9 months ago

Hi! I'm a novice mobile iOS programmer with a similar interest to the poster of this issue in using data URIs, but for a different reason: I'm interested in producing fully inline .html files from Emscripten projects that can be distributed and run on mobile devices without hosting on a server. .wasm and Emscripten are great cross-platform solutions for projects, but mobile iOS devices don't permit opening and running .html files in the browser that rely on separate .js and .wasm files. I'd like to be able to create projects that use .wasm via Emscripten and distribute them to others to use on their mobile devices without having to host a server to do so. I've been successful in inlining some other non-.wasm binary files into .html, but the Emscripten code is more complicated.

For instance, one developer has created a small Lua extension API that can export .html as its product. It simply produces an .html file, the Emscripten JS glue file, and the .wasm file. I'd like to embed the JS glue file along with the .wasm in the .html. I don't mind converting the .wasm to a base64 data URI, but I would appreciate clarity from the responses above on whether there is currently a way to pass that base64 data URI into the Emscripten JS glue code instead of having locateFile look for an external file://.

Is that currently possible (by modifying the current js glue code) or is that something that Emscripten is considering for the future?

Thanks!

sbc100 commented 9 months ago

@SugarRayLua is there any reason for you not to use the existing -sSINGLE_FILE mode (for embedding the wasm into the JS as base64)?

SugarRayLua commented 9 months ago

Thanks, @sbc100, I'll try that. Part of the issue was that despite following the Emscripten documentation, it was somewhat challenging for me as a novice programmer to set up Emscripten to run on my Mac. Therefore, I've been primarily working as a secondary person in the process, attempting to integrate FOSS previously built Emscripten .wasm inline to help encourage other developers to integrate their projects as single file projects. When I last checked, Emscripten was working on my Mac, and I relayed your suggestion to the current developer I am working with who suggested I try compiling their project with the -sSINGLE_FILE mode you suggested. I'm on mobile iOS during the week (hence the reason for my request) but can access Mac on the weekends and will try your suggestion. If it works, will then attempt to inline the SINGLE_FILE js/.wasm into the .html and report back 👍

SugarRayLua commented 9 months ago

@sbc100,

The developer I was working with did get -s SINGLE_HTML to work and accomplish the same thing, thanks!

SugarRayLua commented 8 months ago

@sbc100, PS: I personally got the SINGLE_FILE option working with Emscripten on my Mac and was able to create a proof-of-concept SINGLE_FILE .html. I added a small custom JavaScript script that lets users on a mobile device choose a file to upload; the script writes that file to the virtual filesystem of the Emscripten-compiled .wasm, and the compiled code then reads it from the virtual filesystem and prints it.

However, I noticed that the SINGLE_FILE .html that was produced was, in effect, a "demo" page with the Emscripten logo on it, containing a small canvas that rendered my .wasm's printed results (stdout). That worked fine for my proof-of-concept demo but wouldn't be ideal for a production project where I wanted to display my own .html page. Is there an option when compiling to a SINGLE_FILE .html to disable the production of that "demo" page (i.e. have the SINGLE_FILE compiled .html contain the .c code compiled to .wasm and the necessary .js functionality to access that .wasm code in the .html, without displaying the Emscripten logo and the custom canvas)? If not, is the presumption that the developer would then modify the produced .html to their needs?

Appreciate any further clarification you might have on this issue.

Thanks!

sbc100 commented 8 months ago

In terms of creating a SINGLE_FILE HTML page, there are a couple of options:

  1. Use -sSINGLE_FILE and output a .js file, then embed that JS file in your own HTML file yourself, inside a <script> tag.
  2. Use --shell-file to provide your own .html shell template. See https://emscripten.org/docs/compiling/Deploying-Pages.html#build-files-and-custom-shell.

I recommend doing (1) since then you completely control your html page and you don't need to figure out the html shell template syntax used by emscripten.
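Option (1) could be scripted along these lines. This is only a sketch: `buildSingleFileHtml` is a hypothetical helper, and it assumes the -sSINGLE_FILE glue code has already been read into a string (for example from the .js output file).

```javascript
// Hypothetical helper: returns an HTML page with the glue code inlined.
function buildSingleFileHtml(glueJs, bodyHtml) {
  // A literal "</script" inside the JS would close the tag early, so escape
  // the slash. "<\/script" means the same thing inside JS string literals.
  const safeJs = glueJs.replace(/<\/(script)/gi, '<\\/$1');
  return [
    '<!doctype html>',
    '<html>',
    '<head><meta charset="utf-8"></head>',
    '<body>',
    bodyHtml,
    '<script>',
    safeJs,
    '</script>',
    '</body>',
    '</html>',
  ].join('\n');
}
```

The returned string can then be written to a single .html file, with `bodyHtml` holding whatever page markup the project needs.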

SugarRayLua commented 8 months ago

Thanks, @sbc100 for the clarification!

Sorry, one additional novice clarification:

If one chooses option (1), the .js file still contains the compiled .c code as .wasm, correct (just embedded in a .js script rather than an entire .html document)?

Similarly, if one chose option (1), then one could interact with the compiled .wasm code the same as if one created a SINGLE_FILE .html correct (e.g. write a separate javascript function that writes to emscripten's virtual filesystem and then call the compiled .c code function that does something with the file that was written into the virtual filesystem from the additional javascript)?

Thank you for the additional clarifications.

sbc100 commented 8 months ago

> Thanks, @sbc100 for the clarification!
>
> Sorry, one additional novice clarification:
>
> If one chooses option (1), the .js file still contains the compiled .c code as .wasm, correct (just embedded in a .js script rather than an entire .html document)?

Yes, exactly.

> Similarly, if one chose option (1), then one could interact with the compiled .wasm code the same as if one created a SINGLE_FILE .html correct (e.g. write a separate javascript function that writes to emscripten's virtual filesystem and then call the compiled .c code function that does something with the file that was written into the virtual filesystem from the additional javascript)?

Yes.

The only difference is that it would be up to you to embed the .js content directly into your <script> tag.
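The interaction described in the question might look roughly like this. It is a sketch that assumes the build exposes `FS` and `ccall` on the module object (for example via the EXPORTED_RUNTIME_METHODS setting) and exports a C function that takes a path string and returns an int. `writeFileAndRun` is a hypothetical helper name.

```javascript
// Hypothetical helper: writes bytes into the virtual filesystem, then calls
// a compiled C function with the path of the file that was just written.
function writeFileAndRun(Module, path, bytes, cFunctionName) {
  // FS.writeFile creates or overwrites the file in the in-memory filesystem.
  Module.FS.writeFile(path, bytes);
  // ccall converts the JS string to a C string and returns the numeric result.
  return Module.ccall(cFunctionName, 'number', ['string'], [path]);
}
```

In a page, `bytes` would typically come from a file the user picked, converted to a `Uint8Array`, and the call would be made only after the runtime has finished initializing.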

SugarRayLua commented 8 months ago

Got it, thanks! 👍😊

SugarRayLua commented 8 months ago

PS, @sbc100, I turned my question and project into a GitHub repository in case others have the same question (it is the first GitHub repository I have created):

https://github.com/SugarRayLua/wasmLoadLocalFiles

I acknowledged you for your help.

Thanks again 😊