rustwasm / wasm-bindgen

Facilitating high-level interactions between Wasm modules and JavaScript
https://rustwasm.github.io/docs/wasm-bindgen/
Apache License 2.0
7.78k stars 1.07k forks source link

Investigate caching of static strings #1386

Closed RReverser closed 5 years ago

RReverser commented 5 years ago

Motivation

Long ago I noticed that currently even constant strings are routed through entire encoding/decoding process on each pass between Rust and JS, becoming unnecessarily expensive part of the resulting code.

Now I've noticed that this affects even internal functions which need intermediate strings, thus affecting even more potentially hot paths.

E.g. this is a profile from my WASM library that heavily relies on JS iteration - js_sys::try_iter - which, in turn, recreated same constant string for the iteration protocol over and over.

image

We can fix this particular function but, ideally, we should have a more generic solution for constant strings.

Proposed Solution

Either:

  1. Add a step to the postprocessor to walk over the generated WASM, detect calls to JsValue::from_str with pointers to a data section and statically extract these into JS.
  2. Do the same in runtime by using a cache backed by lazy_static! (but then we need to somehow detect whether a string is in the data section, which is a bit harder in runtime but not impossible).
  3. Add some sort of a proc-macro js_const_str!("...") that would extract these statically like (1), although could make it easier to detect by being more expicit.

All of the proposed solutions require choosing the right balance between complexity of the solution itself vs complexity of usage from API point of view (we don't want to add a function that most users would forget to use).

RReverser commented 5 years ago

I specialised try_iter in my code to try Array::is_array + unchecked_ref::<Array>().values().into_iter() first, and the whole library benchmark got 5x faster 😱

I wonder how many more cases are affected by these hidden costs.

alexcrichton commented 5 years ago

Thanks for the report! Could you expand a bit on how try_iter is connected to static strings in this case? Those seem to be naively unrelated, but hey 5x improvements are always nice

RReverser commented 5 years ago

I was surprised about that connection showing up in profile as well, but from looking at the source it might be related to JsValue::from("next") being created each time you iterate over a new value?

Note that I'm mostly iterating over small tuple-like arrays (but need this code in a generic context to accept arbitrary iterables), so this overhead could well be the reason.

alexcrichton commented 5 years ago

Oh I think that's fixable with a change like:

diff --git a/crates/js-sys/src/lib.rs b/crates/js-sys/src/lib.rs
index 0a51ec7d..05a5a656 100644
--- a/crates/js-sys/src/lib.rs
+++ b/crates/js-sys/src/lib.rs
@@ -1107,14 +1107,20 @@ pub fn try_iter(val: &JsValue) -> Result<Option<IntoIter>, JsValue> {
         return Ok(None);
     }

+    #[wasm_bindgen]
+    extern "C" {
+        type MaybeIterator;
+        #[wasm_bindgen(getter, method)]
+        fn next(this: &MaybeIterator) -> JsValue;
+    }
+
     let iter_fn: Function = iter_fn.unchecked_into();
     let it = iter_fn.call0(val)?;
     if !it.is_object() {
         return Ok(None);
     }

-    let next = JsValue::from("next");
-    let next = Reflect::get(&it, &next)?;
+    let next = it.unchecked_ref::<MaybeIterator>().next();

     Ok(if next.is_function() {

Want to test that out and see if it works for you?

RReverser commented 5 years ago

Haha, that's literally what I did locally too, even the name of MaybeIterator matches :)

But, as mentioned above, I'm rather interested in a more generic solution to needless copies of constant strings, and decided to open this issue for that discussion/tracking.

RReverser commented 5 years ago

Btw, strangely, replacing that doesn't make nearly as big difference as checking for is_array and using .values(). I wonder if there's another hidden string somewhere...

UPD: __wbindgen_string_new still shows up taking ~60% of total time on microbenchmarks involving small arrays, so apparently there is.

UPD: Ah no, sorry, I just didn't add [patch.crates-io] to this mini benchmark to point to local wasm-bindgen.

fitzgen commented 5 years ago

I think it is worth providing string interning infrastructure.

This shouldn't require anything special from wasm-bindgen, such as JS glue codegen support.

Therefore, the question is raised of whether it should be

I'm not convinced that we need to do the interning at compile time with proc-macros; I think we can dynamically intern at runtime via fn intern(s: &'static str) -> &'static JsString or something like that. I don't think the overhead will be noticeable (although adding compile-time interning with proc-macros post facto should also be semver compatible, if we find that we do actually need it).

RReverser commented 5 years ago

I tried several approaches so far:

alexcrichton commented 5 years ago

I think I agree with @fitzgen here that this is best done in a separate crate for now rather than in wasm-bindgen itself, so I'm going to go ahead and close this.

RReverser commented 5 years ago

that this is best done in a separate crate for now rather than in wasm-bindgen itself

Hmm, this can be hard to do in a separate crate in a way suggested above, because it needs to rewrite WASM itself in wasm-bindgen compatible manner...

But this is not an easy or short-term task so I guess it's fine to have it closed.

My only beef with closing long-standing issues is that it usually increases likelihood of someone not finding an open issue and raising exactly same suggestion.

RReverser commented 5 years ago

because it needs to rewrite WASM itself in wasm-bindgen compatible manner...

@alexcrichton Do you think this issue could be reopened in light of these details? I can still work on a PR, but it's usually much better to be able to link to an existing issue.

alexcrichton commented 5 years ago

Unless we have compelling data we should do this I'd rather not accumulate a large list of "wish to haves in the far and distant future" issues if we can.

Pauan commented 5 years ago

I've been investigating (and improving) the performance of dominator, and I ran into this issue. It turns out, strings are really slow. They are orders of magnitude slower than everything else.

They are slow because of all the encoding and copying, and in addition also because it adds a lot of extra garbage collection pressure.

The end result is that right now, Rust apps are doomed to be slower than JS apps, simply because JS apps avoid the huge costs of encoding + GC.

Unfortunately, strings are unavoidable, because so many native web APIs use strings. And WebIDL probably won't help much (at least not in the short term).

I re-implemented over a dozen of the web-sys bindings so that they accept &JsString instead of &str, and then I implemented a very simple intern function which does runtime interning with a BTreeMap. This resulted in dramatically faster performance.

We can do a lot better than that, though. Nobody should need to re-implement all of web-sys just to get acceptable string performance. It gets worse when you consider that libraries like gloo use web-sys, and therefore I had to re-implement gloo as well.

I think we should do all of @RReverser 's ideas: have very fast automatic static caching for &'static str, have automatic runtime caching for &'a str (probably using an LRU cache or similar), and also provide a manual form of caching (such as an intern function).

That will give very good performance for static strings, good performance for non-static strings, and predictable performance for manually interned strings.

wasm-bindgen should be able to do the caching automatically, it just has to add in a call to the intern function for #[wasm_bindgen] functions which accept &str or String.

I'm willing to make a PR for this, if the idea is accepted. Or if you prefer we can go through the RFC process (I can write the RFC).

alexcrichton commented 5 years ago

That all I think is some very compelling data and I'd be inclined to agree!

I think one alternative is to add bindings for both &str and &JsString, but I don't think that's tenable really since we don't want to duplicate everything everywhere. Having a global cache I think is probably a better option here.

One thing I'd ideally like to see is a global cache that's tunable and optional, so we might not have it on by default for the wasm-bindgen crate (via a feature) but then it could be enabled for extra performance if desired (and tuned if needed). @Pauan if you're willing to make a PR I'd be willing to review it :)

Pauan commented 5 years ago

I think one alternative is to add bindings for both &str and &JsString, but I don't think that's tenable really since we don't want to duplicate everything everywhere.

That was going to be my first suggestion, but I decided against it since as you said it's untenable: gloo would need to provide &JsString bindings as well, and so would any other libraries that depends on web-sys.

The other thing I thought about was to create a new trait, something like AsStr, and then implement it for &str, String, and JsString. Then all of the APIs would take <A: AsStr> instead of &str.

However, that won't work for #[wasm_bindgen] bindings, and it's still very inconvenient for the user, since the user now has to make a choice: 1) they can pass a &str and get bad performance, or 2) they can use intern to manually create a JsString and then pass that. Both options are bad.

One thing I'd ideally like to see is a global cache that's tunable and optional

I absolutely agree.

if you're willing to make a PR I'd be willing to review it :)

Great, I'll get started on that, then!

RReverser commented 5 years ago

Oh I have't noticed that @Pauan has since implemented this in https://github.com/rustwasm/wasm-bindgen/pull/1612, this is pretty cool!

One nitpick is that there doesn't seem to be any scope for interned strings and no way to free them once they're no longer necessary?

My usecase is any Wasm module that creates classes. In this case it needs to intern strings that are not alive for the entire duration of the module, but rather will be unnecessary as soon as class is freed from the JS side. If my understanding is correct, these currently would never get evicted, but rather keep leaking memory for each class instance.

Would it be worth to expose wasm_bindgen::InternCache structure (already available internally) for this usecase, so that the entire cache would get freed once it's dropped?

Pauan commented 5 years ago

Would it be worth to expose wasm_bindgen::InternCache structure (already available internally) for this usecase, so that the entire cache would get freed once it's dropped?

I don't like the idea of exposing internal details, but I definitely agree that we should have an API for clearing out items from the cache.

It was omitted from the initial implementation to keep things as simple as possible, but it was always planned to add it later.

Pauan commented 5 years ago

@RReverser PR is up: https://github.com/rustwasm/wasm-bindgen/pull/1828