graphql-rust / juniper

GraphQL server library for Rust
Other
5.72k stars 425 forks source link

feat: use ArcStr for storing strings in Schema #1247

Open audunhalland opened 8 months ago

audunhalland commented 8 months ago

Fixes #819

This is an experiment that uses arcstr::ArcStr for storing strings within the GraphQL schema. String literals that are constant/come from macros no longer allocate memory.

ArcStr appears to fit perfectly for juniper in my opinion. It delivers extremely cheap literal strings, which is the most common usecase, while also supporting dynamic schemas based on extensive use of custom TypeInfo (which is how we're using juniper at work), potentially without extra memory allocation. There's of course the "cost" that users have to see ArcStr in the dynamic schema building APIs, but I think that with these new APIs exposed it should be much clearer when juniper needs to allocate string memory and where it's "free" (using literal!). In short, this choice is now put in the hands of the user.

BREAKING CHANGE:

Benchmark results

master

Sync vs Async - Users Flat - Instant/Sync/1
                        time:   [24.076 µs 24.100 µs 24.124 µs]
                        change: [+0.9795% +1.1412% +1.3003%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe
Sync vs Async - Users Flat - Instant/Async - Single Thread/1
                        time:   [26.135 µs 26.170 µs 26.218 µs]
                        change: [-0.9653% -0.7074% -0.4492%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
Sync vs Async - Users Flat - Instant/Async - Threadpool/1
                        time:   [25.893 µs 26.171 µs 26.474 µs]
                        change: [+1.6342% +4.1267% +7.0669%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
Sync vs Async - Users Flat - Instant/Sync/10
                        time:   [25.227 µs 25.244 µs 25.265 µs]
                        change: [+0.1578% +0.3255% +0.4799%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe
Sync vs Async - Users Flat - Instant/Async - Single Thread/10
                        time:   [25.831 µs 25.871 µs 25.920 µs]
                        change: [+2.0800% +2.5706% +3.0084%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
Sync vs Async - Users Flat - Instant/Async - Threadpool/10
                        time:   [25.397 µs 25.685 µs 25.936 µs]
                        change: [-1.7187% -0.4294% +0.8521%] (p = 0.50 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

arcstr

Sync vs Async - Users Flat - Instant/Sync/1
                        time:   [9.8646 µs 9.8708 µs 9.8771 µs]
                        change: [-59.007% -58.946% -58.885%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
Sync vs Async - Users Flat - Instant/Async - Single Thread/1
                        time:   [11.486 µs 11.499 µs 11.512 µs]
                        change: [-56.273% -56.158% -56.047%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild
Sync vs Async - Users Flat - Instant/Async - Threadpool/1
                        time:   [10.803 µs 10.851 µs 10.897 µs]
                        change: [-60.022% -58.793% -57.906%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  13 (13.00%) high mild
  1 (1.00%) high severe
Sync vs Async - Users Flat - Instant/Sync/10
                        time:   [11.158 µs 11.173 µs 11.191 µs]
                        change: [-55.883% -55.816% -55.743%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
Sync vs Async - Users Flat - Instant/Async - Single Thread/10
                        time:   [10.419 µs 10.428 µs 10.436 µs]
                        change: [-59.952% -59.862% -59.782%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) low mild
Sync vs Async - Users Flat - Instant/Async - Threadpool/10
                        time:   [11.355 µs 11.487 µs 11.611 µs]
                        change: [-55.387% -54.856% -54.295%] (p = 0.00 < 0.05)
                        Performance has improved.

All of the improvements should be related to schema creation, not the query executor.

LegNeato commented 8 months ago

Sweet, this looks great! I don't have time to look at this right now but perhaps @tyranron has opinions.

tyranron commented 8 months ago

@LegNeato this is definitely on my agenda in near future. However, the diif is huge enough, which will take some time.

tyranron commented 7 months ago

@audunhalland I'm going to review this in next few days and release in 0.17.0.

audunhalland commented 7 months ago

@tyranron cool, I'm not 100% happy with all parts of this PR, there are surely things that can be iterated on later. Especially I see DynType and AsDynType as a kind of hacky way to limit the scope and get it over the line.

edit: also note for the review: I'm not sure I understand the reflect module fully, like e.g.:

impl<S> reflect::WrappedType<S> for ArcStr {
    const VALUE: reflect::WrappedValue = 1;
}
tyranron commented 7 months ago

@audunhalland

edit: also note for the review: I'm not sure I understand the reflect module fully, like e.g.:

impl<S> reflect::WrappedType<S> for ArcStr {
    const VALUE: reflect::WrappedValue = 1;
}

That should be fine, yes.

reflect::WrappedType::VALUE is a tricky way to encode composed GraphQL types in prime numbers, since we're kinda limited with const evaluation abilities in Rust for now. 1 means a primitive, non-composite type, which ArcStr represents, the same way as String and similar.

audunhalland commented 7 months ago

@tyranron an idea: Could we somehow make From<ArcStr> a supertrait of ScalarValue? Then one could make a custom scalar value that doesn't need to clone string buffers that are already ArcStr. Think introspection queries, etc.

(Ideally From<&ArcStr> (which doesn't require cloning up front), but that's not possible without making ScalarValue lifetime generic. Another idea is to add a method fn from_arcstr(&ArcStr) -> Self to trait ScalarValue itself.)

edit: fn from_arcstr(&ArcStr) could be a method with a default implementation that just delegates to From<String>

audunhalland commented 5 months ago

@tyranron I see there are some compile errors on the bikeshedded changes now. Should I help out with trying to resolve those?