Open jwoertink opened 10 months ago
I noticed the self.route()
method uses a lot of string concatenation. I tried replacing that with a string builder, and that actually made it slower when benchmarking 10k urls.
Here's a basic action with the route
method overridden and all the macros stripped out.
require "lucky"
Lucky::RouteHelper.configure do |settings|
settings.base_uri = "http://localhost:3000"
end
class Products::Show < Lucky::Action
get "/products/:id" do
plain_text "testing"
end
def self.route(id, anchor : String? = nil) : Lucky::RouteHelper
path = path_from_parts(id)
query_params = {} of String => String
unless query_params.empty?
path += "?#{HTTP::Params.encode(query_params)}"
end
anchor.try do |value|
path += "##{URI.encode_www_form(value)}"
end
Lucky::RouteHelper.new :get, path
end
end
Products::Show.with(1).url
I think we want to switch from URI.encode_www_form
to URI.encode_path
and we can build the path and only call that method once instead of calling it for each path segment (and add it to the base url after). That's my guess as to the slowdown. Of course, if it is the encoding, then we can assume that the path helper will never be as fast as what he did with the raw string, but of course his way doesn't handle invalid characters in the URL.
Link to where I'm talking about: https://github.com/luckyframework/lucky/blob/3f15bc80cc5d26489dce76ffb5d2486720223ac9/src/lucky/routable.cr#L350-L365
Also, after looking at it again, we don't build the full url in that function so ignore the part about adding it to the base url. Switching to the new method and calling it at the end should just work. I did see differences in the docs around what it does with spaces. It seems like we want the default of URI.encode_path
so I'm wondering if this was a potential bug from the beginning. The spec would be to encode a path with a plus in it and it still be a vaild URL. I think the current converts that to a space and we don't want that. But, of course, I could be mistaken.
I actually thought about that too, but when I tried comparing the two, the main difference is encode_path
will use %20
where encode_www_form
will use +
for encoding spaces.... I'm not sure what other changes there are or what sort of things to expect here.
Now, I'm sure no matter what we do, it'll never be as fast as doing a raw string, and that's ok. I'm just hoping we can find some spot that gets us a bit closer than the 30% reduction we have now (according to those findings).
looks like %20
is the correct handling of spaces. Tested it out here https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI
Try using the URI.encode_www_form
overload that accepts an IO
. Would avoid creating intermediary strings.
Hi, sorry for my delay, I forgot to check out this issue.
Just to report on what happened. I was analysing code from a challenge from our local dev community and I participated using Crystal/Lucky. It performs super well, as expected. But there was a small problem. It is this line of code:
response.headers["Location"] = Api::Pessoas::Show.url(pessoa_id: pessoa.id)
If I use #url the performance goes down dramatically in the stress test. For the context, this is the Gatling simulation we have been using: https://github.com/zanfranceschi/rinha-de-backend-2023-q3/tree/main/stress-test (you have to download gatling and change run-test.sh to adjust the paths)
If I use simple string concatenation such as "/pessoas/#{pessoa.id}", I can easily reach the 46k inserts at the end of the simulation. If I use the #url method it can go down in half of that total.
The simulation ramps up from 3 to 600 sessions doing POSTs to create and then GET the Location from the header to check.
One semantic mistake I made is that the Location header don't need the #url, just the #path is sufficient. And in that case, I don't see a performance hit.
What made this even more curious is that I was previously trying this:
Application.settings.APP_DOMAIN + "/pessoas/#{pessoa.id}
And for some weird reason, this one line also drops the number of inserts in the end.
I tried to make a benchmark to isolate #url, #path, string concatenation, here: https://github.com/akitaonrails/rinhabackend-lucky-crystal-api/blob/master/spec/actions/benchmark_bug_spec.cr
And although, of course, #url is a bit slower because it does more. I don't see why that difference would drop performance so much in the end.
I am still not sure if this is a bug or if there's something wrong in the test scenario. I hope those details help a bit.
@akitaonrails it would be interesting to see if my PR https://github.com/luckyframework/lucky/pull/1829 makes a difference in this at all.
@jwoertink will do, but a noob question. In my shard.yml if I change version: ~> 0.1.0 to branch: issues/1827 to get that branch, then dependency resolution breaks for lucky_task. how can I fetch from your branch without breaking shards?
You can add a new file shard.override.yml
and add the Lucky dep in there. It should override everything.
I'm sorry, I know I'm doing something very stupid but I created a shard.override.yml with:
dependencies:
lucky:
path: ../../crystal/lucky
also tried
dependencies:
lucky:
github: luckyframework/lucky
branch: issues/1827
I run shards update but I always end up with:
Unable to satisfy the following requirements:
- `lucky_task (~> 0.1.1)` required by `shard.yml`
- `lucky_task (0.1.1)` required by `shard.lock`
- `lucky_task (~> 0.2.0)` required by `lucky 1.0.0`
- `lucky_task (~> 0.1.1)` required by `avram 1.0.0`
- `lucky_task (~> 0.1.1)` required by `carbon 0.3.0`
Failed to resolve dependencies
it only passes this if I do:
dependencies:
lucky:
path: ../../crystal/lucky
version: 1.0.0
lucky_task:
path: ../../crystal/lucky_task
version: 0.1.1
but then it fails with:
Writing shard.lock
Building: lucky.exec
Building: lucky.watch
Building: lucky.gen.action
Building: lucky.gen.action.browser
Error target lucky.gen.action.browser failed to compile:
Showing last frame. Use --error-trace for full trace.
In tasks/gen/action/browser.cr:26:19
26 | page_task = Gen::Page.new
^--------
Error: undefined constant Gen::Page
cc @jwoertink
Ok, I decided to change my docker-compose to COPY your routable.cr directly over lib/lucky/src/lucky after shards install.
# Dockerfile
COPY ./lib/lucky/src/lucky/routable.cr /app/lib/lucky/src/lucky/routable.cr
Then, I made sure I created the Location header like this:
response.headers["Location"] = Api::Pessoas::Show.url(pessoa_id: pessoa.id)
This is semantically incorrect; it should be #path() instead, but this was causing problems in the stress test. Since my initial report, I did some other tweaks, but still, this is the Gatling results after the 3 minutes of stress. This is the baseline:
================================================================================
2023-09-15 00:06:43 205s elapsed
---- Requests ------------------------------------------------------------------
> Global (OK=116643 KO=637 )
> criação (OK=54635 KO=0 )
> busca inválida (OK=4190 KO=0 )
> busca válida (OK=9590 KO=0 )
> consulta (OK=48228 KO=637 )
---- Errors --------------------------------------------------------------------
> status.find.in([200, 209], 304), found 404 637 (100.0%)
The last item, "consulta" are "GET" requests for the Location URL. It misses quite a few. 637 knockouts.
Now, WITH your patch 1827 I have this result:
================================================================================
2023-09-15 00:19:11 205s elapsed
---- Requests ------------------------------------------------------------------
> Global (OK=117267 KO=13 )
> busca válida (OK=9590 KO=0 )
> busca inválida (OK=4190 KO=0 )
> criação (OK=54635 KO=0 )
> consulta (OK=48852 KO=13 )
---- Errors --------------------------------------------------------------------
> status.find.in([200, 209], 304), found 404 13 (100.0%)
It DID make a measurable difference. It's not constant. I tested a couple of times and another time, it was 7 KOs, but the order of magnitude is staggering, from 600 down to 10, so it makes a huge difference under stress.
As I said, using the correct .path(pessoa_id: pessoa.id)
is still faster and yields 0 KOs.
In any case, your patch seems to make a positive impact and I didn´t see any regressions in the scope of my test. Thanks!
Sweet! Thanks for testing that @akitaonrails. Sorry I missed your first response, but I'm glad you were able to test it. I think we're in a good spot to get a new release up, so I'll start working towards that.
Ref: https://twitter.com/AkitaOnRails/status/1697651647273427429?s=20
TL:DR; It seems that just using
SomeAction.url
causes a pretty significant hit to performance when doing a stress test.Here's a snippet of code from a source repo where this was tested. https://github.com/akitaonrails/rinhabackend-lucky-crystal-api/blob/000a86d0e1d9bdee245c7873f2f0440ed44c7cf8/src/actions/api/pessoas/create.cr#L19-L20
There's a lot of significant moving parts in this area, so it's possible the root cause could be located in some other shard, but we will use this issue to drop any notes or discoveries.
/cc. @akitaonrails