Closed petkaantonov closed 8 years ago
a pull request or some other request for specific action is likely to get more attention here fwiw
Well I request your opinion before I make it PR ready (it's a normal user module right now)
I like the idea of making url faster, but replacing the whole module frightens me a little
I think there is nothing wrong with using a faster parser, as it's just an implementation detail, as long as the API stays the same and all existing tests pass
I would love to see this in io.js
@xaka It's about subtle things, like data properties vs. accessor properties. It's hard to foresee everything once whole module changes. This requires major version bump at least.
That's a good point. Switching from properties to accessors means breaking the API, as things like console.log(...) would work differently, and people might be relying on that stdout output for their own reasons, etc.
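To make the subtlety concrete, here is a minimal sketch (the `LazyUrl` class and its `_raw`/`_hash` fields are illustrative, not the actual module code) contrasting a plain data property with a lazy accessor defined on the prototype:

```javascript
// Hypothetical lazily-parsed url object: 'hash' is an accessor on the
// prototype that parses the raw string on first access and memoizes it.
function LazyUrl(raw) {
  this._raw = raw;        // own data property holding the unparsed input
  this._hash = undefined; // memoization slot
}

Object.defineProperty(LazyUrl.prototype, 'hash', {
  enumerable: true, // enumerable, but still lives on the prototype
  get() {
    if (this._hash === undefined) {
      const i = this._raw.indexOf('#');
      this._hash = i === -1 ? null : this._raw.slice(i);
    }
    return this._hash;
  }
});

const eager = { hash: '#frag' };           // plain data property
const lazy = new LazyUrl('/path#frag');    // accessor property

console.log(lazy.hash);             // '#frag' (parsed on first access)
console.log(Object.keys(eager));    // [ 'hash' ]
console.log(Object.keys(lazy));     // [ '_raw', '_hash' ] -- only own properties
console.log(JSON.stringify(eager)); // '{"hash":"#frag"}'
console.log(JSON.stringify(lazy));  // '{"_raw":"/path#frag","_hash":"#frag"}'
```

`Object.keys` and `JSON.stringify` only see own enumerable properties, so code that enumerated or serialized the old url object would observe the internals instead of the public fields — exactly the kind of subtle difference being discussed.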
this is one of the reasons why i'm trying to get rid of the util._extend() usage. otherwise, i fear too much would break.
Can't we lazily patch the stringify and json transforms to avoid an API change/break?
The getters can be made enumerable (but they're still on the prototype) and the Url class can implement a toJSON method. So that leaves util._extend(), which should be an internal method?
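The toJSON idea can be sketched like this (hypothetical class, illustrative only): even when the components are accessors on the prototype, a `toJSON` method lets `JSON.stringify` produce the public shape instead of the internal fields.

```javascript
// Hypothetical Url class whose 'hash' component is a prototype accessor.
function Url() {
  this._hash = '#frag'; // internal slot
}

Object.defineProperty(Url.prototype, 'hash', {
  enumerable: true,
  get() { return this._hash; }
});

// JSON.stringify calls toJSON if present, so serialization can expose
// the public fields and hide the internals.
Url.prototype.toJSON = function () {
  return { hash: this.hash };
};

console.log(JSON.stringify(new Url())); // '{"hash":"#frag"}'
```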
Making the properties just eager data properties could also be considered; it would still be 11x faster. But often many of the properties are not needed (especially .href?), so that could be a bummer.
fixed at #1561
Reopening since it was reverted :smile:
Let's use this issue for further discussion on the new url implementation.
Previous discussions at #933, #1561, #1591 and #1602.
I think we could probably plan to get this into 3.0.0 with time to let the ecosystem know / start updating.
My question would be how much of a perf difference switching from accessors to plain properties would make. If we can keep the API and get most of the perf, that'd be a win-win.
perf is not related to accessors, but data property api is terrible and not like browser api at all :)
I'd question how browserlike we want to go. Do we really want something like this?
> document.location.hash = null
> document.location.hash
"#null"
I assume the intention to follow browsers comes from the wish of writing "isomorphic" apps and following standards in general (even though it gives you nonsense like #null)?
I'd prefer not to have to run everything through toString(), which is probably what the browser does.
(I guess that's maybe not a bad thing?)
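For reference, the browser-style setters really do coerce their input to a string, which is where `#null` comes from. This can be checked against the WHATWG `URL` class (global in modern Node, available via `require('url')` in older versions):

```javascript
// The WHATWG URL hash setter coerces its argument to a string,
// so assigning null produces the literal fragment 'null'.
const u = new URL('http://example.com/');
u.hash = null;
console.log(u.hash); // '#null'
console.log(u.href); // 'http://example.com/#null'
```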
(copypasta'd from #1591)
This will probably find its way back to the TC at some point, but as I explain in npm/npm#8163, @isaacs and I don't think the right thing to do is to change npm to match the new url
interface. I'm happy to discuss our reasoning, but ultimately @isaacs made the call, and I agreed, that this set of changes is sufficiently problematic to require a lot more discussion and careful planning to land.
I think we could probably plan to get this into 3.0.0 with time to let the ecosystem know / start updating.
I will simply improve the performance; if it wasn't for the lazy .href it could even be done in a patch. But making only .href an accessor should have no or minimal compatibility issues.
I'd question how browserlike we want to go. Do we really want something like this?
> document.location.hash = null
> document.location.hash
"#null"
The point is not this, but that accessors would have allowed all components of the url to be kept automatically up to date when one is changed.
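That consistency guarantee is what the WHATWG `URL` class provides today; a quick demonstration of components staying in sync when one is changed:

```javascript
// Changing one component of a WHATWG URL automatically updates href,
// so all components always agree with each other.
const u = new URL('http://example.com/a?x=1#top');

u.search = '?x=2';
console.log(u.href); // 'http://example.com/a?x=2#top'

u.pathname = '/b';
console.log(u.href); // 'http://example.com/b?x=2#top'
```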
I agree it would be much better to match browsers (as @isaacs has always suggested; surprised to see the about-face) and to maintain a consistent state between the different properties.
If we can't manage to do that with require("url"), then maybe we'll instead want to work on a standards-compliant URL global that does match browsers (including consistency guarantees). That would allow the more drastic changes necessary for compliance, e.g. the introduction of searchParams and so on. This might fall naturally out of the work I'm doing in V8/Blink at my day job anyway (e.g. it might become just turning on a build flag in V8).
My suggestion would be:
- In io.js 2.x use the same parser as before, but emit a warning on deletion. A few ideas:
  - make the properties configurable: false, so if people try to delete one the operation will fail (strict mode only)
  - warn in url.format (performance hit?)
- In io.js 3.x drop support for that.
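The `configurable: false` idea can be sketched as follows (the object and property names here are illustrative, not the real module's):

```javascript
'use strict';
// A property defined with configurable: false cannot be deleted;
// in strict mode the delete attempt throws a TypeError.
const parsed = {};
Object.defineProperty(parsed, 'host', {
  value: 'example.com',
  enumerable: true,
  writable: true,
  configurable: false,
});

try {
  delete parsed.host;
} catch (e) {
  console.log(e instanceof TypeError); // true
}
console.log(parsed.host); // still 'example.com'
```

In sloppy mode the delete would instead fail silently, which is why the suggestion notes it only surfaces as an error in strict mode.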
If npm won't change its url interface until then, we should consider applying a floating patch.
In case anyone didn't notice, 2.0.0 was released so anything backward-incompatible (or probably even annoying, like a warning) would have to be in 3.0.0.
Well there is nothing wrong with introducing a warning in a minor release.
The intention is not to follow browsers for the good of writing isomorphic apps.
The intention is to follow url resolution logic like browsers, so that crawlers and other Node.js programs behave reasonably. The url object parsing is based on the browser location
global and <a>
html tag fields, but it also has to support being passed to http.request
and friends. So, it's not completely identical, and needn't be.
A 100% isomorphic browser-style accessor-using auto-updating url object is a great idea. It belongs in npm, not in core.
Will this make it into 4.0.0?
That depends on @petkaantonov right now. See https://github.com/nodejs/io.js/pull/1650.
#1650 is continued at #2303 now.
Refs: https://github.com/nodejs/node/pull/7448 ... a WHATWG URL Parser implementation. It's a bit slower than url.parse() currently, but more correct to the standard.
Closing given the lack of continued activity and discussion on this. Can reopen if necessary.
@jasnell @petkaantonov testing this using node 7.5.0 on 16.04 (dual core kvm Intel Xeon E3-12xx v2) still shows a huge difference. We found out about this after cpu profiling our application, which showed url.parse takes most of the time, and we can't avoid making requests. We have switched to fast-url-parser for now. Any info on this maybe?
node benchmark/nodecore.js
misc/url.js parse(): 40645.099
misc/url.js format(): 36836.169
misc/url.js resolve("../foo/bar?baz=boom"): 38703.878
misc/url.js resolve("foo/bar"): 39136.199
misc/url.js resolve("http://nodejs.org"): 37845.184
misc/url.js resolve("./foo/bar?baz"): 38815.977
node benchmark/urlparser.js
misc/url.js parse(): 184957.14
misc/url.js format(): 170904.26
misc/url.js resolve("../foo/bar?baz=boom"): 181324.04
misc/url.js resolve("foo/bar"): 169483.69
misc/url.js resolve("http://nodejs.org"): 180839.33
misc/url.js resolve("./foo/bar?baz"): 174857.71
I just hit an issue in production where extremely long referer urls (3000 chars, 5500 chars, and 9000 chars) were causing our node instances to spike CPU usage, and we tracked it down to the url library. @pkoretic your fast-url-parser is much more performant.
Anyone have an update on when this will be merged? I'll switch to fast-url-parser for now.
Latest node.
$ node -v
v12.12.0
toxa@5fb84e50fca4:/tmp/urlparser$ node benchmark/nodecore.js
misc/url.js parse(): 68502.876
misc/url.js format(): 60017.421
misc/url.js resolve("../foo/bar?baz=boom"): 59991.764
misc/url.js resolve("foo/bar"): 64876.780
misc/url.js resolve("http://nodejs.org"): 63232.093
misc/url.js resolve("./foo/bar?baz"): 64872.514
toxa@5fb84e50fca4:/tmp/urlparser$ node benchmark/urlparser.js
misc/url.js parse(): 170138.65
misc/url.js format(): 191450.64
misc/url.js resolve("../foo/bar?baz=boom"): 208052.98
misc/url.js resolve("foo/bar"): 271166.70
misc/url.js resolve("http://nodejs.org"): 280692.39
misc/url.js resolve("./foo/bar?baz"): 255188.00
Still 2.5x:
$ node -v
v16.14.0
$ node benchmark/nodecore.js
misc/url.js parse(): 78603.333
misc/url.js format(): 72242.340
misc/url.js resolve("../foo/bar?baz=boom"): 71972.167
misc/url.js resolve("foo/bar"): 78404.141
misc/url.js resolve("http://nodejs.org"): 78088.377
misc/url.js resolve("./foo/bar?baz"): 78914.819
$ node benchmark/urlparser.js
misc/url.js parse(): 169323.52
misc/url.js format(): 199909.74
misc/url.js resolve("../foo/bar?baz=boom"): 198722.90
misc/url.js resolve("foo/bar"): 226340.86
misc/url.js resolve("http://nodejs.org"): 159880.62
misc/url.js resolve("./foo/bar?baz"): 284006.64
(Original node issue https://github.com/joyent/node/issues/6788)
I have rewritten the url parser module of node core, as it was/is a serious bottleneck in some of the techempower benchmarks.
Running node's urlparser benchmark using iojs, it's still 16x faster when not retrieving properties and 11x faster when retrieving all properties (which are lazy getters in my implementation). In absolute terms the current iojs urlparser manages 25k parses per second vs 400k lazy / 270k eager parses per second.
format and resolve are also affected in similar magnitudes.