Closed fratzinger closed 6 months ago
Thanks for this! I will have a look at this today or tomorrow. I have been thinking about taking a look at some other perf things too. I re-wrote most of feathers-mongo recently: https://github.com/feathersjs/feathers/pull/3366 And I was thinking about taking a look here to see if there were any potential perf gains, especially around counting. I have some new features in Straightline that require lots of counting.
@fratzinger I made a few other tweaks to this branch that should give us some more performace gains.
attributes
property (aka sql SELECT). This eliminates the need for the adapter commons select()
function in most case.$limit: 0
, the adapter now uses count
instead of findAndCountAll
.paramsToAdapter
now returns a limit: 1
when an id is present.I also refactored the patch and remove methods. They had grown confusing with all the config merging. So they both now use some simple logic branches instead of trying to handle _getOrFind()
.
Let me know what you think.
@DaddyWarbucks Perfect! Looks great. Glad to have all the select()
removed. Thanks for the quick response and the further improvements!
I added a commit with a few changes (sorted by relevance):
params.$returning
for id !== null
: I don't like this! As valid as it is for id === null
, it's implemented poorly for id !== null
. There's no feathers standard to express this. Returning []
for a single element does not make any sense. The ReturnType is wrong - nearly every of my afterHooks would throw errors. Nor does returning undefined
make any sense, since it is not a standard of @feathersjs/adapter-commons
. This problem existed before this PR though. I don't use params.$returning
at all, so I'm maybe the wrong person to argue with but I would like to remove this for id !== null
. The CI is expected to fail because there are tests that ensure this behavior. Please let me know what you think about that and how we should proceed. I'm happy to make any changes based on your opionion (even if it is restoring the currently expected behavior 😊 )
_getOrFind
is not used in the code base anymore, we can think about removing it sometimes. It's not private though, so it's maybe used in some codebases. What do you think about marking it as deprecated with jsdoc and remove it in a future major release?
patch
with id === null
: Why do you add $limit: ids.length
for result = ...
? Is this necessary? I think it can be removed.
Disclaimer: This is not a problem of this PR and maybe should be handled in a separate PR but it can be considered as a bug: getOrder
in paramsToAdapter
for id !== null
: In the FindOptions
for id === null
there is order
. Why is not present in FindOptions
for id !== null
? I think we can easily add it.
Thought I had but dropped: Maybe we should replace Model.update
and Model.destroy
with instance.update
and instance.destroy
for id !== null
. I think for _patch
that's an easy change. But for _remove
it's not so easy because we need to handle paranoid deletedAt
. Also this can be considered a breaking change because the called sequelize hooks would be different. I'm unsure about the performance gains anyways.
I reviewed the open issues. As of now this PR will close #384, #382. I added them in the initial message so they get closed automatically when merged.
I have a couple little more things I wan to try tonight. This is exciting! I use this for my largest project and its exciting to know we are about to get some perf boosts.
To Answer some of your questions
1 - Agreed on all accounts. It is weird and could/should be marked as deprecated and removed later. The user can handle it in their own hooks if they don't want to return it to the client.
2 - Also agreed. With my refactor of patch/remove, I also noticed it was now unused but did not remove it for the aforementioned reasons.
3 - The limit: 0
could technically be removed for both get
and patch
. I did a little research last night and using limit on a primary key does not gain or lose any performance (SQL knows how to handle it), but for non primary/unique/indexed keys it can add some performance. It is understood that whatever the user provides as the service id
should be primary/unique/indexed...but maybe it's not? Setting it in paramsToAdapter
handles the get
, although it is not needed 99% of the time I still like that it intently states the purpose that an id has one result. For patch
...same story with get
, it may provide some benefits for non primary/unique/indexed keys. But your app has other problems if thats the case, like get
will break, etc. It can definitely be removed. The limit in paramsToAdapter
could be removed too, but I don't dislike it in that case.
4 - Totally. That is something I was going to add tonight too. I did similar in the latest PR for the mongo adapter. And there are a few other things I wanto to try that could really speed up single updates.
5 - I am not sure what to do there.
This is fun! I highly enjoy working on this. To your points again:
$returning: false
for id !== null
: I don't know how to deprecate that. Use console.warn
when it's used? I would like to remove it now and let this PR lead to a major release.Happy to see the results of your experiments. Resolving no 1 is still on my list. I'll add that after your changes so we don't have any conflicts.
I'll also want to update all dependencies and migrate from mocha to vitest. If you have any thoughts on that, please let me know. Otherwise I'll do that change after this PR.
I am working on a number of things now! I will let ya know when Ia am done.
I pushed quite a bit of unfinished code last night just to get it off my computer. This is more finalized now. I will write out a longer comment addresses your notes and some other things I changed.
I will try to summarize the changes made and the reasoning behind each.
returning
- @fratzinger has pointed out some things I missed about returning
. I have misused this in some cases. Its main purpose is for Model.update
where it indicates that the method should return results when the underlying DB supports it rather than use the RETURNING
statement. This would offer a large perf gain on multi patches because we wouldn't need to lookup the results after the update. Perhaps we should move back to params.$returning
. I don't love the $
in the param name.
You will notice that for single patch
and remove
we now use instance methods instead of the bulk method. This is to more consistently align with sequlize's hooks with that is actually happening. A single patch should call the model's beforeUpdate
afterUpdate
instead of beforeBulkUpdate
and afterBulkUpdate
.
I also eliminated the ignoreSetters
param. This was only used on create
, when it should have been on update
and patch
too. I attempted to further use raw
to address this, but kinda stepped on my own feet. More on that at the bottom.
This was a rather simple update. If $limit === 0
, then just use the Model.count
method. We should see a marginal perf benefit here. When using Model.findAndCountAll
in this case, sequelize would still call findAll({ limit:0 })
under the hood (and doesn't noop). I am unclear if it actually sends a SELECT * LIMIT = 0
or not. But now we just do the count.
Really no change here. The paramsToAdapter
does now add limit: 1
. With some quick research I found that this will make little to no difference for primary/unique/indexed keys. But I still like that it states some intent.
Not many changes here either. We are more intelligently using the select
function. And the code has moved around a bit, but it is otherwise unchanged.
Some good perf gains here.
Model.count
instead of _get
to validate the query.sequelize.include === false
(which I think is most use cases), then the instance does not have its associations and we can select/return it without an additional lookup. Else, do the _get
to get results with includes.instance.update
method to better use sequelize hooks. Furthermore we can skip the additional lookup if sequelize.include == false
(which I think is most use cases) then the instance does not have its associations and we can select/return it without an additional lookup. Else, do the _get
to get results with includes.So this method is where I wanted to really use returning
. It makes lots of sense here, but as @fratzinger suggested it may have different results on some of the other methods 😦 . For this method returning === true
could save us the an expensive additional lookup if the method actually returns its results.
instance.destroy
method to better use sequelize hooks.$select
here too but I don't remember the exact change.So I think there are some definite performance and consistent benefits here. But we likely need to rework some things around raw
and returning
. I also think this warrants a major update. We can eliminate some unused methods, unused params, etc. I know I goofed on some of the returning
stuff and it needs some work. Let me know what you think @fratzinger.
I was going back and forth on the sequelize.returning
vs. params.$returning
/params.returning
thing and ended up on saying, that it should be sequelize.returning
. While not perfectly happy with it, it is more consistent than having literally two options for the same thing.
I removed the sequelize.returning
option for single calls as discussed before. I changed the tests accordingly.
I hate the magic that comes with instance.update
. In 'single patch' I added a new test to see if we need to add $select: undefined
to the first _get(id, { raw: false }
call. Turns out the single patch is not working at all! I debugged the new test in sequelize source code (https://github.com/sequelize/sequelize/blob/main/packages/core/src/model.js#L4304). options.fields
is not ['name']
but an empty array... It does not detect the correct change. We could manually set the option fields: Object.keys(values)
but it feels like a weird hack that should be avoided in the first place.
@DaddyWarbucks wdyt? Please review this test https://github.com/feathersjs-ecosystem/feathers-sequelize/blob/perf/id-and-in/test/index.test.ts#L394 . Even if you remove the params containing $select
the result is not updated. I would like to go back to Model.update
instead of instance.update
.
I can imagine there are more quirks in the new changes. We probably should add a few more tests or at least release this PR as a pre-release.
@fratzinger Can you elaborate on what magic comes with instance.update
that wasn't there before (or that we don't want now)?
I made one final change to the single patch, using instance.set
and instance.save
instead of instance.update
seems to work. Phew...that was annoying and weird. The whole "changed" thing is obviously inconsistent by what we just saw. I do like this approach because it saves us from having to re-lookup the result...but I am not married to it. I think we got TONS of other benefits on the changes we made.
I haven't forgotten about this @fratzinger. I am going to run this in Straightline next week and see how it goes. It uses the full gamut of features and should be a good test rig. I will also be able to get some quantifiable performance metrics out of it as well. The app uses feathers-profiler
and actually has a dashboard that shows a bunch of metrics from the profiler. So I can switch between versions and see what kind of differences we are getting.
Nice! This looks good! Thanks for the insight.
I thought about releasing it as a pre-major, just to be safe without any trouble. I'll test it on our app then as well.
Do you want to release it or shall I do?
I have not yet updated the instance.set
to not loop over keys. And I was just going to install it locally with npm link
.
I removed the applyScope
method that had been marked as deprecated because we are making this a major release.
I made a few updates to paramsToAdapter
such that it does not use offset
or order
and sets limit: 1
when an ID is present. Otherwise the user could have done something like service.get(1, { query: { $skip: 1 } })
which probably shouldn't be allowed. And the DB should optimize to limit 1 result when using proper unique/indexed keys, but I still like the query and intent being more clear. I wrote a test for this too.
Local testing results
_find
method to break up the Model.count
and Model.findAndCountAll
works well. I was previously unsure if Sequelize would actually issue the DB query if $limit: 0
. But it does. So we are no longer wasting a DB query when using $limit: 0
. Nice!More testing to come later tonight or tomorrow!
I think this is ready to go. I installed it locally and stepped through a bunch of stuff and everything seems to be working as expected. Checking out the sequelize logs I could determine that counts, additional lookups, etc were doing what the changes intended.
I also updated the docs and added a few more tests.
@fratzinger Are you aware that when using instance.update
sequelize may not actually update the record if it does not detect a change? That was a surprise to me.
For example
const result = await app.service('items').get(1);
const updated = await app.service('items').update(1, result);
const updated = await app.service('items').patch(1, { name: result.name });
Sequelize detects that noting changed via its set
(or update
?) method and therefore does not issue the UPDATE
query. I am not sure if this was happening with the current master
branch because it used some different methods. But in the current branch if you do an update
or patch
with the same data as the current...no UPDATE
query is issued.
I noticed this while testing. I ran 100 updates and 100 patches where I was just mutating same values. Updates took substantially longer. For example
const result = await app.service('items').get(1);
for (let i; i < 100; i++) {
await app.service('items').update(result.id, result)
}
for (let i; i < 100; i++) {
await app.service('items').patch(result.id, result)
}
I didn't understand why updates where taking sooo much longer. It's because of how previous (to these latest commits) we would do a Model.build
for update
. So the instance had no way to determine what had changed and was therefore issuing the UPDATE
query. But the patch
method, where we would get the current instance and then instance.set(values)
was able to noop because nothing changed. I then updated the update
method to be more similar to patch
where we get the current instance (instead of doing a Model.count
and Model.build
) and achieved the same results.
I was able to confirm this by then running the following queries and noticed that UPDATE
was issued for each. Furthermore, the UPDATE
query only updated the name
property.
const result = await app.service('items').get(1);
for (let i; i < 100; i++) {
await app.service('items').update(result.id, {
...result,
name: `${result.name}-${i}`
})
}
for (let i; i < 100; i++) {
await app.service('items').patch(result.id, {
...result,
name: `${result.name}-${i}`
})
}
This seems weird and dangerous to me...but its how sequelize is meant to be used I guess. It's very fast/powerful with how it either noops or only updates the actual changed properties, buts its magic for sure. What do you think?
@fratzinger Pinging you again to get your thoughts.
Released as v8.0.0-pre.0
🎉
Included in this PR:
BREAKING:
service.applyScope()
toservice.ModelWithScope()
service._getOrFind()
method. Useservice._get()
orservice._find()
directly.params.$returning = false
. Useparams.sequelize.returning = false
instead (for bulk operations).patch
andremove
functions now use singleinstance.update
instead of staticModel.update
. So the single sequelize hooks get called.Other changes
paramsToAdapter
: ifid
is not null, addid
to the query intelligently.query[this.id] === id
we can skip adding the id again!(this.id in query)
we can just setquery[this.id] = id
safelyquery[this.id]
is guaranteed to be something. check for$and
and add to$and
patch
andremove
: Don't use[this.id]: { [this.Op.in]: ids }
ifids.length === 1
. Use[this.id]: ids[0]
insteadpatch
andremove
: If the first_find
call does not return any items (ids.length === 0
), return instantlyObject.assign
to object spreadservice.Op
. Use direct import from sequelize instead:import { Op } from 'sequelize';
Issues:
This is a coproduction of @DaddyWarbucks & @fratzinger. Thanks @DaddyWarbucks. This was fun!