Closed nonken closed 7 months ago
I've also noticed I'm having the same issue, I feel like this isn't the expected behaviour. The condition
implies that it applies to the whole index, and the composite
helper could be used to fill in the missing gaps for the other attribute
Hi @nonken π
You and @sam3d are correct, this isn't the best behavior in this case. As mentioned in @sam3d's post you can use the composite
method as a stop gap [example], but this is something that should be addressed. I will review this area of code and see what can be done to change this π
@nonken
I dug into this today, and I have a fix incoming, but I wanted to bring up a logical issue with your example. It's possible this could be a shorted sided API consideration with the condition
function.
The example update you provided looks like this:
entry
.patch({ id: '123', organizationId: '123' })
.set({ effectiveAt: 'n/a', accountId: "123", settledAt: "today" })
.go();
The expectation above is that the patch()
operation will write a pk
and sk
to the entriesBySettledDate
index but not the entriesByEffectiveDate
index. The user's goal is to keep In these indexes "sparse" unless the value equals "n/a".
In my fix this will be true, however the implementation of this strategy is not sound with the model provided. In its current form, the condition
callback only determines whether or not an index will be set
. What it won't do, currently, is "drop" an index if the condition fails.
For example (using your entity), if you create an item with effectiveAt: "today"
then that index will be added. Later, if you perform an update set
on that item with the value effectiveAt: "n/a"
, then ElectroDB will not write to that index. The problem, however, is that that index will still exist and it contain a stale value (effectiveAt: "today"
). You would then get effectiveAt: "n/a"
values back from that index π
@sam3d My gut is saying if an index condition
returns false
on update()
, patch()
, or upsert()
ElectroDB should actually "drop" that index. Would you agree? I believe this is a pretty critical oversight on my part.
@tywalch I think you're completely right. Similar to how a composite key is re-computed if any of its attributes change, the condition
property behaves in a similar way.
What's interesting is that it's more akin to the watch
property on attributes, where it technically only wants to re-compute if dependent attributes are modified. The composite keys can do this because their dependent attributes are known, but condition
doesn't have the same benefit. If there's an index condition
it would need to be checked on every operation against that entity.
There's an issue with this β without knowing dependent attributes or the prior state of the entity, during an update operation we don't know when the condition
state transitions between false
and true
.
true
-> false
βΒ This is actually OK (and what you're describing above). There is a bit of overhead in that we'll always be dropping the index's attributes, even if they were already not set. But that's probably more something for the documentation than it is an actual issue with the implementation.false
-> true
β This one is harder. We don't know the previous state of the index, so we have to also set this index's attributes every time condition
evaluates to true
, even if we didn't touch any of those attributes.Not 100% sure on a good solution here. A few ideas off the top of my head:
condition
(like with watch
on attributes) so we know when to re-compute the presence of the index. This would retain the current flexibility and fix index behaviour on mutations, with some added complexity for the entity index definitioncondition
to process the attributes defined on the index
so if we're re-computing the index we know to also check for its condition. The downside is that this is scoped to the whole index, and not to the individual keys, so we couldn't update solely GSI1PK without also having to update GSI1SK if condition
evaluates to true
condition
to return more than just a boolean? No clue on what the API for this would look like, but perhaps condition
could look at the attributes its passed in the mutation and make the decision to either false
(delete the index), true
(create the index), or null
(don't try to drop or create the index):condition: (attr) => {
// effectiveAt not changed, don't create OR drop the index
if (attr.effectiveAt === undefined) return null;
// Drop the index if needed
if (attr.effectiveAt === "n/a") return false;
// Otherwise re-create the index
return true;
},
Either way, if we then undergo a state transition because a dependent attribute changed, then the usual .composite()
helper can take over to fill out the index as needed (and error if not provided). And we know we did need to do that because the dependent attributes changed.
Actually wait I've been thinking about this some more, this might be more complex than I thought.
Going from true
/false
-> false
is always okay, because if condition
is false
then the indexes can always just be dropped and aren't dependent on the previous state of those attributes.
In the case of the condition going from false
-> true
, then your example makes sense. However, I feel like without some way to model that state transition, then we can't differentiate between false
-> true
and true
-> true
. Which is really annoying. We only want to set those index attributes when the condition flips (or if the attributes change), but without some way to model that then all of the index attributes must be set every single time.
Is it maybe possible to do some kind of check where if the condition
evaluates to true
then it applies an extra condition expression to ensure that the entity has the attributes for that index? And then it'll fail and force the use of the composite
helper in order to perform the update?
I mulled it around a bit today, but I think I got it; let me know what you think:
condition
callback will be invoked only when a composite attribute associated with an index is set
via an update
, patch
, or upsert
. [existing behavior]condition
callback is provided the attributes being set
on that particular operation, including the item's identifying composite attributes. [existing behavior]condition
callback returns true
, ElectroDB will attempt to create the index and all of its associated keys. If an index cannot be created because an update
operation only has enough context for a partial key, ElectroDB will throw. [the original issue here, fixed]If the condition
callback returns false
, the index and all of its associated keys will be removed from the item. [new behavior]
Item #1 above is the key to solving the issue you bring up in your first comment, and it's actually what we do currently. This means that condition
would only be called when an index must be recalculated. furthermore, as described in #3, ElectroDB will actually throw if your update
operation (set
and remove
) lacks a full composite context and would result in a "partial" key. This would mean that all * -> true
transitions are already validated to have all the composite parts necessary to recreate the complete index already.
Let me know if this doesn't make sense or I am missing anything. I am writing some tests now, but likely won't be making a PR until tomorrow. I'm definitely not looking to rush a change like this.
I think this makes sense! With the two caveats worth documenting:
condition
cannot be solely dependent on attributes outside of that composite index. This means that patterns like the following wouldn't work:{
condition: (attr) => !attr.deletedAt,
pk: { field: "GSI1PK", composite: [] },
sk: { field: "GSI1SK", composite: ["createdAt"] },
}
condition
cannot be updated unless all composite attributes are provided, not just the ones from the field. For example:{
pk: { field: "GSI1PK", composite: ["storeId"] },
sk: { field: "GSI1SK", composite: ["status", "createdAt"] },
}
// Updates GSI1PK, no problem
entity.patch({ id: "29381" }).set({ storeId: "SJR" }).go();
But in order to perform the same operation with a condition
:
{
condition: (attr) => attr.status !== "deleted",
pk: { field: "GSI1PK", composite: ["storeId"] },
sk: { field: "GSI1SK", composite: ["status", "createdAt"] },
};
// Condition returns true, so we need to set GSI1PK and GSI1SK in case the index
// doesn't already exist. We now MUST provide status and createdAt
entity.patch({ id: "29381" }).set({ storeId: "SJR" }).go();
// So this must be done instead every time
entity
.patch({ id: "29381" })
.set({ storeId: "SJR" })
.composite({ status: prevEntity.status, createdAt: prevEntity.createdAt })
.go();
DynamoDB doesn't propagate to a composite index unless both attributes are present, right? So to remove an entity from a sparse index you technically don't have to drop both, just one of them. Maybe the result of condition
only gets applied to the composite key field that's being mutated?
A condition cannot be solely dependent on attributes outside of that composite index. This means that patterns like the following wouldn't work.
My gut is that the user would determine if they wish to check attributes outside the composite; i.e. that Electro would continue to supply the condition
callback with all values being "set". I do agree it is necessary to reiterate in the docs that the attributes being passed represent the values provided in the operation and not the item as it is stored.
An index with a condition cannot be updated unless all composite attributes are provided, not just the ones from the field. For example: // Condition returns true, so we need to set GSI1PK and GSI1SK in case the index // doesn't already exist. We now MUST provide status and createdAt
This is a very, very good point. You're saying that because this could be the first time the condition
callback has returned true
, but the user only provided enough data on update to create one out of the two keys. In that case, it is also possible that the stored item actually has all the values necessary to form the other key. In this case the user has an expectation that the index will be written, but in reality, they now have an orphaned index record that is not consistent with the underlying attributes.
Great catch! This is sort of the opposite of the issue I brought up above, I had not considered this π€―
DynamoDB doesn't propagate to a composite index unless both attributes are present, right? So to remove an entity from a sparse index you technically don't have to drop both, just one of them. Maybe the result of condition only gets applied to the composite key field that's being mutated?
How might this work?
- An index with a condition cannot be updated unless all composite attributes are provided, not just the ones from the field. For example:
Do you think this problem becomes "solved" by adding this requirement? It sort of seems like it? That said, this will be difficult for newbies to understand, not to mention this would become a request time validation which makes it doubly hard for users to anticipate.
While it does feel like we're close to solving this, I don't want to be too precious about the current API for handling this. I haven't squared the circle in my mind, but this topic does feel like it could have a meaningful overlap with Ross Gerbasi's request for the ability to build
and drop
specific indexes. His need revolves around backfilling new created indexes on existing items.
I sat down recently and thought about what that API would look like. Creating an API for dropping an index was very straight forward, but I found an API for building a specific index had a lot complexities due to potential attribute dependencies. The simplest approach would be to simply accept composite values "as-is" (i.e. without any setter processing) and add an automatic eq
condition on each attribute provided. Additionally, this functionality wouldn't play nice with condition
; it would likely have to skip the check to work as anticipated.
Thinking through that feature had me thinking there must be some kind of third way. Does this spark anything in you?
Hey @tywalch so I must admit I haven't spent much time thinking about a broader API and requirements of the library. I have no doubt this is much more complex. We would be good with a very specific Entity
reindex
method. Expectations would be I provide all attributes to write all indexes or provide some parameters to just target a few or one index.
The use case we are looking to handle is one where attributes already exist in an Entity
, so all the data is there, we just need to have better access to it. So we will be scanning the entity and running this process against all items. Maybe electro can even provide help here with a utility to reindex all of an Entity quickly. A combination of concurrent PUTs and queuing for tables with millions of items. The other tool we've used for this is Dynobase but its import is slow compared to a custom script.
Not sure how much this connects with the condition
conversation here. However there are likely all sorts of other use cases for index scenarios. Another obvious one would be attributes that do not exist are being added and indexed.
We would also need to keep our createdAt
and updatedAt
dates. Here's an example of how we configure these in our entities is below. Not sure what the best way forward to keep these values unchanged while we add/remove indexes
createdAt: {
type: 'string',
readOnly: true,
required: true,
default: getNowISO,
set: getNowISO,
},
updatedAt: {
type: 'string',
watch: '*',
required: true,
default: getNowISO,
set: getNowISO,
},
I ran into this issue today as well as I'm new to ElectroDB but using sparse index pretty heavy.
My use-case is similar to what have been said by others already.
state: 'ACTIVE' | 'INACTIVE'
and I just want to have ACTIVE ones in the GSI1 index and get removed once I update the state.
Any idea if there will a fix soon? Or is there any workaround?
Small update: this is still a WIP, I made significant progress this afternoon, still ironing out tests for edge cases π
GH Closed this issue because I merged a fix, but I'm reopening it to keep discussion open in case there are any issues.
@nonken You can see the impact by using the original playground link you provided π
@tywalch this is so good, thank you a million times for diving into this. Can we somehow donate, as in, does the sponsoring work for you?
@tywalch this is so good, thank you a million times for diving into this. Can we somehow donate, as in, does the sponsoring work for you?
No problem, glad it worked and I'm happy you raised this because it helps everyone!
Yup, I'm setup for sponsoring via GitHub's native sponsoring option (if that works) -- thank you so much for considering me for sponsorship!
Closing, but feel free to reopen!
This update unfortunately started causing issues for us. One example use case would be something like this
previously we had a sparse index for accessing chats by user id (see example entity definition below), and it worked well with v2.13.1. For instance, if a user logs in while having an active chatsession, the unauthenticated chat session is correctly added to the "byUser"-index when the "loggedInUserId" attribute is set (for unauthenticated users it is undefined).
It seems this is no longer supported, or is there some way to still achieve this? The upsert operation we used to create/update the message entries started failing, which was unexpected for a minor version upgrade.
I believe this is a valid use case for sparse indices, and it's unfortunate if the library no longer supports it.
{
model: {
entity: 'chatConversation',
version: '1.0',
service: 'chatbot',
},
attributes: {
conversationId: {
type: 'string',
required: true,
},
loggedInUserId: {
type: 'string',
required: false,
},
messages: {
type: 'list',
items: {
type: 'any',
},
required: true,
},
},
indexes: {
byConversationId: {
pk: {
field: 'pk',
composite: ['conversationId'],
casing: 'none',
},
sk: {
field: 'sk',
composite: [],
casing: 'none',
},
},
byUser: {
index: 'gsi1pk-gsi1sk-index',
type: 'clustered',
condition: (attr: any) => {
return !!attr.loggedInUserId;
},
pk: {
field: 'gsi1pk',
composite: ['loggedInUserId'],
casing: 'none',
},
sk: {
field: 'gsi1sk',
composite: ['conversationId'],
casing: 'none',
},
},
},
};
Hi @severi
Firstly, I want to say that I am sorry your existing code broke do to this fix. I try very hard to make sure this does not happen. Ultimately in choosing between a major
and minor
update, I chose a minor
update because the original implementation had such a significant implementation flaw it was not viable as it was.
As for how you might achieve what you had, I have two potential approaches for you:
Use get
and set
to apply an empty string when no loggedInUserId
is provided (example). This approach ideally won't require your app code to change.
You could simply pass an empty string for loggedInUserId
instead of omitting the value (example). This might require some change in your app code to account for an empty string vs undefined
Again, please accept my apology for the unexpected errors in your app, and let me know if any of the above will work for you; If the above won't work in your case, I can work to maybe create an escape hatch (via an execution option or some type of configuration) for this requirement.
Describe the bug I have the following index:
When updating an item using
.update
the resulting DynamoDB query only hasgsi3sk
set and notgsi3pk
. When making the same call using.upsert
it works, but I need to provide all available fields which requires me to fetch the item first.I dug in a bit deeper and it looks like that _getIndexImpact will not return the
pk
in my case because organizationId is not provided in the attributes. Thesk
does get set resulting in a record without apk
.It can be that I am not understanding the intend correctly, but my expectation was that this would work for update or patch calls.
ElectroDB Version 2.13.1
ElectroDB Playground Link Playground
Entity/Service Definitions Include your entity model (or a model that sufficiently recreates your issue) to help troubleshoot.
Expected behavior I am unsure what the correct way of using
condition
here is.