Open julealgon opened 3 years ago
Hi @julealgon,
Good afternoon.
Please refer to the page .NET: Object Persistence Model -> Supported Data Types for the list of supported data types. Looks like Guid
is not the supported data type. Refer to page Mapping Arbitrary Data with DynamoDB Using the AWS SDK for .NET Object Persistence Model for example on how support arbitrary data types. Basically it involves developing custom TypeCpnverter
.
As a workaround, you may use string
instead of Guid
type for Id
property. For using Guid
type, modifying your code to below works fine:
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;
using Amazon.DynamoDBv2.DocumentModel;
using System;
namespace DynamoDbSecondaryIndexTest1748
{
class Program
{
static void Main(string[] args)
{
AmazonDynamoDBClient amazonDynamoDBClient = new AmazonDynamoDBClient();
DynamoDBContext dynamoDBContext = new DynamoDBContext(amazonDynamoDBClient);
var loadResult = dynamoDBContext.LoadAsync<Something>("75f05566-2209-4f3d-ba5a-cfd9da5eab62", new DynamoDBOperationConfig { IndexName = "Name-index" }).GetAwaiter().GetResult();
Console.WriteLine("Loaded Result: " + loadResult);
}
}
[DynamoDBTable("TestTable")]
internal sealed class Something
{
[DynamoDBHashKey]
[DynamoDBProperty(typeof(GuidTypeConverter))]
public Guid Id { get; set; }
[DynamoDBGlobalSecondaryIndexHashKey]
public string Name { get; set; }
public override string ToString()
{
return $"[Id: {Id}, Name: {Name}]";
}
}
public class GuidTypeConverter : IPropertyConverter
{
public DynamoDBEntry ToEntry(object value)
{
if(value == null || !(value is Guid)) throw new ArgumentOutOfRangeException();
DynamoDBEntry entry = new Primitive
{
Value = ((Guid)value).ToString()
};
return entry;
}
public object FromEntry(DynamoDBEntry entry)
{
Primitive primitive = entry as Primitive;
Guid guidValue;
if (primitive == null || !(primitive.Value is String) || string.IsNullOrEmpty((string)primitive.Value) || !Guid.TryParse((string)primitive.Value, out guidValue))
throw new ArgumentOutOfRangeException();
return guidValue;
}
}
}
NOTE: The above code should be used only for reference. Kindly validate and test the GuidTypeConverter
logic before trying to use the Guid type converter. We are not responsible for any data loss or production issues.
Please let me know if this helps and if this issue could be closed.
Thanks, Ashish
@ashishdhingra I believe you might have misunderstood my request here. Please note that Guid
is working fine when loading and saving entities. I have both the primary key, as well as another property on my model, mapped as Guid
.
The problem here happens with the global secondary index, which has a string
mapped property. Once I try to use LoadAsync
passing a string
value and the name of the secondary index, I get the exception above which is attempting to convert my string
argument into Guid
(which does not make sense since I'm not querying on the primary index).
I kindly request that you take another look at this. Even if the page you provided doesn't mention Guid
as a supported type, I believe the issue here is not directly related to that.
Also @ashishdhingra , I'd suggest to avoid providing example source code where you block on async code, as that is a bad practice that can lead to hard to debug problems in .Net. In your sample Main
, you could just make Main
itself async.
People not aware of the issues might end up naively copying those code sections into their code.
Hi @julealgon,
Good morning.
Thanks for your inputs. I used your code exactly where it tries to use the secondary index, to reproduce the issue. When not using the type converter, the QueryAsync
call didn't throw any exception, but it also didn't returned any result. The LoadAsync
call threw the type cast exception mentioned. After the use of type converter, the LoadAsync
call (with the secondary index as 2nd parameter), worked properly. I will try to pinpoint the exact code in SDK which throws the error.
In the mean while, in order to get proper understanding of the issue, please share the following:
Exact DynamoDB column configuration in AWS Console and the sample code which is working fine for you when loading and saving entities, without use of a type converter.
Reconfirm what you meant by The problem here happens with the global secondary index, which has a string mapped property. Once I try to use LoadAsync passing a string value and the name of the secondary index,
. I guess you are referring to the code below (which is what I used exactly):
dynamoDBContext.LoadAsync<Something>("75f05566-2209-4f3d-ba5a-cfd9da5eab62", new DynamoDBOperationConfig { IndexName = "Name-index" })
Regarding the example code where I'm blocking async
call, that is just for reference purposes as mentioned in the note. I agree with your assertion that we should not block async
calls in live production code.
Thanks, Ashish
@ashishdhingra , can you clarify on this?
After the use of type converter, the LoadAsync call (with the secondary index as 2nd parameter), worked properly.
Do you mean you can use LoadAsync
with the secondary index and actually return the data, or is just not returning anything? If it is not returning anything even though there is data with that value in the secondary index, then it confirms my suspicion that it is actually ignoring the IndexName
setting and actually searching on the primary index instead.
In the mean while, in order to get proper understanding of the issue, please share the following:
- Exact DynamoDB column configuration in AWS Console and the sample code which is working fine for you when loading and saving entities, without use of a type converter.
Primary index:
Id (string)
Global Secondary index:
Name-index
Name (string)
Id, Name
This configuration allows me to:
LoadAsync
with the primary key (.Net Guid
): objects are fetched properlySave
with my entity: entity is persisted properly, including Guid
ID which I generate before the callQueryAsync().GetNextSetAsync().SingleOrDefault()
with the secondary key + index name to fetch the entity using the secondary indexI cannot currently use LoadAsync
to fecth from the secondary index.
- Reconfirm what you meant by The problem here happens with the global secondary index, which has a string mapped property. Once I try to use LoadAsync passing a string value and the name of the secondary index,. I guess you are referring to the code below (which is what I used exactly):
Affirmative. That call produces InvalidCastException
which I believe is incorrect behavior: it should not be attempting to cast the string into a Guid
since the secondary index property that is marked in my model is not a Guid
, but a string
property:
...
[DynamoDBHashKey]
public Guid Id { get; set; }
[DynamoDBGlobalSecondaryIndexHashKey]
public string? Name { get; set; }
...
nor should it require the creation of a converter from string
to Guid
as Guid
works out of the box.
I performed another test on my end which I figured could potentially fix the issue.
I realized that the DynamoDBGlobalSecondaryIndexHashKey
attribute has an index name property, so I changed my model to:
[DynamoDBHashKey]
public Guid Id { get; set; }
[DynamoDBGlobalSecondaryIndexHashKey("Name-index")]
public string? Name { get; set; }
However, I still get the same error when attempting to use LoadAsync
.
Hi @julealgon,
Upon further investigating and using the code below (NOTE: To keep things simple, I used .GetAwaiter().GetResult()
):
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;
using Amazon.DynamoDBv2.DocumentModel;
using System;
namespace DynamoDbSecondaryIndexTest1748
{
class Program
{
static void Main(string[] args)
{
AmazonDynamoDBClient amazonDynamoDBClient = new AmazonDynamoDBClient();
DynamoDBContext dynamoDBContext = new DynamoDBContext(amazonDynamoDBClient);
var loadResultSecondary = dynamoDBContext.LoadAsync<Something>("87a8f81a-9fe9-48cb-a7c0-a7a6788d91d3", new DynamoDBOperationConfig { IndexName = "Name-index" }).GetAwaiter().GetResult();
Console.WriteLine("Loaded Result: " + loadResultSecondary);
}
}
[DynamoDBTable("TestTable")]
internal sealed class Something
{
[DynamoDBHashKey]
public Guid? Id { get; set; }
[DynamoDBGlobalSecondaryIndexHashKey]
public string Name { get; set; }
public override string ToString()
{
return $"[Id: {Id}, Name: {Name}]";
}
}
}
it gave the mentioned error, whether I used dynamoDBContext.LoadAsync<Something>("<some_guid_string_value>", new DynamoDBOperationConfig { IndexName = "Name-index" })
or dynamoDBContext.LoadAsync<Something>("<some_guid_string_value>")
. Using the below stack trace:
System.InvalidCastException
HResult=0x80004002
Message=Unable to cast object of type 'System.String' to type 'System.Guid'.
Source=AWSSDK.DynamoDBv2
StackTrace:
at Amazon.DynamoDBv2.Converter`1.TryTo(Object value, Primitive& p)
at Amazon.DynamoDBv2.Converter.TryToEntry(Object value, DynamoDBEntry& entry)
at Amazon.DynamoDBv2.Converter.ToEntry(Object value)
at Amazon.DynamoDBv2.DynamoDBEntryConversion.ConvertToEntry(Type inputType, Object value)
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.ToDynamoDBEntry(SimplePropertyStorage propertyStorage, Object value, DynamoDBFlatConfig flatConfig, Boolean canReturnScalarInsteadOfList)
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.ToDynamoDBEntry(SimplePropertyStorage propertyStorage, Object value, DynamoDBFlatConfig flatConfig)
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.ValueToDynamoDBEntry(PropertyStorage propertyStorage, Object value, DynamoDBFlatConfig flatConfig)
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.MakeKey(Object hashKey, Object rangeKey, ItemStorageConfig storageConfig, DynamoDBFlatConfig flatConfig)
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.LoadHelperAsync[T](Object hashKey, Object rangeKey, DynamoDBOperationConfig operationConfig, CancellationToken cancellationToken)
at Amazon.DynamoDBv2.DataModel.DynamoDBContext.LoadAsync[T](Object hashKey, DynamoDBOperationConfig operationConfig, CancellationToken cancellationToken)
at DynamoDbSecondaryIndexTest1748.Program.Main(String[] args) in C:\Users\ashdhin\source\repos\DynamoDbSecondaryIndexTest1748\DynamoDbSecondaryIndexTest1748\Program.cs:line 19
I was able to trace the call to TryTo. Basically when you pass string value, it tries to use (Guid)<string_value>
to get the Guid
value type. This call fails giving error Cannot convert type 'string' to 'System.Guid'
. You may try to validate this by executing (Guid)"87a8f81a-9fe9-48cb-a7c0-a7a6788d91d3"
in Visual Studio immediate window, you will get the same error.
The workaround here is to execute the call dynamoDBContext.LoadAsync<Something>(Guid.Parse("87a8f81a-9fe9-48cb-a7c0-a7a6788d91d3"), new DynamoDBOperationConfig { IndexName = "Name-index" })
instead. Notice the use of Guid.Parse()
. This works successfully without using type converter.
Please let me know if this works for you and let me know if this issue could be closed.
Thanks, Ashish
@ashishdhingra
Basically when you pass string value, it tries to use (Guid)
to get the Guid value type.
Why though? It shouldn't be trying to do that, since the index I'm querying does not have a Guid
as the hashkey: it is a string
.
The workaround here is to execute the call dynamoDBContext.LoadAsync
(Guid.Parse("87a8f81a-9fe9-48cb-a7c0-a7a6788d91d3"), new DynamoDBOperationConfig { IndexName = "Name-index" }) instead. Notice the use of Guid.Parse(). This works successfully without using type converter.
This is not a workaround, this is a different query altogether. Please note that the string Name
property which is the hash key for the secondary index, is not a guid: it is actually textual data entered into the system by users, so it is mapped as string
both on Dynamo as well as on my C# model class. This is contrary to the primary index hash key, which is a Guid
and is mapped in Dynamo as string
, and in code as Guid
. The primary index Load
is working fine, we are talking about the secondary index here.
I hope that clarifies the issue. Let me know if you need additional info.
Hi @julealgon,
The LoadAsync
is part of DynamoDB Object Persistence model (similar to ORM frameworks for relational databases). For the mentioned problem, you have used Guild
at C# object level instead of string
at DynamoDb level. DynamoDB Object Persistence model can map arbitrary data types to DynamoDB properties via type converters. Also notice that first parameter of LoadAsync
is of type object
. Hence before executing the query, it tries to map the object
type value to proper object (Something
) property type (Guid
) value. The cast (T)<some_value>
is a valid call since DynamoDb out of box only supports simple types as mentioned earlier. Hope this gives some insights into 1st question. Kindly note that Object Persistence Model is designed to work with arbitrary types. Feel feel to explore SDK code to find more details of Object Persistence design.
For the 2nd question, DynamoDB SDK's LoadAsync
method calls GetItem service API operation. If you refer Service API reference for GetItem, it expects Key request parameter which is a A map of attribute names to AttributeValue objects, representing the primary key of the item to retrieve.
. Hence in your case, it should be the Guid object property. The .NET API reference for DynamoDBContext.LoadAsync also specifies the first parameter as Hash key element of the target item.
. I'm not sure on how using 2nd parameter with secondary index name affects the GetItem
service operation.
I'm not sure if for your use case (per your 2nd question), you intend to use Query
operation.
Hope this helps.
Thanks, Ashish
The LoadAsync is part of DynamoDB Object Persistence model (similar to ORM frameworks for relational databases). For the mentioned problem, you have used Guild at C# object level instead of string at DynamoDb level.
Incorrect. The problem I'm having is not with the primary key (which is mapped as Guid
in C# code) but with the hash key for the secondary index, which is string
on both ends.
DynamoDB Object Persistence model can map arbitrary data types to DynamoDB properties via type converters.
I understand that, however the Guid
primary key is working fine for me without having to create any converters. I think you should probably update your documentation to clarify the behavior around Guid
s in the Object Persistence Model, since they appear supported even though they are not listed there.
Also notice that first parameter of LoadAsync is of type object. Hence before executing the query, it tries to map the object type value to proper object (Something) property type (Guid) value.
I understand the error, but I don't understand why it is attempting to perform that conversion in the first place. The type it should be attempting to cast to for the secondary index is string
, not Guid
: I'm not performing an operation on the primary index here when specifying IndexName
.
For the 2nd question, DynamoDB SDK's LoadAsync method calls GetItem service API operation. If you refer Service API reference for GetItem, it expects Key request parameter which is a A map of attribute names to AttributeValue objects, representing the primary key of the item to retrieve.
Now this is where it gets incredibly misleading then. The LoadAsync
API names the first parameter hashKey
, not primaryKey
. Following that nomenclature, one would understand it is "the hashkey for the index being searched", since:
If it truly is referring to the primary hashkey, then the API is incredibly misleading an needs work, in my opinion. Especially when directly compared to "sibling" APIs like QueryAsync
, which takes the same parameters, but actually works on the index you specify and not only on the primary index.
I'm not sure if for your use case (per your 2nd question), you intend to use Query operation.
From the API surface, it would appear that LoadAsync
should work on my scenario, but it doesn't. As I mentioned in the OP, I think either the API should be changed to support the scenario, or it should prohibit you from passing an index name into the method. The former is preferred in my view for API consistency (working similarly to QueryAsync
).
Hi @julealgon,
With all above inputs I will work with developer to have a look at it, even though it might be a .NET API documentation update.
Kindly notice that the GetItem service API call returns single item, as opposed to Query service API call which returns multiple items. Hence GetItem should really work on primary key which is unique for all items. Please validate if this explains the behaviour of LoadAsync.
NOTE:
Initiates the asynchronous execution of the Load operation. Amazon.DynamoDBv2.DataModel.DynamoDBContext.Load``1(System.Object)
.Loads an object from DynamoDB for the given hash-and-range primary key and using the given config.
.I will add bug
label back to get the traction.
Thanks, Ashish
...even though it might be a .NET API documentation update.
Again, while changing the documentation to make this behavior explicit is a valid "fix", I personally think it is a bad option in this case as the APIs will "appear" to behave the same but work differently at the end of the day, and this is bad for consumers. Obviously, this is up to you, just wanted to make my position clear here.
Kindly notice that the GetItem service API call returns single item, as opposed to Query service API call which returns multiple items. Hence GetItem should really work on primary key which is unique for all items. Please validate if this explains the behaviour of LoadAsync.
Perhaps we are confusing "primary/secondary" with "hash/sort" keys here? My understanding was that secondary indexes also had unique items under their respective hash keys, so the approach shouldn't change regardless if we are talking about single-result Load
vs multi-result Query
: using Load
with a hash key from a secondary index should have similar behavior than performing the same call using the hash key for a primary index, i.e. it should always have "at most 1 result" in both cases.
Please do correct me if I'm wrong here as I just started interfacing with Dynamo.
API documentation for DynamoDBContext.Load (object) call which states Loads an object from DynamoDB for the given hash-and-range primary key and using the given config..
I never noticed the XML docs for the non-async call as I was always using the async version by default. This documentation does seem to suggest it only works on the primary index (having the primary key
mentioned there), which would then result in a very misleading API as I said earlier since you can pass any other index name that is not the primary one as part of the call.
If the documentation is really correct here and this is not a bug, then I would strongly recommend removing the capability of passing the index name in the Load/LoadAsync
calls. At least that will move developers to the pit of success faster.
...I will work with developer to have a look at it,... I will add bug label back to get the traction.
I appreciate your support on this Ashish.
@julealgon Adding more context here. As @ashishdhingra mentioned LoadAsync
is basically a wrapper around the underlying GetItem Service API. Basically it calls that and takes care of converting the low level service client response into the POCO. All of the operations on the DynamoDBContext
have an override that take in the same DynamoDBOperationConfig
type. I admit not my favorite design as it caused DynamoDBOperationConfig
to be a union of all possible config settings. That means some settings like IndexName
won't always be applicable. At this point changing this would be a serious breaking change which we are not ready to make.
I agree the parameter name on the LoadAsync
would be clearer if it was called primary key. For background when this code was first written DynamoDB didn't have indexes yet and this just wasn't something we thought about going back and renaming.
As for the idea of making LoadAsync
work like QueryAsync
except for a single object I think that could also cause serious confusion. A hash and range key for an index is not guaranteed to be unique in DynamoDB. So if we got multiple items back we would have to either throw an error which could be very unexpected or return back the first item which would be unpredictable.
At this point the best we can do is improve the documentation. Anything else causes serious breaking changes.
Thank you @normj for the information.
At this point changing this would be a serious breaking change which we are not ready to make. At this point the best we can do is improve the documentation. Anything else causes serious breaking changes.
If you are not willing to change the behavior of the API, perhaps you could introduce new overloads, with a tighter config model (containing only the settings that make sense for any given operation) while retaining the old overload as-is, but marking it as deprecated
. That would allow you to slowly phase it out and replace with the newer, more robust one.
Another potential option would be to actually throw an exception at runtime when an invalid setting is passed into the call. I'd rather get an exception that tells me why that setting cannot be used, than get "wrong results" as is happening right now.
Anyways, very unfortunate to have such a misleading API stuck in place. I'll probably need a significant comment block in my code to explain this anomaly and justify using the Query
syntax in my repository implementation. I hope you can look back at this at some point and rethink the strategy (maybe implement an EF provider instead of a custom model, like CosmosDB does and some paid providers provide). Clearly, the "wheel reimplementation" here created unnecessary problems for consumers.
Just spent a day banging my head into the wall trying to LoadAsync()
from a GSI having only a hash (but no sort) key which means there must be exactly one result matching the key. Having to use QueryAsync().GetRemainingAsync().Single()
in such case is not very intuitive.
We have noticed this issue has not received attention in 1 year. We will close this issue for now. If you think this is in error, please feel free to comment and reopen the issue.
We have noticed this issue has not received attention in 1 year. We will close this issue for now. If you think this is in error, please feel free to comment and reopen the issue.
From my perspective, this is still valid, even if its a documentation-only change.
Needs review on how we could implement this documentation change.
Wow, I just spent hours trying to work out what I was doing wrong! This really needs to be included in the documentation if it's not going to be fixed. It would've saved me so much headache!
For the short term, we've clarified the Load
/LoadAsync
documentation starting in DynamoDBv2 v3.7.303.9 via #3320 to explain that 1) it calls GetItem
under the hood, which relies on the primary key and 2) IndexName
on the config does not influence which item is loaded.
Longer term, we would like to make the change @julealgon suggested above - move away from the "shared" DynamoDBOperationConfig
towards operation-specific configs that only contain applicable settings for each. You may have noticed the v4-development
branch, this is on our list of desired DynamoDB changes along with the interface/mockability issues.
We've merged #3421 into the v4-development
branch, which includes a new set of operation-specific config objects LoadConfig
, SaveConfig
, QueryConfig
, etc. This should help avoid confusion about IndexName
and other properties that only apply to a subset of operations moving forward.
We've marked the methods that take DynamoDBOperationConfig
as deprecated in V4, but are not planning to remove them entirely yet. We may do so in a future major version.
We're still working on getting preview packages published, you can track overall V4 progress in #3362. We'll leave this issue open until at least V4 hits GA.
Description
When using the
LoadAsync
method onIDynamoDbContext
to load an entity using a secondary global index, I'm getting unexpected behavior: anInvalidCastException
is being thrown.Switching the exact same call to a
QueryAsync
fixes the problem.I believe
LoadAsync
is not respecting theIndexName
property on the configuration object and is using the primary index for all searches. Either the API is incredibly misleading (as it allows you to pass this unused value) or there is a bug there.Reproduction Steps
Guid
propertystring
propertyLoadAsync
with astring
key and passnew DynamoDBOperationConfig { IndexName = <Name_Of_Secondary_Index> }
as optionsInvalidCastException
is thrownLoadAsync
call withQueryAsync
, and it works as expected:Logs
Environment
AWSSDK.Core
Version="3.5.1.39"AWSSDK.DynamoDBv2
Version="3.5.3.3"Resolution
I belive
LoadAsync
should be changed to respect the index being passed as a parameter and use theobject hashKey
against that.This is a :bug: bug-report