AzureCosmosDB / data-migration-desktop-tool

An Id field is automatically renamed to id #92

Closed schjan closed 1 year ago

schjan commented 1 year ago

When using CosmosDB together with Entity Framework Core, it is common to have an Id field as well as the CosmosDB built-in id field. See https://learn.microsoft.com/en-us/ef/core/providers/cosmos/?tabs=dotnet-core-cli#embedded-entities for reference.

So we have plenty of entities of the following structure stored in CosmosDB:

{
  "id": "Foo|1",
  "Id": "1",
  ...
}

Exporting them to JSON works as expected. But as soon as we import the entities using the CosmosDB sink, the Id field gets removed, so we cannot use this tool to migrate data from one CosmosDB to another.

I dug into the codebase and found that this part of the code seems to be causing our issue: https://github.com/AzureCosmosDB/data-migration-desktop-tool/blob/main/Interfaces/Cosmos.DataTransfer.Interfaces/DataItemExtensions.cs#L29-L33
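To illustrate what I think is happening, here is a minimal Python sketch. It is not the tool's actual C# code; `to_cosmos_item` is a hypothetical stand-in, and it assumes the sink treats any field whose name matches "id" case-insensitively as the document id and writes it out as a string.

```python
def to_cosmos_item(source: dict) -> dict:
    """Hypothetical stand-in for the sink's field mapping (an assumption,
    not the real implementation): any key equal to "id" ignoring case
    is written to the "id" key as a string."""
    item = {}
    for name, value in source.items():
        if name.lower() == "id":
            # Both "id" and "Id" land on the same output key,
            # so whichever is processed last overwrites the other.
            item["id"] = str(value)
        else:
            item[name] = value
    return item


exported = {"id": "Foo|1", "Id": "1"}
imported = to_cosmos_item(exported)
```

Under that assumption the two source fields collapse into a single id, which matches the symptom: the Id field no longer exists after the import.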

I think it would be nice if requireStringId could be set from migrationsettings.json. Setting requireStringId = false fixes all our issues. Currently it is hardcoded to true at https://github.com/AzureCosmosDB/data-migration-desktop-tool/blob/main/Extensions/Cosmos/Cosmos.DataTransfer.CosmosExtension/CosmosDataSinkExtension.cs#L96.

If you are interested, I would work on an MR that exposes the setting via CosmosSinkSettings.

bowencode commented 1 year ago

requireStringId is meant to express the requirements of the data store (i.e., Cosmos vs. JSON), not a specific job. I think an additional parameter to that method that could come from a setting would be more appropriate, something like useCaseSensitiveIdField, defaulting to false. Otherwise I think this is a great suggestion to fix the issue that you ran into. Tag me if you set up a PR for it.
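As a rough sketch of that idea (in Python for brevity, not the tool's C# code), the extra parameter could change only how the id field is recognized, while requireStringId keeps describing the data store. The function name `map_fields` and the exact matching logic are assumptions for illustration; only the parameter names come from this thread.

```python
def map_fields(source: dict,
               require_string_id: bool = True,
               use_case_sensitive_id_field: bool = False) -> dict:
    """Hypothetical field mapping with the proposed flag.

    require_string_id: the store needs a string "id" (e.g. Cosmos).
    use_case_sensitive_id_field: when True, only the exact name "id"
    is treated as the document id; "Id" passes through untouched.
    """
    item = {}
    for name, value in source.items():
        if use_case_sensitive_id_field:
            is_id = name == "id"
        else:
            is_id = name.lower() == "id"

        if require_string_id and is_id:
            item["id"] = str(value)
        else:
            item[name] = value
    return item


entity = {"id": "Foo|1", "Id": "1"}
default_result = map_fields(entity)
case_sensitive_result = map_fields(entity, use_case_sensitive_id_field=True)
```

With the flag left at its default of false the current behavior is unchanged, so existing jobs would not be affected; with it set to true both fields survive the import.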