Closed nmehlei closed 6 months ago
I encountered a similar issue, but in my case it was with the _type property. Most of the _type properties at the document root disappeared, but not consistently: in a few documents they are still there.
I had the same problem happening. This could have been really dangerous for us. Is it something we missed in the migrationsettings?
Had the very same problem with the property named $type, and it took us hours to understand what was happening because of the missing field.
VERY VERY DANGEROUS BEHAVIOR!
Same problem. This tool would be great if this CRITICAL issue would be resolved.
Yes, it silently ignores the property, without any warning.
The problem: It looks like Newtonsoft.Json ignores the '$type' property by default.
Troubleshooting:
I looked into the code a little for this. If you tell Newtonsoft's JsonConvert.DeserializeObject to handle the type properties via TypeNameHandling.All, it attempts to deserialize the object as the type specified in the '$type' property, which results in an exception since the library that defines that type isn't there.
Initial Solution Thoughts:
It looks to me like the code for the extensions will need to be modified to use a custom JsonConverter that deserializes the JSON to a Dictionary<string, object> or something else that's compatible with the rest of the code. All the extensions appear to handle JSON the same way, at least the JSON and Cosmos extensions that I checked, so we'll likely want to create our own JsonConverter abstraction class that handles serialization and deserialization correctly, so that any extension that needs it can use it.
@jbowen-solliance What are your thoughts?
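A minimal sketch of the deserialization behavior discussed above, not the tool's actual code. It assumes the Newtonsoft.Json NuGet package is referenced; the class name TypePropertyDemo is hypothetical, and the sample document is a shortened version of the nested-$type shape reported elsewhere in this thread. As far as I know, with default settings the serializer reads a leading "$type" as metadata rather than data, while MetadataPropertyHandling.Ignore makes it treat "$type" as an ordinary property without trying to load the named type. Whether the eventual fix uses this setting is not confirmed here.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public static class TypePropertyDemo
{
    public static void Main()
    {
        // Shortened, illustrative document with a nested "$type" property.
        const string json = @"{
            ""id"": ""600004"",
            ""itemType"": {
                ""$type"": ""Test1.Domain.Test2.Types.ABC, Test1.Domain"",
                ""value"": 3
            }
        }";

        // Default settings: "$type" is handled as serializer metadata.
        var withDefaults =
            JsonConvert.DeserializeObject<Dictionary<string, object>>(json);

        // Assumption: treating metadata properties as plain data is sufficient,
        // so no custom JsonConverter is needed for this particular case.
        var settings = new JsonSerializerSettings
        {
            TypeNameHandling = TypeNameHandling.None,
            MetadataPropertyHandling = MetadataPropertyHandling.Ignore
        };
        var asPlainData =
            JsonConvert.DeserializeObject<Dictionary<string, object>>(json, settings);

        // Re-serialize both results to compare which one still carries "$type".
        Console.WriteLine(JsonConvert.SerializeObject(withDefaults));
        Console.WriteLine(JsonConvert.SerializeObject(asPlainData));
    }
}
```

Comparing the two printed lines shows whether a given Newtonsoft.Json version drops the property under default settings. Unlike TypeNameHandling.All, this approach never needs the assembly named in "$type" to be loadable.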
Can anyone seeing this issue provide a reproduction scenario: source and sink used, settings, a sample of data used? In running some basic tests of JSON->Cosmos-NoSQL and Cosmos-NoSQL->JSON I didn't see the $type property being dropped so there's something specific about these scenarios that I'm missing.
I tested with this data as "json-in_typed.json":
[
  {
    "$type": "System.Object",
    "RealEstateType": "apartmentRent",
    "Title": "Test221",
    "MatterportUrl": ""
  }
]
migrationsettings.json for import to Cosmos, including two different options for WriteMode that have different internal serialization behavior:
{
  "Source": "json",
  "Sink": "cosmos-nosql",
  "SourceSettings": {
    "FilePath": "json-in_typed.json"
  },
  "SinkSettings": {
    "ConnectionString": "FILL IN",
    "Database": "database",
    "Container": "typed",
    "PartitionKeyPath": "/id",
    "RecreateContainer": true,
    "WriteMode": "InsertStream"
    //"WriteMode": "Insert"
  }
}
migrationsettings.json for export from Cosmos:
{
  "Source": "cosmos-nosql",
  "Sink": "json",
  "SourceSettings": {
    "ConnectionString": "FILL IN",
    //"IncludeMetadataFields": true,
    "Database": "database",
    "Container": "typed"
  },
  "SinkSettings": {
    "FilePath": "json-out_typed.json",
    "Indented": true
  }
}
@JefSchraag: I assume your problem with "_" was on export from Cosmos, which excludes metadata properties (those with leading underscores) by default. The IncludeMetadataFields setting above changes that behavior.
In our case, it happens when $type is a sub-property:
{ "id": "600004", "itemType": { "$type": "Test1.Domain.Test2.Types.ABC, Test1.Domain", "property1": false, "property2": false, "value": 3, "displayName": "ABC" },
We have the IncludeMetadataFields property set to true on the Source and Sink settings.
Our migrations picked up the _ properties of Cosmos DB (e.g. _ts) at the top level once we started setting the metadata flag.
Our use cases for the tool will be Cosmos-to-Cosmos, Cosmos-to-JSON and JSON-to-Cosmos transfers.
Let me know if there is anything else relevant to include to help with the investigation.
@JohnDStrasz can you confirm whether #122 fixes this for you? I included the sample you provided in my testing and confirmed the change during export but it would be good to make sure there's not something else you're seeing.
I pulled down your feature branch with the change, compiled, brought over my "known working" migrationsettings.json. However, I get the following stack trace when running with dmt.exe in your feature branch under the Core project.
Using Cosmos-nosql Source
Using Cosmos-nosql Sink
info: CosmosDataSourceExtension[0]
Reading from ActivityEventLog-dev.ActivityEventLog
fail: Cosmos.DataTransfer.Core.RunCommand.CommandHandler[0]
Data transfer failed
System.AggregateException: One or more errors occurred. (Encountered an unexpected JSON token.
ActivityId: d108b875-11aa-4916-9103-1bca939a04aa, Windows/10.0.19045 cosmos-netstandard-sdk/3.30.8)
---> Microsoft.Azure.Cosmos.Json.JsonUnexpectedTokenException: Encountered an unexpected JSON token.
ActivityId: d108b875-11aa-4916-9103-1bca939a04aa, Windows/10.0.19045 cosmos-netstandard-sdk/3.30.8
at Microsoft.Azure.Cosmos.Json.JsonReader.JsonTextReader.Read()
at Microsoft.Azure.Cosmos.Json.JsonNavigator.JsonTextNavigator.Parser.Parse(IJsonTextReaderPrivateImplementation jsonTextReader)
at Microsoft.Azure.Cosmos.Json.JsonNavigator.JsonTextNavigator.<>cDisplayClass3_0.<.ctor>g__CreateRootNode|0()
at Microsoft.Azure.Cosmos.Json.JsonNavigator.JsonTextNavigator..ctor(ReadOnlyMemory`1 buffer)
at Microsoft.Azure.Cosmos.Json.JsonNavigator.Create(ReadOnlyMemory`1 buffer)
at Microsoft.Azure.Cosmos.ContainerCore.GetPartitionKeyValueFromStreamAsync(Stream stream, ITrace trace, CancellationToken cancellation)
at Microsoft.Azure.Cosmos.ContainerCore.ExtractPartitionKeyAndProcessItemStreamAsync[T](Nullable`1 partitionKey, String itemId, T item, OperationType operationType, ItemRequestOptions requestOptions, ITrace trace, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.ContainerCore.UpsertItemAsync[T](T item, ITrace trace, Nullable`1 partitionKey, ItemRequestOptions requestOptions, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.ClientContextCore.RunWithDiagnosticsHelperAsync[TResult](String containerName, String databaseName, OperationType operationType, ITrace trace, Func`2 task, Func`2 openTelemetry, String operationName, RequestOptions requestOptions)
at Microsoft.Azure.Cosmos.ClientContextCore.OperationHelperWithRootTraceAsync[TResult](String operationName, String containerName, String databaseName, OperationType operationType, RequestOptions requestOptions, Func`2 task, Func`2 openTelemetry, TraceComponent traceComponent, TraceLevel traceLevel)
at Cosmos.DataTransfer.CosmosExtension.CosmosDataSinkExtension.PopulateItem(Container container, ExpandoObject item, String partitionKeyPath, DataWriteMode mode, String itemId, CancellationToken cancellationToken) in C:\Temp\data-migration-desktop-tool-change\Extensions\Cosmos\Cosmos.DataTransfer.CosmosExtension\CosmosDataSinkExtension.cs:line 179
at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Func`5 onRetryAsync, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider, Boolean continueOnCapturedContext)
at Polly.AsyncPolicy.ExecuteAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at System.Threading.Tasks.Task`1.get_Result()
at Cosmos.DataTransfer.CosmosExtension.CosmosDataSinkExtension.<>cDisplayClass4_0.1 t) in C:\Temp\data-migration-desktop-tool-change\Extensions\Cosmos\Cosmos.DataTransfer.CosmosExtension\CosmosDataSinkExtension.cs:line 138
at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
at System.Threading.Tasks.Task.<>c.<.cctor>b272_0(Object obj)
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
at Cosmos.DataTransfer.CosmosExtension.CosmosDataSinkExtension.WriteAsync(IAsyncEnumerable`1 dataItems, IConfiguration config, IDataSourceExtension dataSource, ILogger logger, CancellationToken cancellationToken) in C:\Temp\data-migration-desktop-tool-change\Extensions\Cosmos\Cosmos.DataTransfer.CosmosExtension\CosmosDataSinkExtension.cs:line 103
at Cosmos.DataTransfer.CosmosExtension.CosmosDataSinkExtension.WriteAsync(IAsyncEnumerable`1 dataItems, IConfiguration config, IDataSourceExtension dataSource, ILogger logger, CancellationToken cancellationToken) in C:\Temp\data-migration-desktop-tool-change\Extensions\Cosmos\Cosmos.DataTransfer.CosmosExtension\CosmosDataSinkExtension.cs:line 99
at Cosmos.DataTransfer.Core.RunCommand.CommandHandler.ExecuteDataTransferOperation(IDataSourceExtension source, IConfiguration sourceConfig, IDataSinkExtension sink, IConfiguration sinkConfig, CancellationToken cancellationToken) in C:\Temp\data-migration-desktop-tool-change\Core\Cosmos.DataTransfer.Core\RunCommand.cs:line 179
fail: CosmosDataSourceExtension[0]
Failed to connect to CosmosDB. Please check your connection settings and try again.
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at Microsoft.Azure.Cosmos.CosmosHttpClientCore.SendHttpHelperAsync(Func`1 createRequestMessageAsync, ResourceType resourceType, HttpTimeoutPolicy timeoutPolicy, IClientSideRequestStatistics clientSideRequestStatistics, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.GatewayStoreClient.InvokeAsync(DocumentServiceRequest request, ResourceType resourceType, Uri physicalAddress, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.GatewayStoreModel.ProcessMessageAsync(DocumentServiceRequest request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.Handlers.TransportHandler.ProcessMessageAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.Handlers.TransportHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.Handlers.RouterHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.ExecuteHttpRequestAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`3 callShouldRetryException, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.Handlers.DiagnosticsHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(String resourceUriString, ResourceType resourceType, OperationType operationType, RequestOptions requestOptions, ContainerInternal cosmosContainerCore, FeedRange feedRange, Stream streamPayload, Action`1 requestEnricher, ITrace trace, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.ContainerCore.ReadContainerAsync(ITrace trace, ContainerRequestOptions requestOptions, CancellationToken cancellationToken)
at Microsoft.Azure.Cosmos.ClientContextCore.RunWithDiagnosticsHelperAsync[TResult](String containerName, String databaseName, OperationType operationType, ITrace trace, Func`2 task, Func`2 openTelemetry, String operationName, RequestOptions requestOptions)
Cancellation Token has expired: True. Learn more at: https://aka.ms/cosmosdb-tsg-request-timeout
CosmosDiagnostics: {"Summary":{},"name":"ReadContainerAsync","start datetime":"2024-03-29T20:03:29.997Z","duration in milliseconds":72.9875,"data":{"Client Configuration":{"Client Created Time Utc":"2024-03-29T20:03:29.9972060Z","MachineId":"hashedMachineName:81f33379-6425-ed24-48a6-9b0b69dedffa","NumberOfClientsCreated":4,"NumberOfActiveClients":4,"ConnectionMode":"Gateway","User Agent":"cosmos-netstandard-sdk/3.34.0|4|X64|Microsoft Windows 10.0.19045|.NET 6.0.28|N|F 00000001|dmt-1.0.0.0--Cosmosnosql","ConnectionConfig":{"gw":"(cps:50, urto:10, p:False, httpf: False)","rntbd":"(cto: 5, icto: -1, mrpc: 30, mcpe: 65535, erd: True, pr: ReuseUnicastPort)","other":"(ed:False, be:True)"},"ConsistencyConfig":"(consistency: NotSet, prgns:[], apprgn: )","ProcessorCount":6}},"children":[{"name":"Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler","duration in milliseconds":71.4986,"children":[{"name":"Waiting for Initialization of client to complete","duration in milliseconds":69.962},{"name":"Microsoft.Azure.Cosmos.Handlers.DiagnosticsHandler","duration in milliseconds":1.3844,"children":[{"name":"Microsoft.Azure.Cosmos.Handlers.RetryHandler","duration in milliseconds":1.277,"children":[{"name":"Microsoft.Azure.Cosmos.Handlers.RouterHandler","duration in milliseconds":1.1061,"children":[{"name":"Microsoft.Azure.Cosmos.Handlers.TransportHandler","duration in milliseconds":1.0423,"children":[{"name":"Microsoft.Azure.Cosmos.GatewayStoreModel Transport Request","duration in milliseconds":0.725,"data":{"Client Side Request Stats":{"Id":"AggregatedClientSideRequestStatistics","ContactedReplicas":[],"RegionsContacted":[],"FailedReplicas":[],"AddressResolutionStatistics":[],"StoreResponseStatistics":[]}}}]}]}]}]}]},{"name":"CosmosOperationCanceledException","duration in milliseconds":0.0118,"data":{"Operation Cancelled Exception":"System.OperationCanceledException: The operation was canceled.\r\n at System.Threading.CancellationToken.ThrowOperationCanceledException()\r\n at Microsoft.Azure.Cosmos.CosmosHttpClientCore.SendHttpHelperAsync(Func`1 createRequestMessageAsync, ResourceType resourceType, HttpTimeoutPolicy timeoutPolicy, IClientSideRequestStatistics clientSideRequestStatistics, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.GatewayStoreClient.InvokeAsync(DocumentServiceRequest request, ResourceType resourceType, Uri physicalAddress, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.GatewayStoreModel.ProcessMessageAsync(DocumentServiceRequest request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.Handlers.TransportHandler.ProcessMessageAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.Handlers.TransportHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.Handlers.RouterHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.ExecuteHttpRequestAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`3 callShouldRetryException, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.Handlers.AbstractRetryHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.Handlers.DiagnosticsHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.RequestHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(RequestMessage request, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.Handlers.RequestInvokerHandler.SendAsync(String resourceUriString, ResourceType resourceType, OperationType operationType, RequestOptions requestOptions, ContainerInternal cosmosContainerCore, FeedRange feedRange, Stream streamPayload, Action`1 requestEnricher, ITrace trace, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.ContainerCore.ReadContainerAsync(ITrace trace, ContainerRequestOptions requestOptions, CancellationToken cancellationToken)\r\n at Microsoft.Azure.Cosmos.ClientContextCore.RunWithDiagnosticsHelperAsync[TResult](String containerName, String databaseName, OperationType operationType, ITrace trace, Func`2 task, Func`2 openTelemetry, String operationName, RequestOptions requestOptions)"}}]}
I found some more time to test, and I debugged into the code. All Cosmos transfers fail, regardless of whether they have the $type property or not, at the UpsertItemAsync line of the PopulateItem method in the CosmosDataSinkExtension.cs file:
    case DataWriteMode.Upsert:
        var upsertResponse = await container.UpsertItemAsync(item, cancellationToken: cancellationToken);
        statusCode = upsertResponse.StatusCode;
        break;
I am compiling in Debug mode.
@JohnDStrasz Another recent change caused a conflict in the serializer settings which is now fixed so this should be back to working for round trip transfers to and from Cosmos.
It works!!! I compared a doc between the source and destination and it was exact, except for the usual Cosmos internal fields.
This is huge for us John. Thank you so much. Looking forward to it being merged to master soon.
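The manual comparison described above can be sketched roughly as follows. This is an illustrative helper, not part of the tool: the names RoundTripCheck and SameUserData are hypothetical, the Newtonsoft.Json NuGet package is assumed, and the list of Cosmos DB system properties is the commonly documented set, which may not be exhaustive for every account type. The sample documents in Main are made up for the demonstration.

```csharp
using System;
using Newtonsoft.Json.Linq;

public static class RoundTripCheck
{
    // System properties that Cosmos DB maintains itself; they are expected
    // to differ between the source and destination containers.
    private static readonly string[] SystemFields =
        { "_rid", "_self", "_etag", "_attachments", "_ts" };

    // Returns true when both documents match once the top-level
    // system properties have been removed from each of them.
    public static bool SameUserData(string sourceJson, string sinkJson)
    {
        JObject source = JObject.Parse(sourceJson);
        JObject sink = JObject.Parse(sinkJson);

        foreach (string field in SystemFields)
        {
            source.Remove(field);
            sink.Remove(field);
        }

        return JToken.DeepEquals(source, sink);
    }

    public static void Main()
    {
        // Made-up documents: same user data, different system fields.
        const string source =
            @"{ ""id"": ""1"", ""itemType"": { ""$type"": ""ABC"", ""value"": 3 }, ""_ts"": 1 }";
        const string sink =
            @"{ ""id"": ""1"", ""itemType"": { ""$type"": ""ABC"", ""value"": 3 }, ""_ts"": 2 }";

        Console.WriteLine(SameUserData(source, sink));
    }
}
```

Because JObject.Parse loads the JSON as a token tree rather than going through the serializer's type handling, a "$type" property present in either document takes part in the comparison like any other property.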
Fixes are now included in Release 2.1.5. Thank you @JohnDStrasz for help reproducing and validating!
Anytime!
Hey, first of all, thank you for this amazing tool. I was able to migrate data directly from and to Cosmos DB successfully.
For some reason, though, the property named $type was not copied with the rest of the data and was silently omitted. This value is used by the .NET JSON library Newtonsoft.Json for complex data type handling. The omitted field changed how the data was deserialized, which in turn caused issues for my applications. To clarify, this:
was changed to this:
My guess would be that either this is the case for all fields with a $ in their name, or, since the tool itself seems to use Newtonsoft.Json, this usage inadvertently stops that field from being propagated. I would assume this is unintentional behavior, i.e. a bug. If it is indeed intended behavior, then I would suggest outputting a warning of some kind when this is detected, as well as adding a note to the documentation.
For my testing, I used version 2.1.4 on macOS Ventura 13.0.