Azure / azure-sdk-for-net


[BUG] Calling datafactoryResource.GetDataFactoryDatasetAsync(datasetName) on a dataset triggers deserialization exception #45005

Open ClementVaillantCodit opened 2 months ago

ClementVaillantCodit commented 2 months ago

Library name and version

Azure.ResourceManager.DataFactory 1.1.0

Describe the bug

I am using the SDK to test data flows in Azure Data Factory programmatically. To do that, I create a data flow debug session, create a data flow and all associated resources (linked services, datasets), and call _datafactoryResource.AddDataFlowToDebugSessionAsync(dataFactoryDataFlowDebugPackageContent); to add the data flow to the debug session created for this purpose.

However, when looping through the existing datasets (source and sink datasets) to add them to a list of DataFactoryDatasetDebugInfo, I get an exception when calling var datasetData = (await _datafactoryResource.GetDataFactoryDatasetAsync(source.Dataset.ReferenceName)).Value.Data; on every dataset that does not have a schema defined in Azure Data Factory.

As far as I know, defining a schema for datasets is not mandatory in Data Factory. At the moment there are no workarounds using the SDK.

I suspect the missing schema definition is the cause: I created a simple test data flow with just a source and a sink, and as soon as I added a schema to the sink, which previously had none, all calls to _datafactoryResource.GetDataFactoryDatasetAsync() succeeded.

Expected behavior

var datasetData = (await _datafactoryResource.GetDataFactoryDatasetAsync(source.Dataset.ReferenceName)).Value.Data; should not fail with the exception "Cannot deserialize an Object as a list." when the dataset does not have a schema defined.

Actual behavior

var datasetData = (await _datafactoryResource.GetDataFactoryDatasetAsync(source.Dataset.ReferenceName)).Value.Data; fails with the exception "Cannot deserialize an Object as a list." when the dataset does not have a schema defined.

Message: System.InvalidOperationException: Cannot deserialize an Object as a list.

Stack Trace:
  DataFactoryElementJsonConverter.DeserializeGenericList[T](JsonElement json)
  InvokeStub_DataFactoryElementJsonConverter.DeserializeGenericList(Object, Span`1)
  MethodBaseInvoker.InvokeWithOneArg(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
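The message suggests that a property expected to be a JSON array arrived as a JSON object. The following is a minimal sketch of that class of failure using only System.Text.Json; it does not call the Azure SDK and is not the SDK's actual converter. The payloads and the TryReadList helper are hypothetical, and the real shape of the service response for a dataset without a schema is an assumption here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical payloads: only the JSON kind of "schema" matters here.
string arraySchema = @"{""schema"":[{""name"":""id"",""type"":""String""}]}";
string objectSchema = @"{""schema"":{""type"":""object""}}";

// A strict reader that assumes an array throws when it meets an object.
try
{
    using JsonDocument strictDoc = JsonDocument.Parse(objectSchema);
    foreach (JsonElement unused in strictDoc.RootElement.GetProperty("schema").EnumerateArray())
    {
        // Never reached: EnumerateArray throws because ValueKind is Object.
    }
}
catch (InvalidOperationException ex)
{
    Console.WriteLine($"Strict read failed: {ex.GetType().Name}");
}

// A tolerant reader checks the JSON kind before enumerating.
Console.WriteLine(TryReadList(arraySchema, out List<string> columns));
Console.WriteLine(columns.Count);
Console.WriteLine(TryReadList(objectSchema, out _));

// Hypothetical helper: returns false instead of throwing when "schema"
// is present but is not a JSON array; a missing "schema" yields an empty list.
static bool TryReadList(string json, out List<string> items)
{
    items = new List<string>();
    using JsonDocument parsed = JsonDocument.Parse(json);
    if (!parsed.RootElement.TryGetProperty("schema", out JsonElement schema))
    {
        return true;
    }
    if (schema.ValueKind != JsonValueKind.Array)
    {
        return false;
    }
    foreach (JsonElement element in schema.EnumerateArray())
    {
        items.Add(element.GetRawText());
    }
    return true;
}
```

Capturing the raw response JSON for one failing and one working dataset and comparing the kind of the affected property would confirm or rule out this reading.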

Reproduction Steps

  1. Create a Data flow in DataFactory, with a source and a sink.
  2. Define a schema on the source dataset, and do not define a schema on the sink dataset (the reverse setup reproduces the issue as well).
  3. Call var datasetData = (await _datafactoryResource.GetDataFactoryDatasetAsync(source.Dataset.ReferenceName)).Value.Data; programmatically. The call fails with the exception "Cannot deserialize an Object as a list.".

Note that the issue does not occur when the dataset is created and then retrieved programmatically. The following executes successfully, where _client is a DataFactoryResource:

var linkedServiceName = "TestLinkedService_1";

// Linked service backed by a storage account connection string.
var azureStorageLinkedService = new AzureStorageLinkedService
{
    ConnectionString = StorageConnectionString
};
var linkedServiceData = new DataFactoryLinkedServiceData(azureStorageLinkedService);

await _client.GetDataFactoryLinkedServices().CreateOrUpdateAsync(Azure.WaitUntil.Completed, linkedServiceName, linkedServiceData);

// JSON dataset that references the linked service created above; no schema is set.
var jsonDataset = new JsonDataset(
    new DataFactoryLinkedServiceReference(DataFactoryLinkedServiceReferenceKind.LinkedServiceReference, linkedServiceName));
var jsonDatasetData = new DataFactoryDatasetData(jsonDataset);
var jsonDataFactoryDatasetResource = (await _client.GetDataFactoryDatasets().CreateOrUpdateAsync(Azure.WaitUntil.Completed, "TestJsonDataset", jsonDatasetData)).Value;

var jsonDatasetResponse = (await _client.GetDataFactoryDatasetAsync("TestJsonDataset")).Value.Data;


.NET SDK version: 8.0.303

Runtime environment: OS Name: Windows, OS Version: 10.0.22631, OS Platform: Windows

.NET 6 and .NET 8

IDE and version: Visual Studio 17.10.4

github-actions[bot] commented 2 months ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

stijnmoreels commented 1 month ago
