Open wdcossey opened 9 months ago
Could you share the error returned from ADX?
Hi @AdrianStrugala
Here's part of the error:
Couldn't infer file schema. Error: Input (format: 'Avro') source cannot be read due to: 'Unrecognized Avro schema: '{"type":"array","items"
Give me a bit and I will send a code sample (with a full error).
Full error:
Couldn't infer file schema. Error: Input (format: 'Avro') source cannot be read due to: 'Unrecognized Avro schema: '.
{"type":"array","items":{"name":"SomeClass","namespace":"BogusData","type":"record","fields":[{"name":"Int32","type":"int"},{"name":"String","type":"string"},{"name":"Datetime","type":{"type":"long","logicalType":"timestamp-micros"}},{"name":"Decimal","type":{"type":"bytes","logicalType":"decimal","precision":28,"scale":18}}]}}
Test code:
// Generate 10 fake items with Bogus.
var seqNum = 123;
var fakeItems = new Faker<SomeClass>()
    .CustomInstantiator(f => new SomeClass(seqNum++))
    .RuleFor(o => o.Decimal, f => f.Random.Decimal(1.1m, 999m))
    .RuleFor(o => o.Datetime, f => f.Date.Recent())
    .RuleFor(o => o.String, f => f.Random.String(4, 4))
    .Generate(10);

// Serializing the whole list produces the top-level "array" schema shown in the error.
var result = AvroConvert.Serialize(fakeItems, CodecType.Null);
await File.WriteAllBytesAsync("somefilename.avro", result);
Class:
public class SomeClass
{
    public int Int32 { get; }
    public string String { get; set; }
    public DateTime Datetime { get; set; }

    [AvroDecimal(Precision = 28, Scale = 18)]
    public decimal Decimal { get; set; }

    public SomeClass(int seqNum)
    {
        Int32 = seqNum;
    }
}
The schema is valid according to the Avro specification, so the problem appears to be on the ADX side. Could you report it to Microsoft? I will take another look anyway, but I can't promise anything.
@AdrianStrugala I have resolved the issue, I will open a PR tomorrow [after I have done some additional testing].
You can have a look at the PR and approve it or use what is there to make a solution more to your liking.
The proposed solution will be the default behavior in V4.
I'm having issues with Azure Data Explorer (ADX).
I would like to write data (thousands to hundreds of thousands of records) to a single file that will be pushed to ADX for processing.
Whilst the library is simple enough to get working, Azure does not like the schema that is given.
I tried serializing a list of objects (say 2-10), writing the result to a file, and uploading it. The upload failed.
Writing a single object does however work.
I also tried Merge, but that failed as well.
My understanding is that ADX doesn't accept an array as the top-level Avro schema type.
Is what I'm attempting to do even possible? I wouldn't want to send 10,000 individual files.
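For comparison, here is the record part of the schema quoted in the full error message, with the outer `{"type":"array","items":...}` wrapper removed. This is a hand-derived sketch rather than captured library output: it assumes that serializing a single `SomeClass` (the case reported to work) yields this record as the top-level schema, which the thread does not show directly.

```json
{
  "name": "SomeClass",
  "namespace": "BogusData",
  "type": "record",
  "fields": [
    {"name": "Int32", "type": "int"},
    {"name": "String", "type": "string"},
    {"name": "Datetime", "type": {"type": "long", "logicalType": "timestamp-micros"}},
    {"name": "Decimal", "type": {"type": "bytes", "logicalType": "decimal", "precision": 28, "scale": 18}}
  ]
}
```

An Avro object container file stores one schema in its header followed by any number of objects of that schema, so a file whose header carries a record schema like this one can in principle hold many records without an array wrapper.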