Azure / usql

U-SQL Examples and Issue Tracking
http://usql.io
MIT License

Cannot extract json containing an array as start element #118

Open kurtng opened 6 years ago

kurtng commented 6 years ago

JsonExtractor cannot handle JSON whose top-level element is an array. The following does not work:

input json: [{'groupId':'g1'},{'groupId':'g2'}]
extractor with path: new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor("$[*]")

MikeRys commented 6 years ago

Thanks for reporting this. I think this may have been broken by an earlier check-in that made the handling of large top-level documents more streaming and less DOM-based. I will take a look when time permits.

fe-rod commented 6 years ago

@kurtng we are having the same issue.

The thing is that JsonExtractor.cs (line 57) contains the following check:

if (reader.TokenType == JsonToken.StartObject)

From what I understand, the first reader.Read() returns a JsonToken.StartArray, so this check skips it and the path does not match at the top level. Instead, the path is applied on the second iteration, when SelectChildren(token, this.rowpath) is called.
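
To make the point above concrete, here is a minimal sketch of one possible direction, not the actual JsonExtractor code and not a tested fix. It assumes Json.NET (Newtonsoft.Json) is referenced. The class TopLevelArraySketch and the helper SelectRows are hypothetical names used only for illustration, and token.SelectTokens stands in for the extractor's own SelectChildren. The idea is to load the top-level value when the reader sees either StartObject or StartArray, then apply the row path to it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class TopLevelArraySketch
{
    // Hypothetical helper (not part of JsonExtractor): yields every token that
    // rowPath selects, whether the document starts with an object or an array.
    public static IEnumerable<JToken> SelectRows(TextReader input, string rowPath)
    {
        using (var reader = new JsonTextReader(input))
        {
            while (reader.Read())
            {
                // Accept StartArray as well as StartObject, so that a top-level
                // array is loaded as a whole instead of being skipped.
                if (reader.TokenType == JsonToken.StartObject ||
                    reader.TokenType == JsonToken.StartArray)
                {
                    // Loads the complete value into memory (DOM-based).
                    JToken token = JToken.Load(reader);

                    // Apply the JSONPath to the loaded value.
                    foreach (JToken row in token.SelectTokens(rowPath))
                    {
                        yield return row;
                    }
                }
            }
        }
    }

    public static void Main()
    {
        // Sample input and path taken from this issue.
        const string json = "[{'groupId':'g1'},{'groupId':'g2'}]";

        foreach (JToken row in SelectRows(new StringReader(json), "$[*]"))
        {
            Console.WriteLine((string)row["groupId"]);
        }
    }
}
```

Note the trade-off: JToken.Load on a StartArray token materializes the whole top-level array in memory, which gives up the streaming behaviour for large top-level documents mentioned earlier in this thread. That is probably why this is not yet a nice generic solution.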

We are trying to make some changes but still haven't found a nice generic solution.

I'll let you know if we have some news.