Azure / usql

U-SQL Examples and Issue Tracking
http://usql.io
MIT License

xmlextractor xml attribute support? #75

Open AlexKeySmith opened 7 years ago

AlexKeySmith commented 7 years ago

Hi,

I wasn't sure whether the example XmlExtractor supports attributes. This line made me think it possibly does, but I haven't had any luck reading attributes in XML.

state.ElementWriter.WriteAttributes(reader, false);

-thanks Alex.

MikeRys commented 7 years ago

Hi Alex: how did you try to refer to the attributes? Note that XPath uses @name as the way to refer to attributes.
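For illustration, here is a minimal sketch of the @name syntax in plain C#, using System.Xml's XmlDocument rather than any U-SQL extractor. The sample XML, the class name AttributeXPathDemo, and the helper ReadAttribute are all hypothetical. Whether a given extractor in this repo evaluates its column paths this way is an assumption, not something this thread confirms.

```csharp
using System;
using System.Xml;

public static class AttributeXPathDemo
{
    // Hypothetical helper: loads the XML, evaluates the XPath, and returns the
    // Value of the first matching node (the attribute's value when the path
    // ends in @name). Returns null when nothing matches.
    public static string ReadAttribute(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode(xpath);
        return node?.Value;
    }

    public static void Main()
    {
        // Made-up sample data, for illustration only.
        const string xml =
            "<orders><order id=\"42\"><item sku=\"A1\">Widget</item></order></orders>";

        Console.WriteLine(ReadAttribute(xml, "/orders/order/@id"));       // prints 42
        Console.WriteLine(ReadAttribute(xml, "/orders/order/item/@sku")); // prints A1
    }
}
```

If an extractor resolves each column path relative to a row node with the same XPath engine, a column path such as @id or item/@sku would presumably behave the same way, but that depends on how the extractor is implemented.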

AlexKeySmith commented 7 years ago

Hi Mike, thanks for the message. I get the sense that attributes are supported somehow, but they weren't covered in the example documentation, so I wasn't sure whether the code as it currently stands supports them.

Is it perhaps a mixture of the XmlDomExtractor and the XPath static class?

AlexKeySmith commented 7 years ago

Hold that thought, I'm putting together an example now (hopefully). It feels more obvious now that I get the sense it's likely supported. Perhaps I'll even put in a pull request for the readme once I get something together :-)

houngj commented 6 years ago

@AlexKeySmith Did you ever get around to putting together an example for accessing XML attributes? I am working on transforming some XML data and I also need to reference an XML attribute through the XmlExtractor.

AlexKeySmith commented 6 years ago

Hi @houngj, yes we did, via a couple of different approaches.

I'm trying to remember the first approach off the top of my head, as it's sitting in a long-ago-deleted experimental git branch.

  1. I first gave it a shot using the example extractor in this repo. Accessing attributes using the @ as Mike suggested worked well, but sorry, I don't have the example code to hand, and I haven't had much luck yet finding the deleted branch.

  2. We had some spare Microsoft Premium support hours, and a couple of talented chaps wrote us a shiny new extractor which suited our particular heavily nested XML nicely. Unfortunately, though, that code base isn't open sourced :-(

In the end, our source data changed to CSV, so we haven't spent too much more time on it, I'm afraid. CSV, being splittable, is more easily optimized by U-SQL.

If you get really stuck, let me know and I'll try to find the attribute example code in our repo, or even chat to the powers that be about open sourcing the custom XML extractor (might be easier said than done).