KiaraGrouwstra / pquery

A collection of functions for use in Excel Power Query
MIT License
233 stars 76 forks source link

Chaining M folder functions #7

Closed MortalCatalyst closed 8 years ago

MortalCatalyst commented 8 years ago

Hi

I want to be able to chain 4 functions together that will be used to import xml files from a folder. It is the same file each query on a table in the file.

Which function in the repository is best for chaining the execution of M functions.

This is my thread where i was referred here as s last possible chance to get it working https://social.technet.microsoft.com/Forums/en-US/1d31402b-b05a-4091-8ce9-930481c34f93/combine-4-xml-queries-on-same-source-to-be-a-global-folder-query?forum=powerquery

KiaraGrouwstra commented 8 years ago

I'm not actually sure you'd need any of my functions here.

To parameterize your queries so you could use them on multiple files, make them into functions. i.e. from: let Source = Xml.Tables(File.Contents("C:\Users\Sayth\OneDrive\Racing\20160305RAND0.xml")), to: (FileName) => let Source = Xml.Tables(File.Contents(FileName)),

This yields a reusable function that you can then reuse with: functionName("C:\Users\Sayth\OneDrive\Racing\20160305RAND0.xml")

Ehren in that thread was correct; for the purpose of getting the info into tables for the Data Model, one query would be expected to output one table. This isn't to say one query cannot do all of this processing -- it can. Once you have reusable functions for each of them, you could have one function do all four:

(FileName) => [FirstResult=functionName(FileName), ...]

The problem is though, since PowerPivot will want separate tables, if you're doing this you'd again need separate queries to separate these results back. Which begs the question, why even bother in the first place.

Which brings me back to your aversion to multiple queries, as Ehren did. You hadn't responded to her anymore -- I presume the refreshing isn't an issue?

Feel free to clarify further.

MortalCatalyst commented 8 years ago

Refreshing is not an issue. I was of the understanding that Ehren was unable to assist and had suggested use of these libraries as a chance to resolve.

Thank you for taking the time to respond, very much appreciated.

I am open to suggestion and not averse to any methodology as relatively new to powerquery although not to Excel.

In my assumption and reading your post above I think you have solved the problem largely. So if I have a folder of 50 xml files and I process the queries as

(FileName) => [FirstResult=functionName(FileName), ...]

I would end up with 4 tables all filled with the respective data from the 50 files and as long as the relationship in my data model are correct then it will work?

KiaraGrouwstra commented 8 years ago

[FirstResult=functionName(FileName), ...] is a record in M. So that's a function returning a record of 4 results (such as tables) for a single filename. However, PowerPivot takes tables for input (one per query), not records.

One approach: have that function, iterate over it for all files in a new query, so as to obtain a List of Records (or, a Table), then have four more queries pick out those respective parts out (merging results per file into one and the same table) for separate consumption by PowerPivot.

Another approach: for each of your four result types, make a query iterating over your files and extracting one piece of info, optionally splitting out the extraction logic into a resuable function.

Ultimately, approach #1 may not justify the complication it adds, which involves (if you must):

If you're interested, I'd recommend using the language reference and formula reference for looking things up, but feel free to follow up with further questions.

MortalCatalyst commented 8 years ago

Thank you for the references. Not sure which route i will take, however sure to learn either way.

MortalCatalyst commented 8 years ago

Can I ask for help, Ehren and i cannot solve the question https://social.technet.microsoft.com/Forums/en-US/1d31402b-b05a-4091-8ce9-930481c34f93/combine-4-xml-queries-on-same-source-to-be-a-global-folder-query?forum=powerquery

I chained the functions as you discussed above however I am unable to run the query on a folder.

let ExcelFile = (FilePath, FileName) => let Source = Folder.Files(FilePath), Query1 = (FileName) => [FirstResult=Meeting(FileName), SecondResult=Race(FileName), ThirdResult=Reference(FileName), FourthResult=Horses(FileName)] in ExcelFile

Or

let xmlImport = (FilePath, FileName) => let Source = Folder.Files(FilePath), Query1 = (FileName) => [FirstResult=Meeting(FileName), SecondResult=Race(FileName), ThirdResult=Reference(FileName), FourthResult=Horses(FileName)], content = Binary.Combine(Source[Query1]), xmlImport = Xml.Tables(content) in xmlImport

Seems as though I thought what I was doing was a good simple starting project hasn't worked that way.

To clarify as there are multiple tables in my XML they cannot be resolved in one query so I create a query for each one. Now I wanted to chain them together so that they would run at once on a folder of XML files and return the results to a sheet for each query.

KiaraGrouwstra commented 8 years ago

As mentioned above Ehren is correct the data model wants functions to output one table. This seems largely about the basics of M syntax; you may do well to check that language reference to get more of a grasp of what's happening. Specifically:

MortalCatalyst commented 8 years ago

I have been to the reference online, can i ask you straight though i thought this task simple in scope is it beyond M and PowerQuery current capability?

Seems if i did this i would be the first based on internet searches, the msin similar example being xsv which is much easier.

MortalCatalyst commented 8 years ago

I have no desire for 4 just seems most logical way to handle related data tables

KiaraGrouwstra commented 8 years ago

It's not beyond it; use Table.Column() and List.Transform(). But you're taking the hard approach; the easy one is just to iterate over your folder from separate queries, not combining the functions. If you'd be comfortable with the language that approach is fine (I've used it), but until then, it's just adding unnecessary complexity.

MortalCatalyst commented 8 years ago

Thanks