ebeshero / DHClass-Hub

a repository to help introduce and orient students to the GitHub collaboration environment, and to support DH classes.
GNU Affero General Public License v3.0
27 stars 27 forks

XPath: Taking Average of values derived by a FLWOR statement #783

Closed bryant-bolyen closed 4 years ago

bryant-bolyen commented 4 years ago

I'm trying to get a count of specific XML elements across five directories and output the average value for each directory.

My code can be found on eXist at: /db/bab184/projectTesting

We step through one loop to retrieve values from <album> elements on the following-sibling axis, then a second loop to retrieve a count of the children of each <song> element that matches both previous elements. We'll output 58 of these count values, and I'd like to be able to operate on them within the program.</p> <p>Lines 18 - 32 are the most important (seen here):</p> <pre><code>(: Initialize variable for a cumulative sum of count in the next structure :)
let $partAddend := number('0')
(: Target count from title and album @refs :)
for $titleIndex in $titleDist
let $partRef := $discog//song[preceding::album/@ref = $albumIndex][preceding::title/@ref = $titleIndex]/(child::intro | prelude | interlude | postlude | verse | chorus | bridge | outro | preChorus | instrumental) ! name()
let $partCount := $partRef => count()
(: Now that we have the values, is it possible to take the average of the collection?
   The avg() function won't return the values we want because it operates on the current index of the loop and would be redundant.
   Below I have an algorithm for getting a running total, but its output renders it redundant as well.
   It seems as though partAddend isn't staying initialized to the sum after executing and may be returning to zero,
   but albumIndex shouldn't iterate until titleIndex has finished a full revolution?
   Maybe there's a bug in my loops that could also be causing this, or maybe it's XPath or my algorithm, or I'm just being silly. :)
let $partSum := sum(($partCount, $partAddend))
let $partAddend := $partSum
return $partSum</code></pre> <p>P.S. 
don't judge that ugly drop into the child axis it works ok</p> <p>@ebeshero @alnopa9 @frabbitry @BMT45 </p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/ebeshero"><img src="https://avatars.githubusercontent.com/u/4014518?v=4" />ebeshero</a> commented <strong> 4 years ago</strong> </div> <div class="markdown-body"> <p>@Bryant-LettucePrime I'm taking a look right now before the Pittsburgh DH presentations at 10...Before looking at your code, though, I will say that <code>avg()</code> will do what you ask as long as you can output the <em>entire</em> sequence of all the values across your five directories (your collection of collections). </p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/bryant-bolyen"><img src="https://avatars.githubusercontent.com/u/59665464?v=4" />bryant-bolyen</a> commented <strong> 4 years ago</strong> </div> <div class="markdown-body"> <p>Yeah, if I knew how to step <em>out</em> of a for loop it would make this trivial.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/ebeshero"><img src="https://avatars.githubusercontent.com/u/4014518?v=4" />ebeshero</a> commented <strong> 4 years ago</strong> </div> <div class="markdown-body"> <p>@Bryant-LettucePrime Preliminary thoughts, looking at your code. Do I understand this correctly? You're looking for a count of all the elements marked song by song (in your loop over the titles), and then you want to determine the average of those counts <em>per album</em>, right? </p> <p>If I'm right about this, you'll want to store some values in variables. For each album, you need an average of the sequence of counts per each song. So, first of all, you need a variable that will collect a sequence of all the counts coming in for each song. 
For this, I think we'll do something kind of meta: we'll define a variable that stores the return of a FLWOR.</p> <p>Once you have that sequence, in a for-loop over the albums, you'll be able to apply <code>avg</code> to return 5 average values (one per album). This is a little advanced, and I'm happy to help you with it. It'll have to wait until the noon hour, though, as I'm tuning into the Pgh presentations now and have a class at 11.</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/bryant-bolyen"><img src="https://avatars.githubusercontent.com/u/59665464?v=4" />bryant-bolyen</a> commented <strong> 4 years ago</strong> </div> <div class="markdown-body"> <p>Sounds good! 
Thanks!</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/ebeshero"><img src="https://avatars.githubusercontent.com/u/4014518?v=4" />ebeshero</a> commented <strong> 4 years ago</strong> </div> <div class="markdown-body"> <p>@Bryant-LettucePrime Got you started, I think! I saved this in a new file instead of writing over yours. It's here in the database: <code>/db//bab184/projectTesting-ebb.xql</code></p> <p>But here is the whole file. Notice the variable that is set up in your for-loop over the 5 albums: that one variable, <code>$albumCounts</code>, contains the return of a FLWOR:</p> <pre><code>xquery version "3.1";
(: The goal of this XPath is to count XML elements from multiple directories and output the average value from each directory :)
let $discog := collection('/db/brandnew/XML/Albums')
let $albumRef := $discog//album/@ref/string() ! normalize-space()
let $albumDist := $albumRef => distinct-values()
(: Target title @ref from album @ref :)
for $albumIndex in $albumDist
let $titleRef := $discog//title[following-sibling::album/@ref = $albumIndex]/@ref/string() ! normalize-space()
let $titleDist := $titleRef => distinct-values()
(: Put this all in the right order :)
order by $albumIndex
(: Initialize variable for a cumulative sum of count in the next structure :)
let $partAddend := number('0')
let $albumCounts :=
    (: Target count from title and album @refs :)
    for $titleIndex in $titleDist
    let $partRef := $discog//song[preceding::album/@ref = $albumIndex][preceding::title/@ref = $titleIndex]/(child::intro | prelude | interlude | postlude | verse | chorus | bridge | outro | preChorus | instrumental) ! name()
    let $partCount := $partRef => count()
    (: Now that we have the values, is it possible to take the average of the collection?
       The avg() function won't return the values we want because it operates on the current index of the loop and would be redundant.
       Below I have an algorithm for getting a running total, but its output renders it redundant as well.
       It seems as though partAddend isn't staying initialized to the sum and may be returning to zero,
       but albumIndex shouldn't be iterating until titleIndex has finished a full revolution?
       Maybe there's a bug in my loops that could also be causing this. :)
    let $partSum := sum(($partCount, $partAddend))
    let $partAddend := $partSum
    return $partSum
return concat($albumIndex, ': ', (string-join($albumCounts, ', ')))</code></pre> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/ebeshero"><img src="https://avatars.githubusercontent.com/u/4014518?v=4" />ebeshero</a> commented <strong> 4 years ago</strong> </div> <div class="markdown-body"> <p>@Bryant-LettucePrime Run this to view the return. And I leave it to you to apply the <code>avg()</code> function at this point. But I think this gets you what you need...</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/bryant-bolyen"><img src="https://avatars.githubusercontent.com/u/59665464?v=4" />bryant-bolyen</a> commented <strong> 4 years ago</strong> </div> <div class="markdown-body"> <p>@ebeshero AHA! So declaring complex variables like this allows a programmer to nest results, terminate a for loop, and iterate on the output. This was the missing piece in the puzzle. I'll do some shenaniganry to make an SVG we can use.</p> <p>Thanks!</p> </div> </div> <div class="comment"> <div class="user"> <a rel="noreferrer nofollow" target="_blank" href="https://github.com/ebeshero"><img src="https://avatars.githubusercontent.com/u/4014518?v=4" />ebeshero</a> commented <strong> 4 years ago</strong> </div> <div class="markdown-body"> <p>@Bryant-LettucePrime Yes indeed! Happy graphing! 
:-D</p> </div> </div> </body> </html>
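The thread leaves the final `avg()` step to the reader. Here is a minimal sketch of how that step might look once an inner FLWOR is bound to a variable. It is not the file saved in eXist: the inline `$discog` data, its element layout, and the shortened list of part names are assumptions made only so the example can run on its own, without `collection('/db/brandnew/XML/Albums')`.

```xquery
xquery version "3.1";

(: Hypothetical stand-in data. The real project reads its files with
   collection('/db/brandnew/XML/Albums') and uses a different layout. :)
let $discog :=
  <discog>
    <song><album ref="A"/><title ref="a1"/><verse/><chorus/><verse/></song>
    <song><album ref="A"/><title ref="a2"/><intro/><verse/><chorus/><bridge/><outro/></song>
    <song><album ref="B"/><title ref="b1"/><verse/><chorus/></song>
  </discog>

(: Outer loop: one pass per distinct album reference. :)
for $albumIndex in distinct-values($discog//album/@ref)

(: The inner FLWOR is bound to a single variable, so its whole return
   (one count per song) is available as a sequence after it finishes. :)
let $albumCounts :=
  for $song in $discog//song[album/@ref = $albumIndex]
  return count($song/(intro | verse | chorus | bridge | outro))

order by $albumIndex

(: avg() now sees every count for the album, not one value at a time. :)
return concat($albumIndex, ': ', avg($albumCounts))
```

With this sample data, album A has songs with 3 and 5 parts (average 4) and album B has one song with 2 parts (average 2). No running total is needed: XQuery variables are immutable, so a second `let $partAddend` inside a loop creates a new binding for that iteration only and does not carry a value forward to the next one.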