Samantha-Mcguigan closed 7 years ago
@Samantha-Mcguigan The problem is likely right here:
if ($people[. = $distinctPeople]/ancestor::div[1] = $p)
Think about what that is literally doing...the computer is certainly really stumped here...
@Samantha-Mcguigan Think about what you need to be checking. I think you want to walk down the tree and find where a particular node matches your current $p (which is a member of the distinct-values list and not on the tree).
It looks like you wanted to use $people to be your "tree walker" variable, but notice that that variable stops on a tokenized string--which means it, too, steps OFF the tree to yield a little string of text. You want to find where your $p is equal to that little tokenized string of text, so you can express that equivalence in a predicate expression. But you need to set up your "tree walker" a little differently from the $people variable. Does that make sense?
@Samantha-Mcguigan Also, I am scratching my head about your $people variable, which I think really doesn't need a tokenize function on it (does it?). Why are you trying to tokenize the element content of <persName> there? When I take distinct-values of all the persNames in that file (a total of 83 distinct names), I do not see any that have white spaces in them. When we use the persName element to surround a name, it might contain a first and a last name, but the white space would simply be part of the name string. I think you might be trying to apply what we did with the Hamilton project's attribute values to your code here--and the two projects aren't coded the same way.
The Decameron project's code didn't develop a personography as far as I remember, so they were not making single string @ref attributes like the Hamilton project is doing. When we have a plural list of attributes, we separate those with white spaces, but that is not something you need to work with for the Decameron code, is it?
I think I'm starting to understand. I have this now:
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";
declare variable $decameron := doc('/db/decameron/engDecameronTEI.xml');
declare variable $people := $decameron//persName;
declare variable $distinctPeople := distinct-values($people);
for $p in $distinctPeople
let $peers:=
if (div[1]//persName = $p)
then distinct-values(div[1]//persName)
else if (floatingText//persName = $p)
then distinct-values(floatingText//persName)
else (distinct-values($people))
I took out all the tokenize functions. Also, I know I have to walk down the tree to get the persNames based on the div they are sitting in, right? But I am getting an error on if (div[1]//persName = $p). It doesn't like the div[1], but I don't know how to fix that.
@Samantha-Mcguigan Good, this is clearer and simpler now! :-) Okay, so with the Decameron project, you want to reach back to the first <div> that is the ancestor of the current persName you're on in the tree. So you want to look up the ancestor axis just one stop.
@Samantha-Mcguigan The idea here is to find out which KIND of div that persName is sitting inside...let me check what I wrote on the assignment sheet about this to see if I can explain it more clearly.
@Samantha-Mcguigan Here's the relevant part of the assignment sheet: "In our example from The Decameron we output three different words to indicate whether an interaction occurred in floatingText, in the outer frame around the stories, or inside the stories themselves."
This is to describe the kind of interaction we are seeing (a bridge or edge connection). Decameron is a layered narrative, which looks like this structurally: http://decameron.newtfire.org/boxModel.html So we want to see at what narrative level our persName elements occur if you are following our example in the assignment. To get a sense of this, try doing some exploratory document analysis: run some queries on persNames and look at the first ancestor elements: what do you see?
I'm getting results now! I have:
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";
declare variable $decameron := doc('/db/decameron/engDecameronTEI.xml');
declare variable $people := $decameron//persName;
declare variable $distinctPeople := distinct-values($people);
for $p in $distinctPeople
let $peers:=
if (//persName[parent::div] = $p)
then distinct-values(//div[1]//persName)
else if (//floatingText//persName = $p)
then distinct-values(//floatingText//persName)
else (distinct-values($people))
let $edgeType:=
if (//div[1])
then "novella"
else if (//floatingText)
then "floatingText"
else "frame"
for $peer in $peers
return
concat($p, "	", $edgeType, "	",$peer, " ")
I had a couple missing // and the computer was mad at me. I am getting 5509 results. Is that too many or does that seem right?
@Samantha-Mcguigan The conditional statement for floatingText looks right. I'm realizing this is hard b/c you're not actually working on this project! (It was "hot" like the Hamilton project last spring, but now it's gone dormant a bit and we're not as familiar with it...) So here are some things to be aware of:
You aren't looking for immediate parent:: elements for your persNames (because those are paragraphs or quotes). What you want is actually to go hunting to see:
1) whether there is a <floatingText> ancestor: If there is, it's in one of those nifty nested stories-within-a-story. (Stories coded within <floatingText> are the most deeply nested of stories.)
2) otherwise it's going to be framed by a <div> element with an @type attribute that indicates the level of the story you're in. So if it does not have a floatingText ancestor, check the type attributes on the first ancestor div. (You don't want any other ancestor divs, because every story is nested in a novella, and that novella is nested in a frame, all the way back up to the div that surrounds the entire document!)
Does that make sense?
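The two checks above can be sketched roughly as follows. This is a minimal illustration, not the assignment's solution: $sample is a hypothetical miniature stand-in for the Decameron TEI file, and the names inside it are only placeholders.

```xquery
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";

(: Hypothetical miniature stand-in for the Decameron TEI file :)
declare variable $sample :=
  <text>
    <div type="Day">
      <p><persName>Pampinea</persName> greets <persName>Fiammetta</persName>.</p>
      <div type="novella">
        <p><persName>Tancredi</persName> confronts <persName>Ghismonda</persName>.</p>
        <floatingText>
          <body><p><persName>Guiscardo</persName> meets <persName>Ghismonda</persName>.</p></body>
        </floatingText>
      </div>
    </div>
  </text>;

(: For each persName node, check for a floatingText ancestor first;
   otherwise read @type on the nearest ancestor div. :)
for $name in $sample//persName
let $level :=
  if ($name/ancestor::floatingText)
  then "floatingText"
  else string($name/ancestor::div[1]/@type)
return concat($name, "&#9;", $level)
```

Because ancestor:: is a reverse axis, div[1] picks the nearest enclosing div rather than the outermost one, which is why the same name can be reported at different levels depending on where it sits.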
@Samantha-Mcguigan This is probably causing too much output:
let $edgeType:=
if (//div[1])
then "novella"
You're looking down the tree from the document node here. Instead, to get the current shared context, you need to look up the tree from the point of view of a persName on your tree that equals the current $p in your for-loop. Look up at the first ancestor, and if it is floatingText, output "floating text", and otherwise, I think you can output the @type on its first ancestor <div>!
To get the peers, you want to look up from $p, stand on the context (if ancestor::floatingText, then that, or else its ancestor::div[1]), then look down and collect all the persName elements that are NOT EQUAL to the current $p (using ne or !=).
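The "look up, stand on the context, look down" idea might be sketched like this. It is only an illustration under assumptions: $sample is a hypothetical miniature stand-in for the real file, and the variable names $here and $context are invented for the example.

```xquery
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";

(: Hypothetical miniature stand-in for the Decameron TEI file :)
declare variable $sample :=
  <text>
    <div type="Day">
      <p><persName>Pampinea</persName> greets <persName>Fiammetta</persName>.</p>
      <div type="novella">
        <p><persName>Tancredi</persName> confronts <persName>Ghismonda</persName>.</p>
        <floatingText>
          <body><p><persName>Guiscardo</persName> meets <persName>Ghismonda</persName>.</p></body>
        </floatingText>
      </div>
    </div>
  </text>;

for $p in distinct-values($sample//persName)
(: Step back ON the tree: find the persName nodes whose text equals $p :)
for $here in $sample//persName[. = $p]
(: Look up to the shared context: nearest floatingText if any, else nearest div :)
let $context :=
  if ($here/ancestor::floatingText)
  then $here/ancestor::floatingText[1]
  else $here/ancestor::div[1]
(: Look back down and keep every name in that context except $p itself :)
for $peer in distinct-values($context//persName[. != $p])
return concat($p, "&#9;", $peer)
```

One caveat with this sketch: when the context is a div, $context//persName also reaches into any divs and floatingText elements nested inside it, and a name that occurs several times in one context produces repeated rows. Whether that matters depends on how the edges are meant to be counted.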
okay i have:
for $p in $distinctPeople
let $peers:=
if (//floatingText//persName = $p)
then distinct-values(//floatingText//persName)
else if (//persName = $p/ancestor::div)
then distinct-values(//persName/ancestor::div)
else (distinct-values($people))
let $edgeType:=
if (//persName=$p/parent::floatingText)
then "floatingText"
else if (//persname=$p/ancestor::div)
then div/@type
else "frame"
for $peer in $peers
return
concat($p, "	", $edgeType, "	",$peer, " ")
but I'm getting an error that says cannot convert xs:untypedAtomic ('Fiammetta') to a node set
@Samantha-Mcguigan The peers are problematic: (See my post just above this.) You are currently getting ALL the persName elements in EVERY floating text (not the floating text that contains $p). Also, you would be returning all the persNames including the one that matches $p. You need to exclude $p--that is, get all the persNames that DO NOT EQUAL (ne or !=) $p.
@Samantha-Mcguigan This construction you're using might be causing a problem, but I'm not sure b/c I'm not in a place to test it right now:
//persName=$p/parent::floatingText
First, I'd make that ancestor::floatingText (There would only be one of these, and it might be an ancestor rather than the immediate parent.) More significantly, you should probably set up a predicate filter to catch
//persName[. = $p]//ancestor::floatingText
Does that make a difference?
i have:
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";
declare variable $decameron := doc('/db/decameron/engDecameronTEI.xml');
declare variable $people := $decameron//persName;
declare variable $distinctPeople := distinct-values($people);
for $p in $distinctPeople
let $peers:=
if (//persName [. = $p]//ancestor::floatingText)
then distinct-values(//persName[. ne $p]//ancestor::floatingText)
else if (//persName [.= $p]//ancestor::div[1])
then distinct-values(//persName[. ne $p]//ancestor::div[1])
else (distinct-values($people))
let $edgeType:=
if (//persName=$p//ancestor::floatingText)
then "floatingText"
else if (//persname=$p//ancestor::div[1])
then "novella"
else "frame"
for $peer in $peers
return
concat($p, "	", $edgeType, "	",$peer, " ")
I'm still getting the same error
I'm sorry if I keep repeating the same mistakes you already told me how to fix, I'm just having a hard time understanding
@Samantha-Mcguigan Sorry--got caught up in a meeting! Here are a couple of issues now:
1) your $peers variable is not returning people's names. That is because your XPath is landing on the floatingText element, or would be--there's an odd double slash before you walk up the ancestor axis. You basically just want to return the descendants of that same floatingText element that holds $p, who are NOT $p. Try writing it like this:
if (//persName[. = $p]/ancestor::floatingText)
then distinct-values(//persName[. = $p]/ancestor::floatingText//persName[. != $p])
Read this carefully: I say, if the <persName> that's $p has a <floatingText> ancestor, then go to that <persName> that's $p, go up to its ancestor <floatingText>, then go back down to get all the other <persName> elements that are NOT equal to $p. (And wrap it in distinct-values() so we remove duplicates from the list.) Notice I'm using the general comparison operator (!=) on that last step because there's just one $p and multiple peers in that section of text.
You need to apply something like that XPath above to each one of your conditional statements that define the $peers variable. Let's start there.
2) As for $edgeType, I think you're not understanding how the Decameron is coded, and that may be a problem for understanding what level of the text you're testing for. Try downloading their TEI file and opening it in oXygen and studying it for a bit. All the text is set in nested <div> elements, and the way you know what level of the text you're in is by looking at the @type on the <div> element. Try maybe a separate XQuery, or run some XPath over the TEI file in oXygen (or look at it in outline view), so you can see this. Every <persName> in the Decameron text will have an ancestor <div> somewhere, so you probably won't get anything that is "frame" with this conditional setup. What is the XPath you need to determine if something is in a frame? (Check the values of the @type attributes...or maybe just have your conditional return the @type attribute. Maybe you only need two conditions here, not three...) Notice that you haven't modified your $edgeType conditional statements to have predicates that test if [. = $p] the way you did on the $peers variable. (My hunch is that this is causing that error you're seeing about a node not being equivalent to an xs:untypedAtomic...)
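To illustrate that hunch: an expression like //persName=$p//ancestor::floatingText is read as //persName = ($p//ancestor::floatingText), because path steps bind more tightly than =. Since $p comes from distinct-values(), it is an atomic value, and a path step cannot start from it. The sketch below shows the difference, using a hypothetical miniature $sample in place of the real file.

```xquery
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";

(: Hypothetical miniature stand-in for the Decameron TEI file :)
declare variable $sample :=
  <text>
    <div type="Day">
      <p><persName>Pampinea</persName> greets <persName>Fiammetta</persName>.</p>
      <div type="novella">
        <p><persName>Tancredi</persName> confronts <persName>Ghismonda</persName>.</p>
      </div>
    </div>
  </text>;

let $p := distinct-values($sample//persName)[1]
return (
  (: $p is an atomic value, not a node, so it has no ancestors to walk to :)
  $p instance of node(),
  (: Writing $p/ancestor::div here would raise a type error, because a
     path step needs a node on its left-hand side. :)
  (: Filtering the nodes that equal $p keeps us on the tree: :)
  exists($sample//persName[. = $p]/ancestor::div[1])
)
```

Separately, element names are case-sensitive, so a test written as persname (lowercase n) would match nothing in a document whose elements are named persName.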
@Samantha-Mcguigan About $edgeType: I just did an XPath search on the Decameron TEI file (outside of this XQuery--just did it on the file in oXygen). I went looking for this:
//persName/ancestor::div[1]/@type
That looks down at every persName in the file, and then walks up to its first <div> ancestor, and gets the value of its @type attribute.
And then I wrapped that in distinct-values():
distinct-values(//persName/ancestor::div[1]/@type)
I returned 6 different @type values on the div elements. They are:
prologue
Day
introduction
novella
conclusion
epilogue
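As a follow-up to that exploratory query, one could also count how many persName elements sit under each @type value. A minimal sketch, again over a hypothetical miniature $sample rather than the real Decameron file:

```xquery
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";

(: Hypothetical miniature stand-in for the Decameron TEI file :)
declare variable $sample :=
  <text>
    <div type="Day">
      <p><persName>Pampinea</persName> greets <persName>Fiammetta</persName>.</p>
      <div type="novella">
        <p><persName>Tancredi</persName> confronts <persName>Ghismonda</persName>.</p>
        <floatingText>
          <body><p><persName>Guiscardo</persName> meets <persName>Ghismonda</persName>.</p></body>
        </floatingText>
      </div>
    </div>
  </text>;

(: One row per distinct @type, with the number of persName elements
   whose nearest ancestor div carries that type :)
for $type in distinct-values($sample//persName/ancestor::div[1]/@type)
let $count := count($sample//persName[ancestor::div[1]/@type = $type])
order by $count descending
return concat($type, "&#9;", $count)
```

Note that names inside a floatingText are counted under the type of the div that encloses that floatingText, since this query only looks at div ancestors.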
Now, if in the network analysis I just want to distinguish the main storyline from the various frames, I probably just want to test to see if a <div> is @type="novella" or not. But I could just differentiate among all these divs. Here's what they mean:
OUTER FRAME= prologue, introduction, conclusion, and epilogue
DAY FRAMES= Day
STORIES TOLD BY CHARACTERS= novella
STORIES-WITHIN-STORIES = down deep, within the novellas, are some stories told by characters within the novellas, and those are layered inside the <floatingText> element.
Your network analysis might try to see at what levels of text the various characters are connected together, and it should look really interesting! I think you're on the right track but just needed an overview of how the project was coded. I also think you should set up the conditional statements to return the kinds of relationships that make sense to you to visualize. I'd recommend summing them up as three different kinds of relationships:
1) connections made in the outer frame + connections made within the Day frames around each story (that would all make sense as "frame")
2) connections made within the novellas (within the stories told by the characters introduced in the frame).
3) connections made inside the <floatingText> stories-within-the-stories.
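The three kinds of relationships above could be sketched as a small helper function. This is one possible arrangement under assumptions, not the assignment's required answer: the function name local:edge-type and the miniature $sample document are hypothetical.

```xquery
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";

(: Hypothetical miniature stand-in for the Decameron TEI file :)
declare variable $sample :=
  <text>
    <div type="Day">
      <p><persName>Pampinea</persName> greets <persName>Fiammetta</persName>.</p>
      <div type="novella">
        <p><persName>Tancredi</persName> confronts <persName>Ghismonda</persName>.</p>
        <floatingText>
          <body><p><persName>Guiscardo</persName> meets <persName>Ghismonda</persName>.</p></body>
        </floatingText>
      </div>
    </div>
  </text>;

(: Hypothetical helper: collapse the div types into three edge labels :)
declare function local:edge-type($name as element()) as xs:string {
  if ($name/ancestor::floatingText) then "floatingText"
  else if ($name/ancestor::div[1]/@type = "novella") then "novella"
  else "frame"
};

for $name in $sample//persName
return concat($name, "&#9;", local:edge-type($name))
```

Testing for floatingText first matters here, because a persName inside a floatingText also has a novella div as its nearest div ancestor. Everything that is neither falls through to "frame", which covers the prologue, Day, introduction, conclusion, and epilogue divs.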
When uploading to CourseWeb, make sure you put your two files into a zipped folder or else it will not upload. @Samantha-Mcguigan @ahunker @bsf15 @jonhoranic
@ebeshero I just got back from rehearsal but I think I got it so I'm going to turn it in. Thank you for all your help, I really appreciate it and sorry again if I was being a pain and not understanding something the first time you said it.
@Samantha-Mcguigan You weren't being a pain! I'm glad you figured it out. It's hard working with another team's project files without being on the inside of their code...and I had to remember how they were coding, too!
Just as I was going to download my output to put into Cytoscape, I got a server error on the existDB page. Trying to find out what was wrong, I noticed that my internet connection was disconnected. When I looked into it further, I found the ENTIRE HOUSE is down. I'm scrambling to try to reset things, but hopefully I can get at least half turned in for class and finish it up with debugging. Thankfully, using the LTE on my phone I could send this message out. I will report back once I get the Internet connection back.
I have a semi-stable connection now, so I'll post these files to CourseWeb ASAP. I will be working on the connection issues at a later point.
This is what I have so far and it won't let me eval anything:
I am sure it has something to do with how I have set up my XPath expressions but I'm not sure what could be the problem with them.