ebeshero / DHClass-Hub

a repository to help introduce and orient students to the GitHub collaboration environment, and to support DH classes.

GNU Affero General Public License v3.0


Traversing the Tree and Overlapping Hierarchies #666

Closed ebeshero closed 5 years ago

ebeshero commented 5 years ago

Here is our first Discussion assignment for the semester:

The reading:

- Read Gabrielle Kirilloff, “`<Traversing_the_Tree/>`”
- Check out the (very short) article, “Frankenstein novel analyzed” and scroll through Wendell Piez's conference talk and images for the Balisage Markup Conference 2014. If you like, you can take a close look at his LMNL code of Frankenstein on GitHub.

Special note of interest: Gabi Kirilloff was a student in a digital humanities course at Pitt like the one you are taking, and she originally wrote `<Traversing_the_Tree/>` for a seminar paper assignment in another class.

The discussion prompts:

- What perspective does Kirilloff provide on the kinds of XML markup we are learning, the history and context of hierarchical markup?
- What problems does hierarchical markup pose for encoding documents?
  - What ideas do Kirilloff or Piez present for how to deal with these in code, and how effective or problematic might these be?
  - Is it possible to write XML to "get around" the problems raised in these pieces? What's lost (or gained) in making XML's hierarchical structure deal with overlap?
- Consider the examples of overlapping hierarchies that Kirilloff and Piez present to us: Which of these did you find especially interesting? Are there good ways to "model" overlapping hierarchies with code?

The discussion is worth credit as a homework exercise. Your post should make specific reference to passages in Kirilloff's essay, and reflect on those passages. You should make at least two substantial posts to fully contribute to the discussion. Note: You (as an individual) do not have to respond to every one of the discussion prompts, but our class as a whole should cover them all. You might want to reply to at least two of the prompts in the list above. Raising questions is encouraged, and so is responding to each other, but responding should do more than simply say, "yes, I agree." A good response will add something new to the conversation, or help promote more discussion.

As you're drafting your comments, see if you can apply "Markdown" formatting if you'd like to use bold or italics or make a list, form a link, add an image, etc. Follow the link to "Styling with Markdown is supported" (which you can always find at the bottom left of an Issue write screen) for an orientation to GitHub's Markdown.

i-myers commented 5 years ago

Gabrielle Kirilloff's essay "Traversing the Tree" presents a unique perspective on the history of SGML and XML, which eventually ties into OHCO (ordered hierarchy of content objects) and the problem of overlapping hierarchies. First, she begins her essay by giving background on both SGML (Standard Generalized Markup Language) and the eventual rise of XML (eXtensible Markup Language). Although digital scholars tried to dismiss the problems of both languages, they realized that no matter what they did, any markup would eventually run into overlapping hierarchies. She shows an example of how SGML doesn't allow users to truly customize their projects. However, she then clarifies that XML (because it was released at a later date) allows more customization and, importantly, a lot of room for interpretation.

For example, let's say a scholar was reading text from "Something Wicked This Way Comes" by Ray Bradbury. The markup behind the text allows the scholar to interpret what Bradbury wrote in the book. If we were on chapter 17, for example, where the characters have run into Mr. Dark, the code would be putting in a lot of (s) after each sentence or (p) for a paragraph if it was in SGML. Now, if this same run-in with Mr. Dark was written in XML, the text could be opened up for further analysis. The scholar could categorize the characters and mention prior information, perhaps any dark powers possessed by Mr. Dark and Cooger along with the cast of the Pandemonium Shadow Show in the book.

Now, both languages unfortunately will inevitably run into overlapping hierarchies. The problem is that we simply cannot avoid the way these languages set up their nested structures. As humans we can interpret many different things from varying perspectives because our brains are more organic and have evolved to do such a thing. A computer is an inanimate object, even though it does have a drive that can read them. The problem? Where we are at with technology, the CPU reads things very differently. Even as I type on this keyboard, the CPU doesn't read my mind and immediately put words on the screen. When I press each key, it sends signals which in turn are read by the CPU as letters and numbers that are supposed to represent words and dialogue in the English language. The only way to truly get around this hierarchy is if humanity developed machines that read and act in a different fashion than they do currently.

ebeshero commented 5 years ago

@i-myers Uh oh--looks like half of your post is crossed out. I had a look at the edit view of your post and I think I see why. When you post elements like `<p>` with angle brackets, and you haven't wrapped them in Markdown "tick" marks (backticks), the browser technology doesn't know what to do with them and makes weird errors like this. Take a look at the Markdown tutorial here, and especially the section on "Inline Code": https://guides.github.com/features/mastering-markdown/ The tick mark is probably on the top left corner of your keyboard (to the left of the number one).

ebeshero commented 5 years ago

(Edited above to add link to Markdown tutorial.)

amberpeddicord commented 5 years ago

Gabrielle Kirilloff's "`<Traversing_the_Tree/>`" relays the history and uses of XML (and SGML) to emphasize the perspective of the humanist in the digital humanities. In her introduction section, she mentions a common anxiety among scholars that text encoding and the use of computers in analyzing and interpreting literature will remove any human impact on the original texts themselves. However, Kirilloff dispels this fear in the following sections of her paper.

In the section titled The Implications of Encoding, Kirilloff describes text encoding not as a replacement for interpretation of literature and other texts, but as a way for scholars to document the process of their interpretations for digitally literate readers to understand. Through her use of the example of Blake's Songs of Innocence and Experience, she shows the way that this can work. If an original document has subjective, grey-area instances like punctuation on Blake's poem engravings, it is necessary for editors of the text to document their interpretations.

Her perspective, I suppose, is one that blends the digital perspective with the humanist perspective in a well-thought-out way. Speaking from personal experience as a literature major, there can be a tendency in the humanities to "fear" the digital realm. There could be aversions to the use of encoding as a form of literary analysis, but Kirilloff makes it clear that XML markup is a human-driven, often subjective type of document analysis that has benefits for both the interpreter and the reader alike.

amberpeddicord commented 5 years ago

Overlapping Hierarchies

Piez and Kirilloff present to us several instances of overlapping hierarchies within text encoding. The most interesting, in my opinion, is the way that Piez uses these hierarchies in his interpretation of Frankenstein. Throughout these readings, I was having a slight issue understanding this idea of hierarchies and what the different issues and/or uses for them were. But, when it came to the article about Piez's lecture on Frankenstein, I was better able to understand.

I was fascinated with the way he visualized the complexity of narration in the novel by using a method that Mary Shelley could never have imagined. Through digital means, Piez managed to highlight the way that Shelley focused on her monster and made its narrative possible. Obviously, she would have had no concept of these "hierarchies" as we are discussing them today. However, there was a certain complexity to the way that she was writing that the idea of overlapping hierarchies in XML markup was able to bring out.

Is There a "Good" Way to "Model" Overlapping Hierarchies with Code?

I think Kirilloff, in the final sentence of "`<Traversing_the_Tree/>`", said it best: there is no way to determine the "best" way to do any sort of coding without opening up countless other discussions on intentionality and the need being met with each individual code. From what I've gathered in the few articles we read for this discussion, hierarchies (and XML as a whole) are a very nuanced subject. There is no one correct way to write most code, so I would think that the same would apply to hierarchies and overlapping hierarchies as well. If anything, the correct way to write the code for them would be to write well-documented, easily understood code. So, a "good" way to write it would be one that readers could read and see the interpretive process as well as gain an understanding of the text itself.

haggis78 commented 5 years ago

Both Kirilloff and Piez identify as a shortcoming of XML (and, before it, SGML) that they were constructed with the assumption that texts would always be hierarchical in structure. This is not universally the case, creating the problem of "overlapping hierarchies". Kirilloff gives the reader some useful history on this point, demonstrating that this is an inherited problem due to the fact that SGML was initially developed for technical rather than literary texts, and when XML was created for use with literary texts, the problem was not resolved. As a result, humanities computing folks have just learned to live with it, but it remains a sub-optimal situation.

Kirilloff cites as an example a selection of poetry in which a single word is split between two successive lines. The lines must be marked up in such a way that the line break is recognized, yet the fact that overlapping hierarchies are effectively forbidden would make it difficult to mark up the grammatical content of the poem. One might imagine a similar problem in Rupert Brooke's The Soldier, which begins:

If I should die, think only this of me:
That there's some corner of a foreign field
That is for ever England. There shall be
In that rich earth a richer dust concealed;[...]

There is a sentence break in the middle of line 3, so lines cannot easily be made divisions of sentences in an orderly hierarchy.
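To make the problem concrete, here is a minimal sketch in Python that asks an XML parser whether two versions of these lines are well-formed. The element names `<l>` (line) and `<s>` (sentence) are assumptions chosen for illustration, not tags taken from Kirilloff's essay.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names: <l> marks a verse line, <s> marks a sentence.
# Lines only: every <l> nests cleanly inside <poem>.
lines_only = (
    "<poem>"
    "<l>If I should die, think only this of me:</l>"
    "<l>That there's some corner of a foreign field</l>"
    "<l>That is for ever England. There shall be</l>"
    "<l>In that rich earth a richer dust concealed;</l>"
    "</poem>"
)

# Lines AND sentences: the first sentence ends in the middle of line 3,
# so </s> arrives while <l> is still open and the two hierarchies overlap.
overlapping = (
    "<poem>"
    "<s><l>If I should die, think only this of me:</l>"
    "<l>That there's some corner of a foreign field</l>"
    "<l>That is for ever England.</s> <s>There shall be</l>"
    "<l>In that rich earth a richer dust concealed;</l></s>"
    "</poem>"
)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(lines_only))
print(is_well_formed(overlapping))
```

The parser accepts the first version and rejects the second, because an XML element has to close inside the element in which it opened.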

Piez makes a similar point while looking at texts at the larger level, namely the structural changes that took place during the successive revisions of Frankenstein: "Larger elements lose hierarchy when smaller elements become plentiful". Here whole volumes, sections, and chapters are shuffled around, yet small details of intentional word and line placement are important.

One of the sources with which I work is medieval bishops' statutes for their clergy -- kind of a local rule-book for the church. These are usually divided into chapters, sometimes by the original authors or scribes, and sometimes later on by editors for convenience. But the text may transition from topic A to topic B in the middle of a chapter. This often happens with chapters of the Bible as well, where the chapter breaks, inserted around the year 1200, do not always align well with changes of topic. Both of these are cases in which overlapping hierarchies would be created in markup.

Kirilloff identifies a work-around through the use of self-closing empty element tags. While these may work, they fail to disclose precisely the fact that an overlapping hierarchy has been created by the original author's choices -- and those choices may well be important to understanding the author's intentions.
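A minimal sketch of that work-around, applied to the Brooke lines above: sentences stay as container elements, and each line beginning is marked with an empty element. The names `<s>` and `<lb/>` and the helper `verse_lines` are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <s> wraps a sentence, and the empty element <lb/>
# marks the point where a new verse line begins.
milestone_xml = (
    "<poem>"
    "<s><lb/>If I should die, think only this of me: "
    "<lb/>That there's some corner of a foreign field "
    "<lb/>That is for ever England.</s> "
    "<s>There shall be "
    "<lb/>In that rich earth a richer dust concealed;</s>"
    "</poem>"
)


def verse_lines(root):
    """Rebuild the verse lines by collecting the text that follows each <lb/>.

    Any text that appears before the first <lb/> is ignored.
    """
    lines = []

    def walk(elem):
        if elem.tag == "lb":
            lines.append("")          # a milestone starts a new line
        elif elem.text and lines:
            lines[-1] += elem.text    # text at the start of a container
        for child in elem:
            walk(child)
            if child.tail and lines:
                lines[-1] += child.tail  # text that follows the child

    walk(root)
    return [" ".join(line.split()) for line in lines]


root = ET.fromstring(milestone_xml)
sentences = [" ".join("".join(s.itertext()).split()) for s in root.findall("s")]
lines = verse_lines(root)

for line in lines:
    print(line)
```

The document now parses, but only the sentences are elements in the tree. The lines have to be reassembled by extra processing, which is one way to see the cost described above: the markup no longer states outright that a second hierarchy exists.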

haggis78 commented 5 years ago

(Sorry, I think I accidentally closed my post to comments, and I can't seem to re-open it. If responding, reply to this thread instead. Thanks. -Dr. Campbell)

haggis78 commented 5 years ago

@i-myers I think that you make an excellent point in saying that computers and our brains are not comparable, one being organic and the other an artificial construct. Even converting a handwritten text (or an oral speech act) to a typed one flattens nuance, and sometimes nuance is everything -- as we have all experienced when someone completely misinterprets, say, a sarcastic remark in a social media post because they can't hear our tone of voice. Or a speaker might try to dominate a meeting by choosing to take breaths in the middle of sentences instead of between them, because that prevents other people in the meeting from getting a word in edgewise (without very obviously interrupting). Hey, look, an overlapping hierarchy problem in the Minutes of the Committee!

This really brings home Kirilloff's point that marking up a text does not edit the human out of the equation, since s/he must make a lot of judgment calls.

bairjon commented 5 years ago

`<Traversing_the_Tree/>` relates the uses of XML as well as its history and the perspectives of digital humanities. XML and OHCO are eventually tied together in the problem of overlapping hierarchies. In "The Implications of Encoding", the point is that how a text is coded can affect the way it is interpreted by the scholar. I do not necessarily think that there is a wrong way to document text for the digitally literate reader, although it may change how the content is understood from person to person and, as it is stated, "How the scholar interacts with the process of encoding." From reading this I can also see how coding literacy will increase within the next couple of decades, as more and more people become digitally literate. XML allows more customization now, which will help with hierarchical structure. XML does have a strong hierarchical structure model, but this can be a problem for older text documents, since the markup embeds elements in the text.

lmcneil7 commented 5 years ago

Gabrielle Kirilloff's "`<Traversing_the_Tree/>`" provides insight into the history of digital humanities and emphasizes how important the relationship between scholar and mark-up is to the creation of SGML (Standard Generalized Markup Language), XML (eXtensible Markup Language) and OHCO. First, she begins by asking an essential question: why even bother with coding? She gives insight into criticism or arguments made against digital humanities and provides context for what coding means. Without getting into XML or SGML, she is able to show that computing tools have been essential to us for a while and there's no reason why coding is any different. She also mentions how encoding can be used for analysis by detailing the relationship between the scholar and the mark-up and their effect on each other. Thus, the creation of SGML and XML.

Her focus is providing us with history for a better understanding of coding itself. For instance, not many people use SGML anymore, and while it might seem there's no reason to figure out why, Kirilloff creates a map showing how SGML is the reason that humanists created XML. It's the weaknesses and limitations of SGML that led to XML, but knowing about both actually gives more insight into the minds of the humanists who designed them. Not to mention OHCO is based on SGML and XML's inability to convey overlapping hierarchies. It's being able to convey the history without trying to replace everything that it taught. She mentions that XML gets the job done, and often that's enough. However, humanists are bound by this want of creation that leads to deeper discussions and surveying other options.

ebeshero commented 5 years ago

Nice work so far! Keep the posts coming, and see if you can use Markdown to help you work in some code samples. If you try to represent Gabi Kirilloff's title `<Traversing_the_Tree/>`, which is meant to look like a self-closing element, Markdown has trouble with it because it thinks it's an actual element tag. So use tick marks (look at the guide to Markdown to see how to make those on your keyboard) to make the angle brackets visible! (That is also how you can share code blocks.)

ebeshero commented 5 years ago

Okay, let's break down this concept of OHCO: Ordered Hierarchy of Content Objects. That's what we make when we prepare an XML document, and what I'm commenting on in your homework so far. The way we make that ordered hierarchy can limit the way we represent a document! Where can it cause trouble? Is it really serious trouble? Or something we can deal with?

lmcneil7 commented 5 years ago

It can cause trouble when elements overlap each other in a way that makes it difficult to portray the attributes. I've noticed that when you're doing multiple attributes on an element, in the hierarchy, only the first attribute shows up. This isn't a problem necessarily if you use the same attribute first consistently, however, what if you don't? Or what if the attribute works for one element, but not another? Then the hierarchy doesn't show them similarly. For instance: "character 'main' Boris" and "character 'dog' Popper". They have the same element, but not the same attribute. The hierarchy will show it like this, but it lacks all the information about the characters that might group them together by limiting the element to one attribute. Kirilloff uses this example in "Traversing_the_Tree": eng-ineers. This shows an error in the editing, but it also shows the issue about the hierarchy. Sometimes in order to avoid the error, you add more information to the element that doesn't need to be under that element. While it shows how elements can overlap, there might be information that is hard to distinguish due to how elements are. I personally don't think it's serious trouble because Kirilloff mentioned that there are still issues even about hierarchies and it just offers up a challenge when you code.

ebeshero commented 5 years ago

@lmcneil7 I think you're talking about the Outline view in oXygen here, right? That's not the only way to see the hierarchy, but it is a survey tool that a lot of us use to survey the XML we've coded. The first attribute might be the one that "shows" there, but the others are available, and possibly easier to see in the document editing view using "pretty print" in oXygen. We will be "drilling down" into elements that have particular attributes on them using a technology called XPath in a few weeks, so that what you "see" in the Outline view isn't the only thing you get to help access an XML document. (That's a sort of pun on WYSIWYG: What You See Is What You Get--since the Outline view is kind of a WYSIWYG tool like your document view on your Desktop).
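As a small preview of what "drilling down" can look like, here is a hedged sketch using Python's standard library, which supports only a limited subset of XPath. The attribute names `role` and `species` are invented for this illustration; they are not the attributes from anyone's homework.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the attribute names "role" and "species" are
# assumptions for this example.
cast_xml = (
    "<cast>"
    '<character role="main" species="human">Boris</character>'
    '<character role="pet" species="dog">Popper</character>'
    "</cast>"
)

root = ET.fromstring(cast_xml)

# Every attribute is stored on the element, not only the first one
# that an outline or summary view happens to display.
first = root.find("character")
all_attributes = dict(first.attrib)

# An XPath-style predicate selects elements by any attribute value.
dogs = [c.text for c in root.findall(".//character[@species='dog']")]

print(all_attributes)
print(dogs)
```

Both attributes on the first `<character>` come back, and the query finds Popper by an attribute that is not listed first.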

You make a good point here about how attributes can sort of interrupt the element hierarchy: Sometimes people try to use the same word for an attribute that they use for an element, and that can be confusing because it's hard to tell which concept is more important, the element name or the attribute. In principle the attribute is meant to add some information that the tag name can't convey by itself.

smdunn921 commented 5 years ago

In “`<Traversing_the_Tree/>`” Gabrielle Kirilloff offers a history of XML and SGML: how they are related, how they are used, and the problems that lie within. Even in the introduction, she is already explaining that markup allows us to have more information readily available, though she later stresses that this is all dependent on the reader’s interpretation of the text. I like @bairjon's comment about how coding literacy will increase since more people seem to be switching over and that XML can be helpful since it is widely customizable.

ajw120 commented 5 years ago

In "`<Traversing_the_Tree/>`" Kirilloff is able to establish the direct history of XML along with the ideas of SGML. Doing so allows direct connections to be made between them and allows them to be used in direct circumstances. However, with all direction there is misdirection, meaning that they have issues and can create difficulties along the way. She explains that all information can become accessible to all readers and users of such software, although it truly becomes a matter of how it is interpreted. It's also interesting to see the transitions of software and where XML plays a crucial part within this entire community and how it is so widely usable across an abundance of platforms.

jwa32 commented 5 years ago

In "`<Traversing_the_Tree/>`" it states that "SGML was developed to encode texts, its original goals were very different from the way humanities scholars now use XML. SGML was intended to facilitate the sharing and storage of large-project documents in law, government, and industry." Both SGML (Standard Generalized Markup Language) and XML (eXtensible Markup Language) are very similar in that both are used to gather text and information in an easy-to-read format. However, SGML was used mainly for law, government, and industry documents in the 1980s and was much more difficult to read. XML is much easier to use because of its use of end tags and attributes that clearly show where text begins and ends.

smdunn921 commented 5 years ago

> Okay, let's break down this concept of OHCO: Ordered Hierarchy of Content Objects. That's what we make when we prepare an XML document, and what I'm commenting on in your homework so far. The way we make that ordered hierarchy can limit the way we represent a document! Where can it cause trouble? Is it really serious trouble? Or something we can deal with?

While XML is easier to use because it is easier to customize, ordered hierarchies can, in fact, cause us to run into issues of overlapping hierarchies. A problem with ordered hierarchies can arise when trying to, in reference to your example in class today, put tags on each line of a poem to show that they are in fact lines. If you put tags around each line in this way, you are then unable to put tags around anything that extends beyond that line. The computer will think that the closing tag for the line is what you want for the tags you're using for the big chunk, say (again in reference to your example) a quote. Kirilloff makes mention of this when she states "there are cases in which overlapping hierarchies are created by objects that self-overlap, or overlap with more of the same type of object." You cannot tag big chunks of text, if you already have other tags within them, without running into an issue. As shown in class, there are self-closing tags that can be a solution, but different problems can arise with different solutions. They are manageable, but we can't expect to have them completely bug-free anytime soon, if ever--nothing is perfect.
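One way to picture a self-closing-tag solution for the quote case is a pair of empty markers, one where the quotation starts and one where it ends. This is only a sketch under stated assumptions: the marker names `<qStart/>` and `<qEnd/>`, the helper `quoted_text`, and the two placeholder verse lines are all invented for illustration and are not the example from class.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <l> wraps each line, while the empty elements
# <qStart/> and <qEnd/> mark where a quotation begins and ends.
# The verse text is placeholder wording made up for this example.
poem_xml = (
    "<poem>"
    "<l>She said, <qStart/>the first line ends here</l>"
    "<l>and the quotation carries on<qEnd/> she said.</l>"
    "</poem>"
)


def quoted_text(root):
    """Collect the text between <qStart/> and <qEnd/>, across line elements."""
    parts = []
    inside = False

    def walk(elem):
        nonlocal inside
        if elem.tag == "qStart":
            inside = True
        elif elem.tag == "qEnd":
            inside = False
        elif inside and elem.text:
            parts.append(elem.text)      # text at the start of an element
        for child in elem:
            walk(child)
            if inside and child.tail:
                parts.append(child.tail)  # text that follows the child

    walk(root)
    return " ".join(" ".join(parts).split())


root = ET.fromstring(poem_xml)
print(quoted_text(root))
```

The lines stay as proper elements and the document is well-formed, but the quotation exists only as the stretch between two markers, so any tool that wants the quote has to walk the text to recover it.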
