haggis78 / BreconChurch

Files for our DH project on Henry VIII's Letter Patent founding Brecon Collegiate Church in Wales.

Semester project evaluations #37

Open ebeshero opened 4 years ago

ebeshero commented 4 years ago

Greetings, Brecon Team! I'll be posting our own in-house evaluation of the Brecon project soon in a markdown file to your repo, but this project was so ambitious and unusual as to call for some outside help in the review process. I asked our friend @djbpitt, who is himself a scholar of medieval texts, to respond to the Brecon project and offer some advice for next steps, especially considering that @haggis78 will be talking about this project at a conference in the near future. He wrote back with the following comments and suggestions:


The project is generally of high quality with respect to research, use of technology, and design, and I am both happy and not at all surprised to read that it will be traveling to conferences in the future. Here are a few thoughts about areas that could be improved, in no particular order, which I hope the developers will find useful:

  1. I ran the site through https://datayze.com/site-validator.php, and most of the "errors" were actually warnings, which can be ignored. But some matter, e.g., the @alt attribute is missing from many <img> elements, and the project-team about page omits an equal sign in <a href"https://www.greensburg.pitt.edu/academics/center-digital-text"> (which raises several error messages, but they are all side effects of this one mistake, which they should have caught when authoring the page in <oXygen/>).

  2. The ubiquitous use of the term "anonymous block" in the write-up is a mistake because it is TEI jargon that will not be familiar to subject-area specialists. Likewise the reference to <app> elements on the graphs page. It's sensible to describe how the analysis was performed using markup terms, but I'd suggest segregating that perspective, which isn't helpful for subject-area users who are not familiar with markup.

  3. The colons after the section headings on http://brecon.newtfire.org/html/history.html are odd. When formatted as headings, as they are, they should omit the colon. If they were inline labels at the beginning of a paragraph, the colon would be appropriate.

  4. Links like "click here" (at the bottom of http://brecon.newtfire.org/html/history.html, and elsewhere) are poor style because the text should make narrative sense on its own. The team can get rid of both "click" and "here" by changing it to something like "for more information about these documents see our 'About these documents' page", the last part of which would be a clickable link. "No page should ever say 'click here'" would be good general advice to project teams.

  5. http://brecon.newtfire.org/html/Glossary.html might look better with HTML list markup (<ul>, or perhaps even <dl>, with CSS to modify the formatting if the default doesn't look right, for the first part). 


  6. The timeline would be easier to use if it were horizontal and compressed to fit without scrolling. The ellipses would look better if they were wider.

  7. The graph page says that "there are significantly more variations in anonymous blocks 6 and 13", but a) "significantly" is a technical term, and they don't explain how it is significant, and b) 11 and 12 have higher counts than 6, so I'm confused about why 6 was singled out. I think they mean that it's a local peak, although that seems to be true of 3, as well. It's good that they acknowledge that raw counts can be misleading, although I don't understand the meaning of "slightly" where they write that this "skewed our results slightly". Ultimately, they've identified a place for improvement, and I don't think the improvement will be difficult for them to implement once they decide what to do (which, I realize, may not have been possible within the artificial time constraint of the fifteen-week academic semester). Are variants per anonymous block (whether length-normalized or not) the best measure of variation hot spots? What if the <ab> boundaries are ignored and the density of variation is measured continuously from the beginning to end of the edition? Or is <ab> a meaningful subdivision from the perspective of philological analysis?

  8. The stemma is lovely, and the textual writeup is clear and sophisticated. The diagonal values are all the same, so the developers can remove the note to themselves about debugging on the textual writeup page, but the table stretches horizontally, and should be styled to make it easier to read. Color-shading the table, as in a heatmap, might also help users see past the wall of numbers, which can be difficult to understand at a high level, however important the counts are for a lower-level perspective. A stemma is typically based on patterns of agreement that distinguish shared innovations from shared retentions, and the extent to which the developers have made that distinction isn't clear to me. The distinction may not affect the outcome, but it still matters methodologically, and sort of distinguishes a dendrogram based on similarity or distance, on the one hand, from a stemma, which is a hypothesis not just about patterns of agreement, but also about genealogy. While the table shows counts of agreements, it doesn't show clusters, that is, not just how often do A and B, B and C, and A and C agree, but also how often those are the same agreements. Some sort of social network visualization, which can represent relationships above the pair level, might be helpful here.

  9. Vertical spacing is often uncomfortably large; this is easy to see on the page with source descriptions because of the short line and paragraph lengths there. As noted above, headers on their own lines should not be followed by colons; inline labels are properly followed by colons, but the colons should be followed by spaces, and they aren't on this page. URLs should be live links. Each entry in the list of sources could incorporate a live link to the transcription of that source. This internal linking is part of a "digital workstation" perspective (which some call a "rabbit hole").

  10. The separate transcription pages of the individual witnesses should provide access to the bibliographic / codicological metadata, so that users don't have to switch to the separate sources page to remind themselves of which witness is which. Highlighting areas subject to variation is sensible, although insofar as simple color highlighting is unobtrusive, they might want to leave it always on, instead of using a checkbox toggle. They could enrich the edition by making the variants accessible from the individual reading views, perhaps on mouseover or in a dynamic sidebar panel to avoid cluttering the page for those who just want to read the witness by itself. Those variants could be linked to the other witnesses (more "rabbit hole"), although that would take some planning, since how best to do it is not immediately obvious (well, to me). 

  11. Given the fairly small number of witnesses, would an interlinear collation ("alignment table" in CollateX terms) be helpful? I used that as the primary visualization of variation in the PVL, and while it may not be suitable for all manuscript traditions, it seems as if it might be a good option for this one.

  12. The prose description on the anonymous block comparison page is difficult to understand, even for me. They may have written that at the last moment, when the meaning was clear to them because it was their work, but although there's a lot of data on this page, it isn't communicated well. I don't know what "string count" means. Character count? If so, is it white-space normalized? Is it the entire <ab>, or does an <ab> contain multiple strings? The bar charts aren't well integrated; they aren't aligned with the sections to which they correspond, and if you click both of the checkboxes and zoom all the way out, it looks as if there is more information than will fit on the page horizontally. When I click the two checkboxes repeatedly in different orders I often wind up with more than two columns of bar charts, although because I have to zoom out to see them, I can't tell whether there is repetition or wrapping or something else going on. This page should include interpretation, that is, what the tables and charts tell us that is philologically interesting. Otherwise it's sort of a data dump.
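To make the questions in items 7 and 12 concrete, here is a minimal sketch of one possible reading of "string count" and of length normalization. It assumes that "string count" means a white-space-normalized character count of a whole <ab>, which the project pages do not confirm. The function names and the sample values are hypothetical, not taken from the Brecon data.

```python
import re


def normalized_length(text: str) -> int:
    """Character count after collapsing runs of white space and trimming both ends."""
    return len(re.sub(r"\s+", " ", text).strip())


def variants_per_100_chars(variant_count: int, text: str) -> float:
    """Variant count scaled by block length, so long and short blocks are comparable."""
    length = normalized_length(text)
    if length == 0:
        return 0.0
    return 100.0 * variant_count / length


# Hypothetical blocks: label -> (block text, raw variant count).
blocks = {
    "ab-short": ("x" * 50, 3),
    "ab-long": ("x" * 400, 8),
}

for label, (text, count) in blocks.items():
    rate = variants_per_100_chars(count, text)
    print(f"{label}: raw={count}, per 100 chars={rate:.1f}")
```

In this invented example the longer block has the higher raw count but the lower rate, which is the kind of reversal that raw counts alone can hide.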

I hope this is helpful! All in all, it's a terrific project.
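As an illustration of the point in item 8 about agreements above the pair level, the following sketch counts how often each exact group of witnesses shares a reading at a variation locus. The sigla, the readings, and the `agreement_groups` function are hypothetical. The sketch only tallies agreement; it does not attempt to separate shared innovations from shared retentions, which would need a philological judgment about each locus.

```python
from collections import Counter, defaultdict


def agreement_groups(loci):
    """Count, across loci, each set of two or more witnesses that share a reading.

    `loci` is a list of dicts mapping a witness siglum to its reading at that locus.
    """
    counts = Counter()
    for readings in loci:
        by_reading = defaultdict(set)
        for witness, reading in readings.items():
            by_reading[reading].add(witness)
        for group in by_reading.values():
            if len(group) >= 2:
                counts[frozenset(group)] += 1
    return counts


# Hypothetical loci for three witnesses A, B and C.
loci = [
    {"A": "x", "B": "x", "C": "y"},
    {"A": "x", "B": "x", "C": "x"},
    {"A": "p", "B": "q", "C": "q"},
]

for group, n in agreement_groups(loci).items():
    print("+".join(sorted(group)), n)
```

Group counts of this kind could feed a network or set visualization, since a three-way agreement is recorded as one group rather than as three separate pairs.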


@haggis78 @amberpeddicord @ChinoyIndustries @alnopa9 @KSD32

ebeshero commented 4 years ago

The interlinear collation that @djbpitt mentions above is represented here on his "PVL" project on the Russian Primary Chronicle: http://pvl.obdurodon.org/pvl.html It's basically a collation table that runs line by line with the different witnesses represented as stacked. I think we discussed this early on, but it may be worth revisiting now.
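For anyone who wants to picture the stacked layout, here is a rough sketch of the formatting step only. It assumes the witnesses have already been aligned token by token (for example, by a collation tool), with an empty string marking a gap. The `interlinear` function, the sigla, and the sample tokens are all hypothetical and are not drawn from the PVL or Brecon data.

```python
def interlinear(alignment, gap="-"):
    """Format pre-aligned witnesses as stacked rows with padded columns.

    `alignment` maps each siglum to a list of tokens; all lists must have the
    same length, and an empty string stands for a gap in that witness.
    """
    sigla = list(alignment)
    lengths = {len(tokens) for tokens in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all witnesses must have the same number of columns")
    n = lengths.pop()
    # Each column is as wide as its longest token (or the gap marker).
    widths = [max(len(alignment[s][i] or gap) for s in sigla) for i in range(n)]
    label_width = max(len(s) for s in sigla)
    rows = []
    for s in sigla:
        cells = [(alignment[s][i] or gap).ljust(widths[i]) for i in range(n)]
        rows.append(s.ljust(label_width) + "  " + " ".join(cells).rstrip())
    return "\n".join(rows)


# Hypothetical alignment of two witnesses.
sample = {
    "A": ["the", "old", "church"],
    "B": ["the", "", "chirche"],
}
print(interlinear(sample))
```

With the witnesses stacked this way, a reader can scan down a column to see where they diverge.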

djbpitt commented 4 years ago

Try http://pvl.obdurodon.org/browser.xhtml

ebeshero commented 4 years ago

Here is my evaluation of the project, now posted as a markdown file to this repo: https://github.com/haggis78/BreconChurch/blob/master/breconEvaluation-12-2019.md

@haggis78 @amberpeddicord @alnopa9 @ChinoyIndustries @djbpitt @KSD32

ebeshero commented 4 years ago

I’ve just updated my project evaluation with some more suggestions on the comparative reading view.