Conal-Tuohy / VMCP-upconversion

Ferdinand von Mueller's correspondence upconversion from MS Word to TEI XML
Apache License 2.0

Hypertext style #21

Closed LucasHorseshoeBend closed 7 years ago

LucasHorseshoeBend commented 7 years ago

I had a look at a sample of the letters in the set created by final status and hypertext style. I was expecting 24 hits, and there were 25. However, the near match is a coincidence. The ones I expected were like http://vmcp.conaltuohy.com/xtf/view?docId=tei/1890-6/1893/93-05-27a-final.xml

Ones that I wasn't expecting included accidental links left in when editing, for example http://vmcp.conaltuohy.com/xtf/view?docId=tei/1850-9/1857/57-03-05-final.xml (see fn 3).

If you are able to distinguish between cases where the text contains http:// &c. and those where it doesn't, it would facilitate clean-up. While links like the second example are not a fatal problem, they might create an expectation in the reader that will not be met. If this is difficult, there are few enough to work on manually.

There also seemed to have been some false positives: http://vmcp.conaltuohy.com/xtf/view?docId=tei/1850-9/1850/50-02-18-final.xml

Conal-Tuohy commented 7 years ago

The 'false positive' actually does contain a hyperlink, in footnote 10.

LucasHorseshoeBend commented 7 years ago

Damn, missed it even though I looked a number of times.

Conal-Tuohy commented 7 years ago

Yes, certainly, I can add a feature to distinguish hyperlinks whose display text is just a URL from those where the text of the hyperlink is a label (not just the URL).

Could you please clarify the point about the reader's expectation? Just so I understand the issue involved here. I am assuming that you will want to ensure that any hyperlink has a textual label, and avoid the problems associated with what Wikipedia calls "bare URLs".

NB on the web, a hyperlink is normally displayed only as a textual label (with some kind of highlighting such as an underline to show that it is a link), but when displayed in other media (such as in print), the same hyperlink could be rendered in full, with the URL also spelled out, as a formal citation, e.g.

Stenochilus <http://www.ipni.org/ipni/idPlantNameSearch.do?id=27205-1&back_page=%2Fipni%2FeditSimplePlantNameSearch.do%3Ffind_wholeName%3DSte*chil*%26output_format%3Dnormal>
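The difference between the two renderings can be sketched in a few lines of Python. This is only an illustration of the idea, not code from this repository: the function name `render_link`, the `medium` argument, and the example.org URL are all assumptions made up for the sketch.

```python
from html import escape


def render_link(label, url, medium):
    """Render one hyperlink for a given medium (hypothetical helper).

    "web":   only the label is shown, wrapped in an HTML anchor.
    "print": the label is followed by the full URL in angle brackets,
             in the style of a formal citation.
    """
    if medium == "web":
        return '<a href="%s">%s</a>' % (escape(url, quote=True), escape(label))
    if medium == "print":
        return "%s <%s>" % (label, url)
    raise ValueError("unknown medium: %r" % medium)


# Placeholder URL, not a real IPNI record.
print(render_link("Stenochilus", "http://example.org/name/27205-1", "web"))
print(render_link("Stenochilus", "http://example.org/name/27205-1", "print"))
```

A link whose label is itself the URL would come out in print as the URL twice, which is one practical reason to prefer a human-readable label.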

LucasHorseshoeBend commented 7 years ago

Readers' expectations: This is not a technical concern, although dead links could add to the problem. My concern is that a "thinking" reader who finds that some plant names in the text are linked (and they will mostly be plant names, I think) will wonder what the significance is of this name having a link and the thousands of others not. That is, they may try to fathom what the editors had in mind, when really the editors had just been careless! So I need to remove those instances.

Bare URLs. I take the point of the Wikipedia essay, and will check that we cite correctly so that the reference can be recovered.

Because of the nature of the judgements, it will need editorial inspection to remedy, hence my request to distinguish between the two classes of hypertext style.

Conal-Tuohy commented 7 years ago

OK, I have split the "hyperlink" value in the "Features" facet into two distinct values: "Bare hyperlink" (the text of the link starts with http:) and "Labelled hyperlink" (the text of the link is something else; presumably a human-readable label). How's that?
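For anyone wanting to reproduce the split outside the site, the rule can be sketched roughly as follows. This is a minimal Python illustration, not the actual conversion code: it assumes hyperlinks are encoded as TEI `<ref target="...">` elements, and the names `classify_link` and `link_features` are hypothetical. It applies the rule exactly as stated, so link text starting with https: would count as labelled.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def classify_link(display_text):
    """Return the facet value for a hyperlink, given its display text."""
    text = (display_text or "").strip()
    # Rule as stated in the comment above: link text starting with "http:".
    if text.startswith("http:"):
        return "Bare hyperlink"
    return "Labelled hyperlink"


def link_features(tei_xml):
    """Collect the facet values for every <ref> with a target in a document."""
    root = ET.fromstring(tei_xml)
    features = set()
    for ref in root.iter("{%s}ref" % TEI_NS):
        if ref.get("target"):  # assumption: links are <ref target="...">
            features.add(classify_link("".join(ref.itertext())))
    return features


# Made-up sample document with one link of each kind.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<p><ref target="http://example.org/a">http://example.org/a</ref></p>
<p><ref target="http://example.org/b">Stenochilus</ref></p>
</body></text></TEI>"""

print(sorted(link_features(sample)))
```

A letter containing both kinds of link would carry both facet values, so it would appear in both result sets.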

LucasHorseshoeBend commented 7 years ago

I've looked at a sample of each, and it's an excellent guide. Thanks.