bodleian / medieval-mss

Medieval Manuscripts in Oxford Libraries: TEI catalogue descriptions
https://medieval.bodleian.ox.ac.uk

q and lb elements #196

Closed holfordm closed 5 years ago

holfordm commented 5 years ago

I would like to suggest a stricter interpretation (more in keeping with the guidelines) of these elements than I think current practice reflects, at least for the medieval catalogue, but possibly for other catalogues as well, which has implications for the convert2html process.

<q> (quoted) contains material which is distinguished from the surrounding text using quotation marks or a similar method - this should generate ‘ and ’ around the text (+0145 and +0146). It should be used, e.g. for quoting the text of inscriptions and pressmarks in the provenance section.

<lb> (line beginning) marks the beginning of a new (typographic) line in some edition or version of a text. This should typically be used in transcribing incipits, explicits and similar; it is equivalent to the | symbol typically used for this purpose in printed catalogues, and should be rendered by that symbol in the html.

If we agree to implement these changes, they will need me to do some tidying up in the xml before the schema/convert2html are updated.
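A minimal Python sketch of the rendering proposed above may help make it concrete. It is not the project's convert2html XSL: the function name `render` is hypothetical, it assumes elements are in the standard TEI namespace, and padding the | with spaces is an illustrative choice.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"  # standard TEI namespace

def render(elem):
    """Flatten a TEI fragment to display text.

    q is wrapped in curly single quotes; lb becomes ' | '.
    (Hypothetical helper, not part of convert2html.)
    """
    if elem.tag == TEI + "lb":
        return " | "  # lb is empty; its tail is appended by the parent
    parts = []
    is_q = elem.tag == TEI + "q"
    if is_q:
        parts.append("\u2018")  # opening curly quote
    parts.append(elem.text or "")
    for child in elem:
        parts.append(render(child))
        parts.append(child.tail or "")
    if is_q:
        parts.append("\u2019")  # closing curly quote
    return "".join(parts)
```

If the source already has whitespace around an lb, the padded | would produce doubled spaces, so a real transform would probably normalise whitespace as well.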

andrew-morrison commented 5 years ago

So far, the q element has only been used in Medieval. The XSL transforms them into HTML span elements, but no CSS is set up to display them differently. Quote marks could be inserted by the XSL, although there is always the possibility of doubling if someone enters them in the TEI too.

The script I wrote to extract a simplified XML for conversion to RDF for the MMM project looks for quote marks in provenance elements and regards them as inscriptions. Let me know if you change those to use q elements instead. A @type attribute would further clarify things.
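As a rough illustration of how such an extraction could cope with both encodings, the sketch below prefers q elements (carrying an optional @type) and falls back to scanning for curly quote pairs. It is not the actual MMM script; the function name `inscriptions` and the sample @type value are assumptions.

```python
import re
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
# Curly single-quote pairs only: straight apostrophes are too ambiguous to match safely.
QUOTED = re.compile("\u2018([^\u2018\u2019]+)\u2019")

def inscriptions(provenance):
    """Return (text, type) pairs for quoted material in a provenance element.

    Uses q elements when present; otherwise falls back to quote marks in the
    plain text, in which case type is None. (Hypothetical helper.)
    """
    found = [("".join(q.itertext()), q.get("type"))
             for q in provenance.iter(TEI + "q")]
    if found:
        return found
    text = "".join(provenance.itertext())
    return [(match, None) for match in QUOTED.findall(text)]
```

Preferring q over quote marks means a record is read consistently once it has been converted, and the fallback keeps unconverted records working in the meantime.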

andrew-morrison commented 5 years ago

Apart from Medieval, lb has been used by @camformig in some South Asian records he has just uploaded. So he might wish to comment. For example, this file, in which they are numbered.

The lb element is currently transformed into a HTML br, which tells browsers to display a line break. It would be possible to use | in Medieval and something else in other catalogues.

camformig commented 5 years ago

In the South Asian catalogue there would be no need to signal the line breaks by means of a glyph, as they are already marked manually with [1r2] etc. In theory, we could also use a hyphen for the SA catalogue, but by no means the | symbol, since it is used in the transliteration to represent a very common type of punctuation symbol in South Asian scripts.

ahankinson commented 5 years ago

I think @andrew-morrison made a good point about double-encoding. If the q element is used to mark up quotations then there would be no need to also use the quote marks. These can be inserted in the HTML transform or even added as a css style.

ahankinson commented 5 years ago

@camformig I think the suggestion is that the line breaks are indicated with the TEI element, not a glyph. They can then be rendered in the catalogue in whichever way makes sense for your content. This would effectively replace the 'traditional' use of the | character.

In your case, it would make the encoding here somewhat redundant:

https://github.com/bodleian/south-asian-mss/blob/master/collections/MS-Chandra-Shum-Shere/ms-chandra-shum-shere-d-247.xml#L84

In theory, there is no need to include [128v21] in the text, since you could include this information in the <locus> attributes.

I also note that in this same file there is the use of the <quote> element, instead of the <q> element.

https://github.com/bodleian/south-asian-mss/blob/master/collections/MS-Chandra-Shum-Shere/ms-chandra-shum-shere-d-247.xml#L144

Is this correct for the consolidated schema?

holfordm commented 5 years ago

Agreed re. double-encoding - existing quote marks would be replaced. Good idea about use of @type.

quote vs. q: my reading of the guidelines is that quote is for quoted passages or phrases from a known author, text, etc., q is more generally for anything that would appear in quote marks.

Rendering of the lb element may vary according to whether it is used in the manuscript description or the text transcription, as well as from catalogue to catalogue. We could use the rend attribute to specify different displays, e.g. rend="verticalLine" for medieval manuscript description.
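One way to picture the rend idea is a lookup from rend value to output, with the existing br behaviour as the default. This is only a sketch: `LB_OUTPUT`, `DEFAULT_LB` and `render_with_breaks` are hypothetical names, "verticalLine" is the only value mentioned here, and the text is not HTML-escaped as a real transform would need.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical mapping from @rend values to output strings.
LB_OUTPUT = {"verticalLine": " | "}
DEFAULT_LB = "<br/>"  # current behaviour: an HTML line break

def render_with_breaks(elem):
    """Render the direct content of elem, choosing lb output from its @rend."""
    out = [elem.text or ""]
    for child in elem:
        if child.tag == TEI + "lb":
            out.append(LB_OUTPUT.get(child.get("rend"), DEFAULT_LB))
        else:
            out.append("".join(child.itertext()))
        out.append(child.tail or "")
    return "".join(out)
```

A catalogue that must avoid | (as in the South Asian case) would simply never set rend="verticalLine", or could map its own value to a different glyph.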

camformig commented 5 years ago

@ahankinson I realize now that I should have been clearer in my comment, apologies. I use lb for the machine and [XrY] or [XvY] for humans. I always think in terms of both encoding and rendering, so the line numbers are for the humans and the lbs for the machine. Why don't I use locus then? Because it is tedious and long to encode, and unless I am sure that the viewer in which the images are displayed will zoom at line level, I don't see the point in spending time on it. If you can assure me that the viewer will zoom at line level, I can change my way of encoding. Otherwise, since I always use the same pattern, if you adopt a viewer with line-level zoom in the future, it will be possible to replace all combinations of lb[XrY] or lb[XvY] with a locus element in one go using a regex.
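The one-pass regex replacement described above could look roughly like the following Python sketch. The exact shape of the markers in the files and the target markup are assumptions: here an lb followed by [XrY] or [XvY] becomes a numbered lb plus a locus with @from and @to, and the function name `markers_to_locus` is hypothetical.

```python
import re

# An lb element (with or without attributes) followed by a [folio side line] marker,
# e.g. <lb/>[128v21] = folio 128, verso, line 21.
LB_MARKER = re.compile(r"<lb\b[^>]*/>\s*\[(\d+)([rv])(\d+)\]")

def markers_to_locus(xml_text):
    """Replace every lb + [XrY]/[XvY] pair with a numbered lb and a locus element."""
    def repl(match):
        folio = match.group(1) + match.group(2)  # e.g. "128v"
        line = match.group(3)                    # e.g. "21"
        return (f'<lb n="{line}"/>'
                f'<locus from="{folio}" to="{folio}">{folio}{line}</locus>')
    return LB_MARKER.sub(repl, xml_text)
```

Because the pattern only matches an lb immediately followed by a bracketed marker, any lb without one is left untouched.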

My use of quote is in line with the TEI guidelines, and I have never used q until now because I didn't need it. I agree with all that has been said regarding both q and quote, but I suggest using @type instead of @rend with q, since the latter refers to the rendition of quotations in the source text.

holfordm commented 5 years ago

@andrew-morrison I've begun changing ' and ‘’ to q where relevant; it will take a while to complete. The end result should be a lot easier/more interesting to parse for provenance evidence.

holfordm commented 5 years ago

There are no instances of double encoding remaining, so we can go ahead and update the html transformation (without which, the records I have converted will look peculiar when they are reindexed)
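A check for remaining double encoding could be sketched as below. It is not the check that was actually run: the function name `double_encoded` is hypothetical, and it only catches quote marks inside a q element, not marks left just outside it.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
# Straight and curly, single and double quote characters.
QUOTE_CHARS = "'\"\u2018\u2019\u201c\u201d"

def double_encoded(root):
    """List the text of q elements that still begin or end with a quote mark."""
    hits = []
    for q in root.iter(TEI + "q"):
        text = "".join(q.itertext()).strip()
        if text and (text[0] in QUOTE_CHARS or text[-1] in QUOTE_CHARS):
            hits.append(text)
    return hits
```

A quotation that legitimately ends in an apostrophe would be reported too, so the results are candidates for review rather than certain errors.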

andrew-morrison commented 5 years ago

I'll add a template for q to insert quote marks.

What is the decision on lb? Should I leave it inserting HTML br elements, but override that for Medieval only?

holfordm commented 5 years ago

That would work for me. I'm not exactly sure what the desired output in Sanskrit would be.

camformig commented 5 years ago

It's fine for me as well.

andrew-morrison commented 5 years ago

Medieval-only lb handling now on QA. Here are the affected records:

http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_10000
http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_10004
http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_10019
http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_10176
http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_1430
http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_3939
http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_6590
http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_675
http://medieval-qa.bodleian.ox.ac.uk/catalog/manuscript_984

holfordm commented 5 years ago

Something is not quite right about how the q elements are rendering: the closing apostrophe appears lower than the opening apostrophe, e.g. https://github.com/bodleian/medieval-mss/issues/196, in the Hand(s) section.

andrew-morrison commented 5 years ago

It is lower, but that is the style in which apostrophes are rendered by the font (PT Sans?) selected for normal paragraph text by the CSS. The XSL change I made above simply inserts the characters copied and pasted from your first comment above (but, thanks to GitHub's CSS, those are rendered in a monospace font.) I could change it to insert straight single quotes instead, but that would differ from the instances where curved quotes have been entered into the TEI.