TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

the `<affiliation>` element appears somewhat abused; `<bio>` would be better #1686

Closed bansp closed 6 years ago

bansp commented 6 years ago

While my first instinct was to treat the source of this report as a mild and practically harmless case of tag abuse (aaalways close to my heart), I got encouraged to post this as a ticket for the sake of potential improvement of the jTEI tagset. So here it goes.

The <affiliation> is suggested to be the container for author biography statements. Its current definition reads "contains an informal description of a person's present or past affiliation with some organization, for example an employer or sponsor.", which, it seems to me, is somewhat at ODDs with the concept of a biography, as long as it is agreed that a bio is more than merely a list of affiliations. An author bio relevant for the usual text that may be encoded with the jTEI tagset (article, conference abstract, essay?) may mention affiliations, but if so, it should also most often mention the position or role in which that person serves in the institution at hand, and may also mention this person's professional (and non-professional but still relevant to the topic) interests, describe their background (rather important!), mention their achievements, and may even go as far as to add that they are "a father of four, currently stationed in a suburb of Valetta, Malta" (is the last bit relevant to a scientific article? most probably not; can it be relevant to the subject matter of an essay? it can; are such fragments attested? oh yes; do they have anything to do with the current definition of the <affiliation> element? ...)

I would like to suggest either that the definition of <affiliation> is somehow convincingly extended to be able to handle the various possible departures from a strict enumeration of affiliations (and let me quickly add that I don't believe that that can be done convincingly), or that an element <bio> is added for biographical statements.


Annex

The following are examples taken from jTEI bio statements that, I understand, are encoded as <affiliation> but do not quite conform to the definition and perhaps would be better encoded as parts of <bio>:

laurentromary commented 6 years ago

In #542 I had suggested having the biography information (<occupation>) as a child of <author>. This could be an answer for jTEI at least.

arojascastro commented 6 years ago

I think we need a bio element. At the moment, I am encoding prosopographic information about mythological characters, and I am using <occupation> to encode a short description, but I think <bio> would be better. For instance:

Dido

<person xml:id="did029">
  <persName>Dido</persName>
  <occupation>En fuentes griegas y romanas, también fue conocida como Elisa de Tiro; aparece como la fundadora y primera reina de Cartago, en el actual Túnez. Su fama se debe principalmente
    al relato incluido en la <hi rend="italic">Eneida</hi> del poeta romano Virgilio. Era hija del rey de Tiro, Matán I, también llamado Belo. Dido tenía dos hermanos: Pigmalión, que heredó
    el trono de Tiro, y la pequeña Anna Perenna.</occupation>
</person>

Cupid

<person xml:id="cup026">
  <persName>Cupido</persName>
  <occupation>Cupido (llamado también Amor en la poesía latina) es, en la mitología romana, el dios del deseo amoroso. Según la versión más difundida, es hijo de Venus, la diosa del amor, la
    belleza y la fertilidad, y de Marte, el dios de la guerra. Se le representa generalmente como un niño alado, con los ojos vendados y armado de arco, flechas y aljaba. Su equivalente en
    la mitología griega es Eros.</occupation>
</person>

<occupation> should be restricted to work, and <affiliation> to any relation established between a person and an organization.

lb42 commented 6 years ago

Surely <affiliation>, <occupation>, and others are all intended to be used to record information about a single occupation, affiliation, etc.? Which suggests that using any of them to enclose a short summary of a person's bio or salient characteristics would be abusive: such a summary would typically contain or reference several affiliations/occupations. So I think there's a clear use case for providing a new <shortBio> element. Or using the existing <summary> element perhaps.

arojascastro commented 6 years ago

Agreed. shortBio, summary or even description would work. But since it is prosopographic information I prefer shortBio or bio.

lb42 commented 6 years ago

What should the content model of shortBio be? Should it contain affiliation and occupation as well as paragraphs?

sydb commented 6 years ago

But the name <shortBio> (or <bio>, since there’s no particular reason it would be short) applies only to prosopographic information. The same describe-the-thing-in-prose structure should be available to <place>, <org>, and <event>, too (and perhaps others). Indeed, these 3 elements already allow <desc> children. So perhaps <person> should, too. Cf. #367.

lb42 commented 6 years ago

I don't know why <desc> and <summary> are missing as children of <person>, but I don't think the proposed <shortBio> is quite the same thing. A description of a person might well include aspects (e.g. physical appearance) which wouldn't be included in the <shortBio>, certainly not as posited by the OP. One can imagine something analogous to a shortBio for a place, though not perhaps an event.

martindholmes commented 6 years ago

I prefer <bio>, because there's no reason to impose an arbitrary restriction on length, no way to know what "short" actually means, and if it is "short" by some arbitrary definition, that's perfectly obvious to an encoder or a processor anyway.

The bulk of the discussion is about what happens inside <person>, but the use of <affiliation> Piotr is (rightly) complaining about is in the <author> element. We would need to make sure that <bio> is available in both. The content model of <author> is macro.phraseSeq, so either we're adding <bio> to that (any reason why not?), or complicating the content model of <author> (and presumably <editor> too).

laurentromary commented 6 years ago

I perfectly agree with Martin on both points. That would help a lot.

arojascastro commented 6 years ago

In my opinion, <bio> may contain <persName>, <placeName>, <orgName>, <occupation> and <affiliation> (and even <event>) more or less like <floruit>. This is useful if you are building a simple prosopographic description that only contains the name of the person and a short bio in prose (for instance, a list of mythological characters mentioned in a poem).

However, please do not change the content model of <occupation> and <affiliation>, because you may have a more structured prosopography with no <bio> in prose but rather a list of <event>s, <occupation>s, <affiliation>s, and other elements.

lb42 commented 6 years ago

I suggested "shortBio" rather than "bio" because the original use case was not to contain a full biography, nor even a full prosopographic entry, but rather the sort of biographical note typically associated with a publication (e.g. "Lou Burnard is now on gardening leave"). Maybe "biogNote" would be a better name. Then it could be a member of model.noteLike, which would make it available in all the places it's been requested so far, and avoid confusing it with <person>, which is what I would use for a canonical prosopographic description.

martindholmes commented 6 years ago

After thinking about Lou's comment, I wonder if the jTEI problem could be solved by simply using a <note type="bio"> instead of <affiliation>?

jamescummings commented 6 years ago

I agree with @martindholmes ... What is a bio except a short note giving the biography of the person? To me I've used note inside person for both editorial notes and biographical notes and just distinguished by type attribute.
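
A minimal sketch of the practice described here, reusing the did029 entry from earlier in the thread. The type values "bio" and "editorial" are assumed, project-defined labels rather than anything the TEI prescribes, and the note texts are illustrative (the first is an English paraphrase of the original description, the second is invented):

```xml
<person xml:id="did029">
  <persName>Dido</persName>
  <!-- Biographical note: prose summary of the person -->
  <note type="bio">Founder and first queen of Carthage, known chiefly
    from the account in Virgil's <hi rend="italic">Aeneid</hi>.</note>
  <!-- Editorial note: hypothetical remark by the encoder about this entry -->
  <note type="editorial">Hypothetical editorial remark about this entry.</note>
</person>
```

A processor can then tell the two kinds of note apart by the value of @type alone.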

martindholmes commented 6 years ago

I'd like to get @rvdb 's view on this -- Ron, what do you think? I'm convinced that we're abusing <affiliation>, and I think <note type="bio"> is the preferred solution so far; do you see any issues with jTEI encoding and processing here?

arojascastro commented 6 years ago

I do not think a <note type="bio"> is the same as a <bio>. A <note> is a very generic element. I have nothing against using <note>, but when you have a huge prosopographic description with hundreds of elements, you usually end up with a TEI file full of <note>s because the TEI does not cover all the aspects that you have (and it does not have to cover them all, of course). But I think it is good to provide an element with a semantic value if possible and limit the use of <note> to things that are too specific to a particular project.

bansp commented 6 years ago

It wouldn't be the 1st time for the TEI to have specialized elements in parallel with typed generic elements available as well, think of <gram> as one example.

rvdb commented 6 years ago

A small aside concerning the jTEI issue: I admit that the jTEI use of <affiliation> is based on its interpretation in the OpenEdition documentation as an element for "author description" (which in turn may have been interpreted too broadly by us again), whereas <orgName> is documented as the OpenEdition element for expressing an author's "affiliation". Of course, this is no excuse and we should come to a satisfactory solution.

Concerning a short-term solution: I agree that <note type="bio"> could provide a ready fix. Am I right in assuming that <affiliation> is then a suitable container for the <roleName> someone takes on in an <orgName>? E.g.:

<note type="bio">Laurent Romary is <affiliation><roleName>Directeur de Recherche</roleName> 
  at <orgName>Inria</orgName>, France</affiliation>, <affiliation><roleName>director 
  general</roleName> of the European infrastructure <orgName>DARIAH</orgName></affiliation>, 
  and <affiliation><roleName>guest scientist</roleName> at the <orgName>Centre Marc 
  Bloch</orgName> and the <orgName>Academy of Sciences in Berlin</orgName></affiliation>. He 
  carries out research on the modeling of semi-structured documents, with a specific emphasis on texts 
  and linguistic resources. <!-- further loose biographic description --></note>

martindholmes commented 6 years ago

Thinking of the broader use of the jTEI schema, I think I would like to keep <affiliation> as a sibling to <note type="bio"> for the purpose of the one-line institutional affiliation which is often used instead of a bio or as part of a byline. There's no reason not to use it inside the <note> as well, of course. People may have (or have had) multiple affiliations, but often want to be identified as belonging to one specific institution by default.

bansp commented 6 years ago

In conference-related and similar contexts, that one institution may actually insist on being singled out, indeed.

laurentromary commented 6 years ago

I would definitely keep the possibility to separate the two issues (affiliation, bio), even if a bio refers to elements related to affiliations (as in the above-mentioned example, which omits some possible affiliations of mine). The argument is that there are many situations where journals or conferences do not have bios at all.

martindholmes commented 6 years ago

OK, so the next question would be what content-models the proposed <bio> element needs to be part of. I assume it needs to be a child of <person>, <author> and <editor>; one complication is the fact that roles similar to <author> and <editor> are handled with <respStmt>, which has a much simpler structure; in that case, are we expecting people to link to a <person> element which has the <bio>?
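
One way the linking approach raised in this question might look, sketched with <note type="bio"> standing in for the not-yet-existing <bio>. The name "Jane Doe", the xml:id "jdoe", and the note text are hypothetical:

```xml
<!-- In the bibliographic description: the role and a pointer to the person -->
<respStmt>
  <resp>translated by</resp>
  <name ref="#jdoe">Jane Doe</name>
</respStmt>

<!-- Elsewhere, e.g. inside a <listPerson>: the entry that carries the bio -->
<person xml:id="jdoe">
  <persName>Jane Doe</persName>
  <note type="bio">Hypothetical biographical statement about Jane Doe.</note>
</person>
```

The @ref on <name> keeps <respStmt> simple while still letting a processor find the biographical statement.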

lb42 commented 6 years ago

(a) I think the element should be called biogNote
(b) I think it should be a member of model.noteLike

FrederikeNeuber commented 6 years ago

Excuse me if I just drop into the discussion. I also have a question about <affiliation> in jTEI, even if it's a little bit different from the discussion before. If you think it might be better to open a separate thread, please let me know. The Institute for Documentology and Editing is currently using a customized TEI schema for the contributions in the digital journal RIDE (ride.i-d-e.de). In the future, we would like to switch to the jTEI schema, but information that our schema currently covers should not disappear. We have already tested new ways to represent the current information in jTEI. However, one piece of information that the jTEI schema does not cover and that we currently provide is a <placeName> within <affiliation>. Right now, we're encoding the affiliation of an author as follows:

          <affiliation>
            <orgName>Royal Irish Academy</orgName>
            <placeName>Dublin, Ireland</placeName>
          </affiliation>

With jTEI it would only be possible to encode it as follows:

          <affiliation>
            <orgName>Royal Irish Academy</orgName>
          </affiliation>

I think the inclusion of <placeName> in <affiliation> would make sense. What do you think?

martindholmes commented 6 years ago

@FrederikeNeuber this makes perfect sense; please open a separate ticket for it. We're anxious to keep developing the jTEI schema so that it's more generally useful.

martindholmes commented 6 years ago

To add to @lb42 's suggestion:

(a) I think the element should be called biogNote
(b) I think it should be a member of model.noteLike
(c) I think it should be a member of att.typed (brief bio, full bio, academic bio)
(d) I think it should be a member of att.datable (bio from this date to that date, bio notBefore today)

sydb commented 6 years ago

I’m still not entirely convinced by @arojascastro ’s argument that <biogNote> is sufficiently superior to <note type="bio"> that a new TEI element is warranted.

But if we do create a new element then

(a) <biogNote> seems fine
(b) A member of model.noteLike? I.e., allowed as a child of <app>, <classCode>, <fw>, <metDecl>, and <surfaceGrp>? Surely part of the argument for making a new element is that we can allow it in intelligent places and not in silly ones.
(c) att.typed seems like a good idea
(d) att.datable seems a little dicier. Is the duration specified by @from/@to the period of a person’s life the biography covers, or the era in which the biography was written? E.g., a bio of Henry Kissinger written in 2005 would have a very different take on the end of the Vietnam war than one written today, now that we know he deliberately extended the war to get Nixon elected. And what does <biogNote notBefore="2016-01-01" notAfter="2016-12-31"> mean? (In general, @notBefore and @notAfter specify a range of dates within which the date of interest occurred; i.e., it’s not a specification of a period or duration.)
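
The distinction drawn in (d) can be illustrated with the existing <date> element, which already carries the att.datable attributes. The element content below is illustrative only:

```xml
<!-- @from/@to: a period or duration; something held throughout 2016 -->
<date from="2016-01-01" to="2016-12-31">throughout 2016</date>

<!-- @notBefore/@notAfter: a single point whose exact date is unknown,
     but which falls somewhere within 2016 -->
<date notBefore="2016-01-01" notAfter="2016-12-31">at some point in 2016</date>
```

On a biographical note, neither pair says by itself whether it describes the period the biography covers or the time at which it was written, which is the ambiguity raised above.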

arojascastro commented 6 years ago

Let me put it in a different way: why do we have a <floruit> element rather than a <note type="floruit">? I think the content model should be similar. However, if people are not 100% sure about this change, I can use one more <note>.

martindholmes commented 6 years ago

@arojascastro I think we have <floruit> because it reflects a tradition of labelling a specific date-range with that word; it's syntactic sugar for <date type="floruit"> rather than <note type="floruit">.
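
A sketch of the equivalence suggested here, with invented dates. The type value "floruit" on <date> is an assumed label, not a TEI-defined one:

```xml
<!-- Dedicated element, available directly inside <person> -->
<floruit from="1580" to="1610">active 1580 to 1610</floruit>

<!-- Roughly equivalent typed generic element, as it might appear in running prose -->
<date type="floruit" from="1580" to="1610">active 1580 to 1610</date>
```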

@martinascholger I don't think there's any consensus on what to do here; we either need to create the new <biogNote> element, in which case Council needs to approve it officially, or we need to recommend instead the use of <note type="bio">, and adopt that in jTEI. So I think Council needs to rule on that issue, then we can move forward. I'll set this to Needs Discussion so Council can talk about it, and unassign myself; once the decision is made, feel free to assign it back to me to implement.

alex-bia commented 6 years ago

This ticket has been discussed by the Council on Feb 25th 2018, with the following conclusions:

  1. <affiliation> should be used just to determine an affiliation, avoiding abuses.
  2. <placeName> has already been added to <affiliation>. See ticket #1692
  3. For biographical information in contexts like jTEI, use <note type="bio">.
  4. Biography in personography is a new and bigger issue. If current elements are not enough a new ticket should be opened.
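
Taken together, conclusions 1 to 3 might lead to an author encoding along the following lines. This is a sketch only: the person, organization, place, and email address are hypothetical, and the exact content allowed inside <author> depends on the jTEI schema version in use:

```xml
<author>
  <name><forename>Jane</forename> <surname>Doe</surname></name>
  <!-- One affiliation only, now with an optional place name -->
  <affiliation>
    <orgName>Example University</orgName>
    <placeName>Exampletown, Ireland</placeName>
  </affiliation>
  <!-- Free-prose biographical statement, kept out of <affiliation> -->
  <note type="bio">Jane Doe is a lecturer in digital humanities whose
    research focuses on scholarly editing.</note>
  <email>jane.doe@example.org</email>
</author>
```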