clarin-eric / parla-clarin

Schema for modelling parliamentary debates
https://clarin-eric.github.io/parla-clarin/

Change to addressees, questions and answers (from AKN) #8

Open TomazErjavec opened 4 years ago

TomazErjavec commented 4 years ago

In Sec. 8.1.2. "Converting addressee, role, questions and answers" (from AKN), we have the (here slightly simplified) AKN example

<speech by="#khalwale" to="#speaker">
  <p>Mr. Speaker, Sir, I beg to give notice of the following Motion:-</p>
   ...
</speech>

and state that "TEI does not have attributes directly corresponding to the AKN ... @to". Our solution then uses <relation> to encode the addressee of the question as follows:

<u who="#khalwale">
  <seg>Mr. Speaker, Sir, I beg to give notice of the following Motion:-</seg>
  ...
  <listRelation>
    <relation name="directedTo"
              active="#khalwale" passive="#speaker"/>
  </listRelation>
</u>

However, this is wrong, as TEI does have an attribute "directly corresponding to the AKN ... @to", i.e. the attribute @toWhom, which was defined and added to various elements (including <u>) in 2018 (cf. https://github.com/TEIC/TEI/issues/1679). (@TomazErjavec is very sorry that he overlooked this addition)

So, our solution needs to be changed to the much simpler:

<u who="#khalwale" toWhom="#speaker">
  <seg>Mr. Speaker, Sir, I beg to give notice of the following Motion:-</seg>
</u>

The above is a simple change. However, we have another example, which is, in AKN, the following:

<question eId="question_1" by="#kappa" to="#ministerEducation">
  <p>I would like to ask the Minister for Education about ...</p>
</question>
<answer eId="answer_1" by="#eta" as="#ministerEducation">
  <p>Mr. Speaker, BNAT was the only umbrella professional body for...</p>
</answer>

which is encoded (also in https://clarin-eric.github.io/parla-clarin/#sec-qa) as:

<u xml:id="question_1" who="#kappa">
  <seg>Mr. Kappa asked the Minister for Education and ...</seg>
  <listRelation>
    <relation name="directedTo" active="#kappa" passive="#ministerEducation"/>
  </listRelation>
</u>
<u xml:id="answer_1" who="#eta" ana="#ministerEducation">
  <seg>Mr. Speaker, BNAT was the only umbrella professional body for ...</seg>
  <listRelation>
    <relation name="questionAnswer" active="#question_1" passive="#answer_1"/>
  </listRelation>
</u>

The main problem here is that we still do not encode that an utterance is a question. One could take directedTo to imply a question, but that is dubious, since an utterance can be directed to somebody without being a question.

The only solution that I can come up with is to assume that we have a taxonomy, defining various types of utterances (such as questions and answers) and then point to it via the @ana attribute. The conversion to TEI would then be:

<u xml:id="question_1" who="#kappa" toWhom="#ministerEducation" ana="#question">
  <seg>I would like to ask the Minister for Education about ...</seg>
</u>
<u xml:id="answer_1" who="#eta" ana="#ministerEducation #answer" toWhom="#kappa">
  <seg>Mr. Speaker, BNAT was the only umbrella professional body for ...</seg>
  <listRelation>
    <relation name="questionAnswer" active="#question_1" passive="#answer_1"/>
  </listRelation>
</u>

The downside is that any conversion from AKN needs to have this taxonomy "hard wired", i.e. inserted automatically into the TEI, and that the @ana attribute then has several values, which is unpleasant to parse.
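For illustration, the hard-wired taxonomy could look roughly like the following. This is only a sketch: the category IDs "question" and "answer" are chosen to match the @ana pointers used in the example above, while the taxonomy ID "utterance-types" and the wording of the descriptions are assumptions made here, not something taken from the Parla-CLARIN documentation. It would be placed in the <encodingDesc> of the TEI header.

```xml
<classDecl>
  <!-- Hypothetical taxonomy of utterance types; the ID is an assumption -->
  <taxonomy xml:id="utterance-types">
    <!-- Target of ana="#question" -->
    <category xml:id="question">
      <catDesc>An utterance that poses a question</catDesc>
    </category>
    <!-- Target of ana="#answer" -->
    <category xml:id="answer">
      <catDesc>An utterance that answers a question</catDesc>
    </category>
  </taxonomy>
</classDecl>
```

Note that in the example above "#ministerEducation" in @ana refers to a role rather than an utterance type, so it would not resolve into this taxonomy; a parser of @ana has to tell the two kinds of pointer apart.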

Also note that the TEI encoding in fact contains more information than the original AKN, as it explicitly links the question and the answer, whereas AKN does this only implicitly, via the ordering and adjacency of the speeches (the answer immediately follows the question).

I have now (in 700a93e) changed the documentation as suggested above, but can of course change it again, in case anyone has a better suggestion.

What I did not do is to change the AKN2TEI conversion, and I would ask @andrejPancur to do that (if he agrees with the proposed solution, of course).