clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

GB Feedback #619

Closed matyaskopp closed 1 year ago

matyaskopp commented 1 year ago

speeches quantity

The speeches quantity is not counted during ParlaMint finalization, and it does not necessarily have to equal the number of <u> elements: https://github.com/matthewcoole/ParlaMint/blob/8adde61ed73da3c337e064f849933a5d7544744a/Data/ParlaMint-GB/ParlaMint-GB_2021-04-12-commons.ana.xml#L23

            <measure unit="speeches" quantity="0">0 speeches</measure>

corpus timespan

The data is expected to extend to 2022.

https://github.com/matthewcoole/ParlaMint/blob/8adde61ed73da3c337e064f849933a5d7544744a/Data/ParlaMint-GB/ParlaMint-GB.xml#L47

            <bibl>
               <title type="main">Parliament of Great Britain: Daily sessions</title>
               <publisher>Houses of Parliament</publisher>
               <idno type="URI">https://hansard.parliament.uk/</idno>
               <date from="2015-01-05" to="2021-02-25">2015-01-05 - 2021-02-25</date>
            </bibl>

https://github.com/matthewcoole/ParlaMint/blob/8adde61ed73da3c337e064f849933a5d7544744a/Data/ParlaMint-GB/ParlaMint-GB.xml#L188

         <settingDesc>
            <setting>
               <name type="city">London</name>
               <name type="place">Westminster</name>
               <date from="2015-01-01" to="2021-03-31"/>
            </setting>
         </settingDesc>

setting country

Add the setting country, as is done here: https://github.com/clarin-eric/ParlaMint/blob/729dd572332bcd11412b8d20c6a9e94c9c8bf097/Data/ParlaMint-UA/ParlaMint-UA.xml#L119-L125

government + lower house + upperhouse

There should be two types of organizations.

  1. Government of the United Kingdom / His Majesty's Government
  2. Parliament of the UK, which consists of the House of Commons and the House of Lords

But your data mixes the two: the organization has the role "government", yet it is named as the parliament and lists parliamentary terms: https://github.com/matthewcoole/ParlaMint/blob/8adde61ed73da3c337e064f849933a5d7544744a/Data/ParlaMint-GB/ParlaMint-GB.xml#L197-L216

<org xml:id="PoGB" role="government">
   <orgName full="yes">Parliament of the United Kingdom of Great Britain and Northern Ireland</orgName>
   <listEvent>
      <event xml:id="PoGB.54" from="1900-01-01" to="2010-05-18">
         <label>Fifty-fifth Parliament of the United Kingdom</label>
      </event>
      <event xml:id="PoGB.55" from="2010-05-18" to="2015-03-30">
         <label>Fifty-fifth Parliament of the United Kingdom</label>
      </event>
      <event xml:id="PoGB.56" from="2015-05-27" to="2017-05-03">
         <label>Fifty-sixth Parliament of the United Kingdom</label>
      </event>
      <event xml:id="PoGB.57" from="2017-06-21" to="2019-11-06">
         <label>Fifty-seventh Parliament of the United Kingdom</label>
      </event>
      <event xml:id="PoGB.58" from="2019-12-17">
         <label>Fifty-eighth Parliament of the United Kingdom</label>
      </event>
   </listEvent>
</org>

This should probably be moved to the lower house.

Furthermore, two of the terms look strange (both the labels and the timespans):

      <event xml:id="PoGB.54" from="1900-01-01" to="2010-05-18">
         <label>Fifty-fifth Parliament of the United Kingdom</label>
      </event>
      <event xml:id="PoGB.55" from="2010-05-18" to="2015-03-30">
         <label>Fifty-fifth Parliament of the United Kingdom</label>
      </event>

A government usually has a different timespan from the parliamentary terms. Also, I have no idea how the House of Lords works: does it have the same terms as the House of Commons?

last affiliation date

The current term did not end today (2023-03-14), so @to should be removed: https://github.com/matthewcoole/ParlaMint/blob/8adde61ed73da3c337e064f849933a5d7544744a/Data/ParlaMint-GB/ParlaMint-GB.xml#L350-L353

                  <affiliation from="2019-12-12"
                               ref="#parliament.HC"
                               role="member"
                               to="2023-03-14"/>
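
One way to clean this up could be sketched as follows. This is only an illustration, not part of any ParlaMint tooling: the helper name drop_build_date_end, the sample dates of the first affiliation, and the assumption that the file uses the TEI namespace are all mine.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # assumption: the corpus files use the TEI namespace
NS = {"tei": TEI_NS}


def drop_build_date_end(root, build_date):
    """Remove @to from every affiliation whose end date equals build_date.

    Hypothetical helper: it treats an end date equal to the day the data
    was generated as "still ongoing". Returns the number of attributes removed.
    """
    removed = 0
    for aff in root.findall(".//tei:affiliation[@to]", NS):
        if aff.get("to") == build_date:
            del aff.attrib["to"]
            removed += 1
    return removed


# Made-up sample: the first affiliation is illustrative, the second mirrors the one quoted above.
SAMPLE = f"""<person xmlns="{TEI_NS}">
  <affiliation from="2017-06-08" to="2019-11-06" ref="#parliament.HC" role="member"/>
  <affiliation from="2019-12-12" to="2023-03-14" ref="#parliament.HC" role="member"/>
</person>"""

root = ET.fromstring(SAMPLE)
removed = drop_build_date_end(root, "2023-03-14")
affs = root.findall("tei:affiliation", NS)
print(removed, [a.get("to") for a in affs])
```

Note that an affiliation that really did end on the build date would lose its @to as well, so the cleaner fix is probably to avoid writing the date when generating the data.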

meeting should correspond with parliament, not government

https://github.com/matthewcoole/ParlaMint/blob/8adde61ed73da3c337e064f849933a5d7544744a/Data/ParlaMint-GB/ParlaMint-GB_2021-04-12-commons.ana.xml#L13

<meeting n="58" corresp="#parliament.HC" ana="#parla.lower #parla.term #PoGB.58"/>

add meeting sitting

Add a meeting element that contains information about the sitting, e.g.:

<meeting n="2021-04-12" corresp="#parliament.HC" ana="#parla.lower #parla.sitting"/>

for this file: https://github.com/matthewcoole/ParlaMint/blob/8adde61ed73da3c337e064f849933a5d7544744a/Data/ParlaMint-GB/ParlaMint-GB_2021-04-12-commons.ana.xml
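
A rough sketch of how such an element could be derived from the file name is given below. It assumes the date always follows the first underscore in the file name, as in the example above; the function name sitting_meeting and its default arguments are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def sitting_meeting(filename, corresp="#parliament.HC", house="#parla.lower"):
    """Build a sitting-level <meeting> element from a ParlaMint-style file name.

    Assumption: the sitting date appears as _YYYY-MM-DD in the file name.
    """
    match = re.search(r"_(\d{4}-\d{2}-\d{2})", filename)
    if match is None:
        raise ValueError(f"no date found in file name: {filename}")
    return ET.Element(
        "meeting",
        {
            "n": match.group(1),
            "corresp": corresp,
            "ana": f"{house} #parla.sitting",
        },
    )


element = sitting_meeting("ParlaMint-GB_2021-04-12-commons.ana.xml")
print(ET.tostring(element, encoding="unicode"))
```

For a House of Lords file, the corresp and house arguments would need the corresponding upper-house references, whatever they are called in the GB data.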

matthewcoole commented 1 year ago

I think I've made most of these changes now in https://github.com/clarin-eric/ParlaMint/pull/617.

For the earlier Parliaments/Governments, I haven't had time to add data for all of them, so there are catch-all entries. These were added just to eliminate the warnings for affiliations that people had listed before 2015. (I can add full details for these before we submit the final corpus if necessary.)

For the number of speeches, I can't think of other ways to do this besides counting <u>. The only other thing that could be counted is the number of debates?

TomazErjavec commented 1 year ago

For the number of speeches, I can't think of other ways to do this besides counting <u>.

There was some debate about counting only those <u> where the speaker changes (as a speech by one person could be split in processing), but we (i.e. I) in fact never implemented that, so just counting <u> elements is fine.
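
To make the difference between the two counts concrete, here is a small sketch. It is not the finalization code; it assumes the TEI namespace and that the speaker is recorded in @who on each <u>, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # assumption: the corpus files use the TEI namespace
U_TAG = f"{{{TEI_NS}}}u"


def count_utterances(root):
    """Count every <u> element (the approach that was actually adopted)."""
    return sum(1 for _ in root.iter(U_TAG))


def count_speaker_turns(root):
    """Count only those <u> elements where @who differs from the previous <u>."""
    turns = 0
    previous = object()  # sentinel that never equals a real @who value
    for u in root.iter(U_TAG):
        who = u.get("who")
        if who != previous:
            turns += 1
        previous = who
    return turns


# Made-up sample: speaker B's speech is split into two consecutive <u> elements.
SAMPLE = f"""<body xmlns="{TEI_NS}">
  <u who="#A">One.</u>
  <u who="#B">Two, part one.</u>
  <u who="#B">Two, part two.</u>
  <u who="#A">Three.</u>
</body>"""

root = ET.fromstring(SAMPLE)
n_u = count_utterances(root)
n_turns = count_speaker_turns(root)
print(n_u, n_turns)
```

In this sample the plain count gives 4 and the speaker-change count gives 3, because the split speech by #B is merged into one turn.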

matyaskopp commented 1 year ago

@matthewcoole, the ticks are updated. Some easy-to-fix issues remain.

@TomazErjavec, Can you include copying the number of <u> occurrences if //measure[@unit="speeches"]/@quantity = 0 into your finalization script? I think it makes sense.

TomazErjavec commented 1 year ago

@TomazErjavec, Can you include copying the number of <u> occurrences if //measure[@unit="speeches"]/@quantity = 0 into your finalization script? I think it makes sense.

It does. Done now in devel, 2d9f602.
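
For readers who want to reproduce the fallback locally, a minimal sketch of the idea follows. It is not the code from the commit mentioned above; it assumes the TEI namespace and an English-only measure text, and the function name fill_speech_count is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # assumption: the corpus files use the TEI namespace
NS = {"tei": TEI_NS}


def fill_speech_count(root):
    """Copy the number of <u> elements into any 'speeches' measure whose quantity is 0.

    Returns the number of <u> elements found.
    """
    n_u = len(root.findall(".//tei:u", NS))
    for measure in root.findall(".//tei:measure[@unit='speeches']", NS):
        if int(measure.get("quantity", "0")) == 0:
            measure.set("quantity", str(n_u))
            # Assumption: the element text mirrors the quantity in English.
            measure.text = f"{n_u} speeches"
    return n_u


# Made-up sample with a zero speeches quantity and three <u> elements.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader><measure unit="speeches" quantity="0">0 speeches</measure></teiHeader>
  <text><body><div>
    <u who="#A" xml:id="u1">First.</u>
    <u who="#B" xml:id="u2">Second.</u>
    <u who="#B" xml:id="u3">Third.</u>
  </div></body></text>
</TEI>"""

root = ET.fromstring(SAMPLE)
count = fill_speech_count(root)
measure = root.find(".//tei:measure[@unit='speeches']", NS)
print(count, measure.get("quantity"), measure.text)
```

A non-zero quantity is left untouched, so a corpus that supplies its own speech count keeps it.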