welfare-state-analytics / riksdagen-corpus

Swedish parliamentary proceedings - Riksdagens protokoll 1867-today

fix: (sample) fix split introductions #430

Closed by BobBorges 8 months ago

BobBorges commented 8 months ago

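Fix speaker introductions that were incorrectly split across two consecutive `<note>` elements by merging each pair back into a single `<note type="speaker">`. A sample of the changes will follow.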

Closes #429
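
For reviewers, here is a minimal sketch of the kind of merge this fix performs. It is not the script that produced this PR: `merge_split_intros`, `is_continuation` and `join_intro` are hypothetical names, and the continuation heuristic (a following note that does not start with "Anf." and is either `type="speaker"` or begins with "replik") is an assumption inferred from the sampled diffs. The sketch only handles directly adjacent notes, so it does not cover pairs separated by a `<pb/>`, and it drops a trailing line-break hyphen naively, which would be wrong for genuinely hyphenated names such as HJELM-WALLÉN.

```python
import xml.etree.ElementTree as ET

# Clark-notation key under which ElementTree stores the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local(tag):
    """Return the tag name without any namespace prefix ('' for non-elements)."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def norm(text):
    """Collapse all runs of whitespace, including line breaks, to single spaces."""
    return " ".join((text or "").split())


def is_continuation(note):
    """Assumed heuristic: the second half of an intro never starts a new 'Anf.'."""
    text = norm(note.text)
    if not text or text.startswith("Anf."):
        return False
    return note.get("type") == "speaker" or text.lower().startswith("replik")


def join_intro(first, second):
    """Join two halves; a trailing hyphen is treated as a line-break hyphen."""
    first, second = norm(first), norm(second)
    if first.endswith("-"):
        return first[:-1] + second  # naive: also removes genuine name hyphens
    return f"{first} {second}"


def merge_split_intros(root):
    """Merge each speaker note with a directly following continuation note.

    The first note keeps its attributes (including xml:id); the second note
    is removed. Returns the number of merges performed.
    """
    merged = 0
    for parent in list(root.iter()):  # snapshot, since the tree is modified
        children = list(parent)
        i = 0
        while i + 1 < len(children):
            cur, nxt = children[i], children[i + 1]
            if (
                local(cur.tag) == "note"
                and cur.get("type") == "speaker"
                and local(nxt.tag) == "note"
                and is_continuation(nxt)
            ):
                cur.text = join_intro(cur.text, nxt.text)
                parent.remove(nxt)
                children.pop(i + 1)
                merged += 1
            else:
                i += 1
    return merged


if __name__ == "__main__":
    sample = (
        '<div><note xml:id="a" type="speaker">\n  Anf. 9 Utrikesminister STEN ANDERSSON\n</note>'
        '<note xml:id="b" type="speaker">\n  (s):\n</note><u/></div>'
    )
    root = ET.fromstring(sample)
    merge_split_intros(root)
    print(root.find("note").text)
```

Keeping the first note's `xml:id` and discarding the second matches what the sampled diffs show, but whether any other file refers to the discarded ids is not something this sketch checks.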

BobBorges commented 8 months ago

Sampled changes

corpus/protocols/199091/prot-199091--019.xml

Diff starting from line 6329

@@ -6374,12 +6329,7 @@
               mitt krav på återremiss av ärendet till konstitutionsutskottet.
             </seg>
           </u>
-          <note type="speaker" xml:id="i-PpeJiz3xsu6VeUYLZpJ18C">
-            Anf. 78 ELISABETH FLEETWOOD (m)
-          </note>
-          <note xml:id="i-AoQ2zhu9Qk6J8jrtMNbk12">
-            replik:
-          </note>
+          <note type="speaker" xml:id="i-PpeJiz3xsu6VeUYLZpJ18C">Anf. 78 ELISABETH FLEETWOOD (m) replik:</note>
           <u xml:id="i-4253bd90736d046a-615" who="Q4948276" next="i-4253bd90736d046a-616">
             <seg xml:id="i-4Yz5jnaWqfGmnm7yUTe7Uc">
               Fru talman! Låt mig börja med att säga, eftersom

corpus/protocols/199091/prot-199091--090.xml

Diff starting from line 1964

@@ -2014,12 +1964,7 @@
           <note xml:id="i-6JrwPZo7htLqQqbtsFuGD7">
             8 § Svar på fråga 1990/91:500 om postutdelning nattetid
           </note>
-          <note xml:id="i-2cPCXrYswYUc2dBdTTAet1" type="speaker">
-            Anf. 39 Kommunikationsminister GEORG
-          </note>
-          <note type="speaker" xml:id="i-RMDrq41yBcwScRNphUt5oi">
-            ANDERSSON (s):
-          </note>
+          <note xml:id="i-2cPCXrYswYUc2dBdTTAet1" type="speaker">Anf. 39 Kommunikationsminister GEORG ANDERSSON (s):</note>
           <u xml:id="i-0aa692f356fe3ede-205" who="Q5554537" next="i-0aa692f356fe3ede-206">
             <seg xml:id="i-3S8jznvdNGqncdhxJx6eFD">
               Herr talman! Sigrid Bolkéus har frågat mig om försöket med postutdelning

corpus/protocols/199091/prot-199091--095.xml

Diff starting from line 939

@@ -944,12 +939,7 @@
               situation i FN?
             </seg>
           </u>
-          <note xml:id="i-9iPN5TQFKAjsHTkKpDK6aa" type="speaker">
-            Anf. 9 Utrikesminister STEN ANDERSSON
-          </note>
-          <note xml:id="i-LkGebsaDgxjrfgBH3p2WpJ" type="speaker">
-            (s):
-          </note>
+          <note xml:id="i-9iPN5TQFKAjsHTkKpDK6aa" type="speaker">Anf. 9 Utrikesminister STEN ANDERSSON (s):</note>
           <u xml:id="i-ead05e5b976eb4eb-103" who="Q1346478" next="i-ead05e5b976eb4eb-104">
             <seg xml:id="i-SF7XYNEAE7RBXYJoXYXs4m">
               Herr talman! Jag vill ta fasta på det som Gullan

corpus/protocols/199192/prot-199192--015.xml

Diff starting from line 2307

@@ -2357,12 +2307,7 @@
             7 § Svar på interpellation 1991/92:1 om regionalpolitiken och
             Norrbotten
           </note>
-          <note xml:id="i-Dt7A9BUxU1Cm4jMce43wUK" type="speaker">
-            Anf. 22 Arbetsmarknadsminister BÖRJE
-          </note>
-          <note type="speaker" xml:id="i-DQjbeN7GaGPeBai9Tv3nTa">
-            HÖRNLUND (c):
-          </note>
+          <note xml:id="i-Dt7A9BUxU1Cm4jMce43wUK" type="speaker">Anf. 22 Arbetsmarknadsminister BÖRJE HÖRNLUND (c):</note>
           <u xml:id="i-c29002b104fc9c44-246" who="Q5005171" next="i-c29002b104fc9c44-247">
             <seg xml:id="i-EpUUTRfaaWq3CyR8DbZLG3">
               Fru talman! Sten-Ove Sundström har frågat mig

corpus/protocols/199192/prot-199192--037.xml

Diff starting from line 3068

@@ -3258,12 +3068,7 @@
               gäller åtgärder för att förbättra situationen.
             </seg>
           </u>
-          <note xml:id="i-X1wKbjrttnWLNRghMAEggG" type="speaker">
-            Anf. 82 Näringsminister PER WESTERBERG
-          </note>
-          <note xml:id="i-3qnT7cxZLc2CCZw1vWEsij" type="speaker">
-            (m):
-          </note>
+          <note xml:id="i-X1wKbjrttnWLNRghMAEggG" type="speaker">Anf. 82 Näringsminister PER WESTERBERG (m):</note>
           <u xml:id="i-02adb08626132d2e-375" who="Q1375199" next="i-02adb08626132d2e-376">
             <seg xml:id="i-5pMhNUBK7A3qSzv8Ruxabp">
               Fru talman! Regeringen kommer att vara passiv och kommer icke

corpus/protocols/199192/prot-199192--057.xml

Diff starting from line 4006

@@ -4126,12 +4006,7 @@
               i finansplanen.
             </seg>
           </u>
-          <note xml:id="i-SSHWHgFSxoJ2JvikyH9i2t" type="speaker">
-            Anf. 64 Finansminister ANNE WIBBLE
-          </note>
-          <note xml:id="i-UYtHShS6z8GevxkfynbyBQ" type="speaker">
-            (fp):
-          </note>
+          <note xml:id="i-SSHWHgFSxoJ2JvikyH9i2t" type="speaker">Anf. 64 Finansminister ANNE WIBBLE (fp):</note>
           <u xml:id="i-79722b13fe37ec5a-437" who="unknown" next="i-79722b13fe37ec5a-438">
             <seg xml:id="i-GDbHJhfhDpXaQzfJrA8DBH">
               Herr talman! Kvar står, vilket jag har sagt ett par gånger, att

corpus/protocols/199192/prot-199192--070.xml

Diff starting from line 9352

@@ -9392,12 +9352,7 @@
               kunna ta ställning och följa upp den fortsatta debatten.
             </seg>
           </u>
-          <note xml:id="i-2qkF9a9SqVd73McsAtWLQd" type="speaker">
-            Anf. 60 Statsrådet ULF DINKELSPIEL
-          </note>
-          <note xml:id="i-MKtryX679vgq388hHPUb2x" type="speaker">
-            (m):
-          </note>
+          <note xml:id="i-2qkF9a9SqVd73McsAtWLQd" type="speaker">Anf. 60 Statsrådet ULF DINKELSPIEL (m):</note>
           <u xml:id="i-2d9aeaa84848d41a-1150" who="Q5622212" next="i-2d9aeaa84848d41a-1151">
             <seg xml:id="i-N2rSgXuN4ZEXaUyMJFPRgE">
               Herr talman! Låt mig säga till Bengt Dalström att jag delar uppfattningen

corpus/protocols/199192/prot-199192--094.xml

Diff starting from line 15238

@@ -15283,12 +15238,7 @@
               Herr talman! Jag yrkar bifall till utskottets hemställan.
             </seg>
           </u>
-          <note xml:id="i-BTMrtfEr4wZ1JMAdkvdTRu" type="speaker">
-            Anf. 171 HARRIET COLLIANDER (nyd)
-          </note>
-          <note xml:id="i-XcTCdE4DTpKdUpcW26FPHz">
-            replik:
-          </note>
+          <note xml:id="i-BTMrtfEr4wZ1JMAdkvdTRu" type="speaker">Anf. 171 HARRIET COLLIANDER (nyd) replik:</note>
           <u xml:id="i-9114d807f0cfbecd-1737" who="Q4943258">
             <seg xml:id="i-YD436KECsTaTbDpQoDtoet">
               Herr talman! Det är mycket lyckosamt att tidningarna i stödområdena

corpus/protocols/199192/prot-199192--098.xml

Diff starting from line 6537

@@ -6542,12 +6537,7 @@
               man slippa sådana skandaler som har dykt upp från och till.
             </seg>
           </u>
-          <note xml:id="i-EXQfCA3WBdqQ1VgusaaogS" type="speaker">
-            Anf. 56 KARL-GÖRAN BIÖRSMARK (fp)
-          </note>
-          <note xml:id="i-4taVdqS8XgG5dTNpUaUCPi">
-            replik:
-          </note>
+          <note xml:id="i-EXQfCA3WBdqQ1VgusaaogS" type="speaker">Anf. 56 KARL-GÖRAN BIÖRSMARK (fp) replik:</note>
           <u xml:id="i-2bae394ebbe4570b-803" who="Q5597554" next="i-2bae394ebbe4570b-804">
             <seg xml:id="i-NBQBRLk31py5uW9MQFJsKH">
               Herr talman! Lars Moquist sade att det är svårt att hålla reda

corpus/protocols/199192/prot-199192--123.xml

Diff starting from line 7880

@@ -8030,12 +7880,7 @@
               haltande.
             </seg>
           </u>
-          <note xml:id="i-LvT8qrbyPi3kT5d3MKESYi" type="speaker">
-            Anf. 117 KARL-GÖRAN BIÖRSMARK (fo)
-          </note>
-          <note xml:id="i-QFXuVgyDdijoUkqYW3FjD">
-            replik:
-          </note>
+          <note xml:id="i-LvT8qrbyPi3kT5d3MKESYi" type="speaker">Anf. 117 KARL-GÖRAN BIÖRSMARK (fo) replik:</note>
           <u xml:id="i-f9f6c038953d0dda-938" who="Q5597554" next="i-f9f6c038953d0dda-939">
             <seg xml:id="i-4x7CtYhHuTXsDnio4tNTSu">
               Herr talman! Jag tog fasta på vad Hugo Hegeland sade, nämligen

corpus/protocols/199293/prot-199293--090.xml

Diff starting from line 3928

@@ -4038,12 +3928,7 @@
               hur välfärden skall beräknas i olika delar av världen?
             </seg>
           </u>
-          <note xml:id="i-RemP6Xgijgwxo96K57ML7L" type="speaker">
-            Anf. 63 Miljöminister OLOF JOHANSSON
-          </note>
-          <note xml:id="i-4FeEa2EvkyGUGoA6pgMJN5" type="speaker">
-            (c):
-          </note>
+          <note xml:id="i-RemP6Xgijgwxo96K57ML7L" type="speaker">Anf. 63 Miljöminister OLOF JOHANSSON (c):</note>
           <u xml:id="i-011ad14fc89c76c7-434" who="Q2021126" next="i-011ad14fc89c76c7-435">
             <seg xml:id="i-C1TmH22Dv3YQ227G4HnSLx">
               Fru talman! Lena Klevenås tar upp en viktig principfråga som

corpus/protocols/199394/prot-199394--006.xml

Diff starting from line 17333

@@ -17548,12 +17333,7 @@
               miljarder kronor.
             </seg>
           </u>
-          <note xml:id="i-Ez56duDbJj7s3srarygoWs" type="speaker">
-            Anf. 196 Kulturminister BIRGIT FRIGGEBO
-          </note>
-          <note xml:id="i-G8j4CUpJV3c7GMfURpTYhE" type="speaker">
-            (fp):
-          </note>
+          <note xml:id="i-Ez56duDbJj7s3srarygoWs" type="speaker">Anf. 196 Kulturminister BIRGIT FRIGGEBO (fp):</note>
           <u xml:id="i-e0eb23c37a15b947-2204" who="Q4916267" next="i-e0eb23c37a15b947-2205">
             <seg xml:id="i-Gb27TvhkuMfhcbnrUbnooi">
               Fru talman! Lars Moquist tycker att det är bra att vi skall gå

corpus/protocols/199394/prot-199394--015.xml

Diff starting from line 944

@@ -989,12 +944,7 @@
               som är avgörande vid fördelningen av radiotillstånd?
             </seg>
           </u>
-          <note xml:id="i-9UzdC6P6AC3P96M9y71PyX" type="speaker">
-            Anf. 20 Kulturminister BIRGIT FRIGGEBO
-          </note>
-          <note xml:id="i-NJdm8T6y7Yd5cRGasrarE7" type="speaker">
-            (fp):
-          </note>
+          <note xml:id="i-9UzdC6P6AC3P96M9y71PyX" type="speaker">Anf. 20 Kulturminister BIRGIT FRIGGEBO (fp):</note>
           <u xml:id="i-caaebaa8363cf7ec-100" who="Q4916267" next="i-caaebaa8363cf7ec-101">
             <seg xml:id="i-RZdyCjc9V6i38834hNFKTH">
               Fru talman! Det är inte gratis att starta tidningar i det här

corpus/protocols/199394/prot-199394--016.xml

Diff starting from line 13106

@@ -13196,12 +13106,7 @@
               och drivit fram frågorna riktigt stämmer.
             </seg>
           </u>
-          <note type="speaker" xml:id="i-F8P6zBiEasaLYMgvMD1P6y">
-            Anf. 148 MARIANNE ANDERSSON (c)
-          </note>
-          <note xml:id="i-6bGiTGa5pFkCfUwEBGVoym">
-            replik:
-          </note>
+          <note type="speaker" xml:id="i-F8P6zBiEasaLYMgvMD1P6y">Anf. 148 MARIANNE ANDERSSON (c) replik:</note>
           <u xml:id="i-a8160ee09eee5685-1491" who="Q4935892">
             <seg xml:id="i-B8Ems5eXyHyDLugkS8u32x">
               Herr talman! Jag är beredd att ge regeringen och departementet

corpus/protocols/199394/prot-199394--078.xml

Diff starting from line 16209

@@ -16409,12 +16209,7 @@
               mycket frustrerande.
             </seg>
           </u>
-          <note type="speaker" xml:id="i-8qazm3ZcWDNvMSEm619sLn">
-            Anf. 144 CHARLOTTE BRANTING (fp)
-          </note>
-          <note xml:id="i-4J7Hexfpuvs6PdYWF7D5o9">
-            replik:
-          </note>
+          <note type="speaker" xml:id="i-8qazm3ZcWDNvMSEm619sLn">Anf. 144 CHARLOTTE BRANTING (fp) replik:</note>
           <u xml:id="i-6f3667d3dc405c3f-1805" who="Q4987718" next="i-6f3667d3dc405c3f-1806">
             <seg xml:id="i-WZTTw6J8VtUY1fk6TXy3Kt">
               Herr talman! Jag har tydligen trampat på en öm tå, eftersom Elisabeth

corpus/protocols/199394/prot-199394--081.xml

Diff starting from line 11085

@@ -11160,12 +11085,7 @@
               vi hittar någon sådan.
             </seg>
           </u>
-          <note type="speaker" xml:id="i-WgzY6uzYPyMaXE29rZHhp7">
-            Anf. 118 LENNART BRUNANDER (c)
-          </note>
-          <note xml:id="i-TB86AXMo87mWY75CnPSo24">
-            replik:
-          </note>
+          <note type="speaker" xml:id="i-WgzY6uzYPyMaXE29rZHhp7">Anf. 118 LENNART BRUNANDER (c) replik:</note>
           <u xml:id="i-a982592111b14db1-1312" who="Q5588228" next="i-a982592111b14db1-1313">
             <seg xml:id="i-EyizFKMrDP3E4zMEAiNeiB">
               Herr talman! Inte heller jag tror att vi hittar några forskare

corpus/protocols/199394/prot-199394--082.xml

Diff starting from line 2261

@@ -2276,12 +2261,7 @@
               nytt har litet ymnigare källor att ösa ur.
             </seg>
           </u>
-          <note type="speaker" xml:id="i-E9QRJvqAP5mLrGhvV3SKKH">
-            Anf. 28 KURT OVE JOHANSSON (s)
-          </note>
-          <note xml:id="i-6Jm95GhHNbz7grz9bZxcUq">
-            replik:
-          </note>
+          <note type="speaker" xml:id="i-E9QRJvqAP5mLrGhvV3SKKH">Anf. 28 KURT OVE JOHANSSON (s) replik:</note>
           <u xml:id="i-a4225b718484761a-253" who="Q5886858" next="i-a4225b718484761a-254">
             <seg xml:id="i-C8xt9tJXu2NnZkjafqsQop">
               Fru talman! Visst är budgeten också för oss en helhet, Bertil

corpus/protocols/199394/prot-199394--096.xml

Diff starting from line 3172

@@ -3232,12 +3172,7 @@
           <note xml:id="i-Vr2siKRRSd25zmY89W78tb">
             Bosnien-Hercegovina m.m.
           </note>
-          <note xml:id="i-MPMQiaVPM7DS3Uk44uWUto" type="speaker">
-            Anf. 29 Utrikesminister MARGARETHA AF
-          </note>
-          <note type="speaker" xml:id="i-2mdJ8NXNAkBhPbBzsuFHAU">
-            UGGLAS (m):
-          </note>
+          <note xml:id="i-MPMQiaVPM7DS3Uk44uWUto" type="speaker">Anf. 29 Utrikesminister MARGARETHA AF UGGLAS (m):</note>
           <u xml:id="i-1f76dc07d70584cc-372" who="Q455820" next="i-1f76dc07d70584cc-373">
             <seg xml:id="i-GAt9mePXNfZEvHdMv33EGp">
               Herr talman! Lennart Rohdin har ställt två frågor till mig om

corpus/protocols/199394/prot-199394--116.xml

Diff starting from line 16227

@@ -16332,12 +16227,7 @@
               Margitta Edgren fullt ut i dessa frågor.
             </seg>
           </u>
-          <note type="speaker" xml:id="i-ULVYgsn8zbyowzwm2bqXg1">
-            Anf. 145 MARGITTA EDGREN (fp)
-          </note>
-          <note xml:id="i-6cwYSmmpWdg68h1Y6jhn1Z">
-            replik:
-          </note>
+          <note type="speaker" xml:id="i-ULVYgsn8zbyowzwm2bqXg1">Anf. 145 MARGITTA EDGREN (fp) replik:</note>
           <u xml:id="i-06d6c4968f05fbd5-1809" who="Q4945703" next="i-06d6c4968f05fbd5-1810">
             <seg xml:id="i-7MeU3ESrkqumzv7vbLxiuE">
               Herr talman! När jag förberedde mig för debatten tänkte jag på

corpus/protocols/199495/prot-199495--038.xml

Diff starting from line 1887

@@ -1947,12 +1887,7 @@
               mötes på den punkten?
             </seg>
           </u>
-          <note xml:id="i-PBgGFLGvEoRCLmBCC7jYC4" type="speaker">
-            Anf. 62 Arbetsmarknadsminister ANDERS
-          </note>
-          <note type="speaker" xml:id="i-W9135yKFRyubH877TFKZjh">
-            SUNDSTRÖM (s):
-          </note>
+          <note xml:id="i-PBgGFLGvEoRCLmBCC7jYC4" type="speaker">Anf. 62 Arbetsmarknadsminister ANDERS SUNDSTRÖM (s):</note>
           <u xml:id="i-91ed86f69af0226b-214" who="Q6196549" next="i-91ed86f69af0226b-215">
             <seg xml:id="i-4J67fVwsgCmqwxYA2CGFEG">
               Fru talman! Jag går här inte in på varje enskild kommun och församling.

corpus/protocols/199495/prot-199495--082.xml

Diff starting from line 8639

@@ -8849,12 +8639,7 @@
               olika juridiska företagsformer kan komma i andra hand i det läget.
             </seg>
           </u>
-          <note xml:id="i-FibdzkD13uy7du5di8PKxw" type="speaker">
-            Anf. 146 KARL-GÖSTA SVENSON (m)
-          </note>
-          <note xml:id="i-SZc2njBBMHyfpr6MPYNGgL">
-            replik
-          </note>
+          <note xml:id="i-FibdzkD13uy7du5di8PKxw" type="speaker">Anf. 146 KARL-GÖSTA SVENSON (m) replik</note>
           <u xml:id="i-519b2b123837f0d8-909" who="Q6198732" next="i-519b2b123837f0d8-910">
             <seg xml:id="i-BCpVYv2KFuJcC9zd9ogx3D">
               Fru talman! Det är inget fel att vi har dessa bestämmelser. Det

corpus/protocols/199495/prot-199495--086.xml

Diff starting from line 12363

@@ -12423,12 +12363,7 @@
               är inte tillräckligt.
             </seg>
           </u>
-          <note type="speaker" xml:id="i-P1j3mCCAFVHASq7R7F2vwj">
-            Anf. 114 CATARINA RÖNNUNG (s)
-          </note>
-          <note xml:id="i-UXP1egNf2zpwxN1Xx7Xm6B">
-            replik
-          </note>
+          <note type="speaker" xml:id="i-P1j3mCCAFVHASq7R7F2vwj">Anf. 114 CATARINA RÖNNUNG (s) replik</note>
           <pb facs="http://data.riksdagen.se/fil/60149515-BC2C-4548-9188-990C9CC7CD16#page=118"/>
           <u xml:id="i-ee96e96bb4e3fb3c-1312" who="Q4976340" next="i-ee96e96bb4e3fb3c-1313">
             <seg xml:id="i-SwXANqtV8xT6SZibJVYbnQ">

corpus/protocols/199495/prot-199495--087.xml

Diff starting from line 5653

@@ -5688,12 +5653,7 @@
               därför att denna låsning inte är förnuftig.
             </seg>
           </u>
-          <note type="speaker" xml:id="i-Cc3j4UWDXwJmxce5QH9c9u">
-            Anf. 73 CHARLOTTA L BJÄLKEBRING (v)
-          </note>
-          <note xml:id="i-Tfsy8mpjYqdj6VuSQCLsPD">
-            replik
-          </note>
+          <note type="speaker" xml:id="i-Cc3j4UWDXwJmxce5QH9c9u">Anf. 73 CHARLOTTA L BJÄLKEBRING (v) replik</note>
           <u xml:id="i-5cdec66a6162fa21-599" who="Q4939232" next="i-5cdec66a6162fa21-600">
             <seg xml:id="i-P1p6wGN8uHG81NmDMSdLNh">
               Fru talman! Anders Nilsson gör mig åtminstone litet lycklig,

corpus/protocols/199495/prot-199495--089.xml

Diff starting from line 1706

@@ -1736,12 +1706,7 @@
               inte enbart bör följa utvecklingen utan också aktivt påverka den?
             </seg>
           </u>
-          <note xml:id="i-ACsWQJfMS4cmjt9h1QGVU5" type="speaker">
-            Anf. 37 Justitieminister LAILA
-          </note>
-          <note xml:id="i-T2pQtHaKwQ355hXd8DDLRf" type="speaker">
-            FREIVALDS (s)
-          </note>
+          <note xml:id="i-ACsWQJfMS4cmjt9h1QGVU5" type="speaker">Anf. 37 Justitieminister LAILA FREIVALDS (s)</note>
           <u xml:id="i-802575bc744562cb-143" who="Q291648" next="i-802575bc744562cb-144">
             <seg xml:id="i-B9LQpj5V6NZAs3n4mf48w5">
               Herr talman! Huruvida straffskärpningen haft någon effekt eller

corpus/protocols/199495/prot-199495--102.xml

Diff starting from line 12069

@@ -12119,12 +12069,7 @@
               den.
             </seg>
           </u>
-          <note xml:id="i-YJ2976UjQ7GTfe1REXhsMh" type="speaker">
-            Anf. 126 ANN-KRISTINE
-          </note>
-          <note xml:id="i-KnyMi5yZ8fwvpGoJnv7gd9" type="speaker">
-            JOHANSSON (s)
-          </note>
+          <note xml:id="i-YJ2976UjQ7GTfe1REXhsMh" type="speaker">Anf. 126 ANN-KRISTINE JOHANSSON (s)</note>
           <u xml:id="i-58ad18cb04b50f5c-1223" who="Q3363834" next="i-58ad18cb04b50f5c-1224">
             <seg xml:id="i-VEyV9LeFqwJdu96xGbZMYq">
               Fru talman! Ekologisk odling är viktig, och det har vi debatterat

corpus/protocols/199596/prot-199596--087.xml

Diff starting from line 3317

@@ -3357,12 +3317,7 @@
               äventyrar det svenska beståndet av morkulla.
             </seg>
           </u>
-          <note xml:id="i-WWMJFmTrRzDTvkR4tUfk8o" type="speaker">
-            Anf. 39 Jordbruksminister ANNIKA
-          </note>
-          <note xml:id="i-6iP2CuzzXLjWGe6yd9gU6c" type="speaker">
-            ÅHNBERG (s)
-          </note>
+          <note xml:id="i-WWMJFmTrRzDTvkR4tUfk8o" type="speaker">Anf. 39 Jordbruksminister ANNIKA ÅHNBERG (s)</note>
           <u xml:id="i-dd60a80c3c9bbf93-354" who="Q4991027" next="i-dd60a80c3c9bbf93-355">
             <seg xml:id="i-FtM4vmDGuwBjv6fAJ1VY7z">
               Herr talman! Låt mig först säga att det nog är min uppfattning

corpus/protocols/199697/prot-199697--025.xml

Diff starting from line 3976

@@ -3986,12 +3976,7 @@
               Riksbankens sida inte utnyttjar möjligheterna till förhandslåsningar.
             </seg>
           </u>
-          <note xml:id="i-8e25SkVuZ2fW2wyuSEKV2W" type="speaker">
-            Anf. 30 Finansminister ERIK ÅSBRINK (s)
-          </note>
-          <note xml:id="i-UWs5YRruhptW5vAeYAjDPn">
-            replik:
-          </note>
+          <note xml:id="i-8e25SkVuZ2fW2wyuSEKV2W" type="speaker">Anf. 30 Finansminister ERIK ÅSBRINK (s) replik:</note>
           <u xml:id="i-d2c9627ff2a7facb-464" who="Q5388933" next="i-d2c9627ff2a7facb-465">
             <seg xml:id="i-V1WMue3hw5vsi4Vet9WsUs">
               Herr talman! Jag vill ta upp två saker. För det första är det

corpus/protocols/199899/prot-199899--054.xml

Diff starting from line 2169

@@ -2184,12 +2169,7 @@
               de från fall till fall när det inte finns någon generell princip?
             </seg>
           </u>
-          <note xml:id="i-AXQDcjEbVoftEqtg7J76ds" type="speaker">
-            Anf. 26 Näringsminister BJÖRN ROSEN-
-          </note>
-          <note xml:id="i-X1KCXkZTWvWsThBuqGttM7" type="speaker">
-            GREN (s):
-          </note>
+          <note xml:id="i-AXQDcjEbVoftEqtg7J76ds" type="speaker">Anf. 26 NäringsministerBJÖRN ROSENGREN (s):</note>
           <u xml:id="i-bf20be62fa363da6-233" who="Q3374466" next="i-bf20be62fa363da6-234">
             <seg xml:id="i-GsYj5AcSLcRkWaGm6S2Chp">
               Fru talman! De kriterier som gäller på svensk arbetsmarknad är

corpus/protocols/199899/prot-199899--069.xml

Diff starting from line 5121

@@ -5281,12 +5121,7 @@
               tillräckliga skäl?
             </seg>
           </u>
-          <note xml:id="i-My4rzs6rRALvuuHu35FL71" type="speaker">
-            Anf. 71 Miljöminister KJELL LARS-
-          </note>
-          <note xml:id="i-LVpbgU3hRDKbpdSajWZ8NT" type="speaker">
-            SON (s):
-          </note>
+          <note xml:id="i-My4rzs6rRALvuuHu35FL71" type="speaker">Anf. 71 MiljöministerKJELL LARSSON (s):</note>
           <u xml:id="i-d4df4d3edea5aedb-542" who="Q5937929" next="i-d4df4d3edea5aedb-543">
             <seg xml:id="i-AGT6jHZx6zsWY64YmgtdfD">
               Fru talman! Den ståndpunkt jag har kommit fram till när jag studerat

corpus/protocols/199899/prot-199899--075.xml

Diff starting from line 13857

@@ -13962,12 +13857,7 @@
               Centern lyssnade så att de vet vad de har åstadkommit.
             </seg>
           </u>
-          <note xml:id="i-69dLfb2Qgx3JeWM6QCn6GJ" type="speaker">
-            Anf. 224 NILS-GÖRAN HOLMQVIST (s)
-          </note>
-          <note xml:id="i-GfUkfk8WgTCA8aJuZCnh2x">
-            replik:
-          </note>
+          <note xml:id="i-69dLfb2Qgx3JeWM6QCn6GJ" type="speaker">Anf. 224 NILS-GÖRAN HOLMQVIST (s) replik:</note>
           <u xml:id="i-e9c073b1f18cf769-1459" who="Q5811951" next="i-e9c073b1f18cf769-1460">
             <seg xml:id="i-HSqTdLewWqmGyDXwPJhtiR">
               Fru talman! Jag vet att Ingegerd Saarinens parti,

corpus/protocols/199899/prot-199899--095.xml

Diff starting from line 4663

@@ -4678,12 +4663,7 @@
               redan när den infördes missgynnade vissa tidningar och tidskrifter?
             </seg>
           </u>
-          <note xml:id="i-8kqB39QrUhPnVH2psPcpxc" type="speaker">
-            Anf. 64 Finansminister BOSSE RING-
-          </note>
-          <note xml:id="i-TpgRkqYyeWiRiKKnVYyzMc" type="speaker">
-            HOLM (s):
-          </note>
+          <note xml:id="i-8kqB39QrUhPnVH2psPcpxc" type="speaker">Anf. 64 FinansministerBOSSE RINGHOLM (s):</note>
           <u xml:id="i-ac3dffa08e4ed22f-475" who="Q321595" next="i-ac3dffa08e4ed22f-476">
             <seg xml:id="i-PbNdtQGBC8xorhG49HtafK">
               Fru talman! I mitt svar till Anne-Katrine Dunker säger jag mycket

corpus/protocols/19992000/prot-19992000--051.xml

Diff starting from line 3614

@@ -3684,12 +3614,7 @@
               bränslen. Jag menar att dessa effekter är oacceptabla.
             </seg>
           </u>
-          <note xml:id="i-YQvdoXSskXQwr1ESc2GvN6" type="speaker">
-            Anf. 31 Näringsminister BJÖRN ROSEN-
-          </note>
-          <note xml:id="i-PVaptNkGBoBPiJ8t6MtJNX" type="speaker">
-            GREN (s):
-          </note>
+          <note xml:id="i-YQvdoXSskXQwr1ESc2GvN6" type="speaker">Anf. 31 NäringsministerBJÖRN ROSENGREN (s):</note>
           <u xml:id="i-458ede90aff58bd5-352" who="Q3374466">
             <seg xml:id="i-Ro3uZJ1XZvCsDH4THxyVv8">
               Fru talman! Med den ytterligare avreglering som nu sker på elmarknaden

corpus/protocols/19992000/prot-19992000--088.xml

Diff starting from line 5489

@@ -5494,13 +5489,8 @@
               försvarsbeslut.
             </seg>
           </u>
-          <note xml:id="i-61jBS68whHxoFn8FWfxnLb" type="speaker">
-            Anf. 46 Försvarsminister BJÖRN VON SY-
-          </note>
+          <note xml:id="i-61jBS68whHxoFn8FWfxnLb" type="speaker">Anf. 46 FörsvarsministerBJÖRN VON SYDOW (s):</note>
           <pb facs="http://data.riksdagen.se/fil/494C2D44-3D3E-4B13-8E20-D3A3FAEBBA9F#page=45"/>
-          <note xml:id="i-LWrWxsQaDhSHfNeFFS8cXx" type="speaker">
-            DOW (s):
-          </note>
           <pb facs="http://data.riksdagen.se/fil/494C2D44-3D3E-4B13-8E20-D3A3FAEBBA9F#page=49"/>
           <u xml:id="i-08a689f0d5ef8424-516" who="Q879773" next="i-08a689f0d5ef8424-517">
             <seg xml:id="i-M3HaDJTS6k7dxepqipE6tN">

corpus/protocols/19992000/prot-19992000--089.xml

Diff starting from line 1625

@@ -1635,13 +1625,8 @@
               utvecklingen och för konsumenterna.
             </seg>
           </u>
-          <note xml:id="i-VJ2gx2HsTXRzd8hAAPnzVT" type="speaker">
-            Anf. 5 Näringsminister BJÖRN ROSEN-
-          </note>
+          <note xml:id="i-VJ2gx2HsTXRzd8hAAPnzVT" type="speaker">Anf. 5 NäringsministerBJÖRN ROSENGREN (s):</note>
           <pb facs="http://data.riksdagen.se/fil/FB7F12A3-D3BB-4432-804A-C0B158E651AF#page=8"/>
-          <note xml:id="i-KbACHcsqY3x6yoaYNWKQxq" type="speaker">
-            GREN (s):
-          </note>
           <pb facs="http://data.riksdagen.se/fil/FB7F12A3-D3BB-4432-804A-C0B158E651AF#page=13"/>
           <u xml:id="i-a3c5275df305d1c6-95" who="Q3374466" next="i-a3c5275df305d1c6-96">
             <seg xml:id="i-Tp4xAa6wJXSXLodtQBi9L3">

corpus/protocols/200001/prot-200001--002.xml

Diff starting from line 192

@@ -192,12 +192,7 @@
           <note xml:id="i-WVMCtS42DAjCPDaHS9HYkp">
             2 § Regeringsförklaring
           </note>
-          <note xml:id="i-ByKSfXguqgHWLS2qv2uyyw" type="speaker">
-            Anf. 3 Statsminister GÖRAN PERS-
-          </note>
-          <note xml:id="i-N2pHmjJhffTnaR49Y4Lhf8" type="speaker">
-            SON (s):
-          </note>
+          <note xml:id="i-ByKSfXguqgHWLS2qv2uyyw" type="speaker">Anf. 3 StatsministerGÖRAN PERSSON (s):</note>
           <note xml:id="i-2etJjW3DW82LEDcoLpSuRe">
             Eders Majestäter, Eders Kungliga Högheter, fru talman, ledamöter
             av Sveriges riksdag!

corpus/protocols/200001/prot-200001--006.xml

Diff starting from line 501

@@ -501,12 +501,7 @@
           <note xml:id="i-Wkvb1KBpRmU6qXmrbsLeEb">
             4 § Svar på interpellation 2000/01:10 om byggkostnader
           </note>
-          <note xml:id="i-WHRoKTuSbXBKgMt4jkcacK" type="speaker">
-            Anf. 1 Finansminister BOSSE RING-
-          </note>
-          <note xml:id="i-7LCXS7EJTrZ31fxVXD3Prg" type="speaker">
-            HOLM (s):
-          </note>
+          <note xml:id="i-WHRoKTuSbXBKgMt4jkcacK" type="speaker">Anf. 1 FinansministerBOSSE RINGHOLM (s):</note>
           <u xml:id="i-b3ac0943a4daadb8-19" who="Q321595" next="i-K4qPx7Cb4wfroVsd6Uo16i">
             <seg xml:id="i-KHjWw3XsLEMdqJTHT3dwke">
               Fru talman! Ulla-Britt Hagström har frågat statsrådet Lars-Erik

corpus/protocols/200001/prot-200001--045.xml

Diff starting from line 12987

@@ -13002,12 +12987,7 @@
               som barnen ges möjlighet till stimulerande pedagogisk verksamhet.
             </seg>
           </u>
-          <note xml:id="i-P6nUkeunxJLnjATY7GbyEK" type="speaker">
-            Anf. 139 ULLA-BRITT HAGSTRÖM (kd)
-          </note>
-          <note xml:id="i-7oNWThnvdVmVLw758e51GM">
-            replik:
-          </note>
+          <note xml:id="i-P6nUkeunxJLnjATY7GbyEK" type="speaker">Anf. 139 ULLA-BRITT HAGSTRÖM (kd) replik:</note>
           <u xml:id="i-2f7ccd34de4c2d98-1302" who="Q4952016" next="i-2f7ccd34de4c2d98-1303">
             <seg xml:id="i-MqRAy3w314F7imTjqjUNrt">
               Fru talman! Torgny Danielsson kan ändå inte komma ifrån att regeringen

corpus/protocols/200001/prot-200001--074.xml

Diff starting from line 2382

@@ -2442,12 +2382,7 @@
               åtgärder?
             </seg>
           </u>
-          <note xml:id="i-NvS8ZRePFKvkPPwJaNhwoZ" type="speaker">
-            Anf. 29 Statsrådet MAJ-INGER KLING-
-          </note>
-          <note xml:id="i-TsAVWNEZSS8ZBVQwb6fexn" type="speaker">
-            VALL (s):
-          </note>
+          <note xml:id="i-NvS8ZRePFKvkPPwJaNhwoZ" type="speaker">Anf. 29 StatsrådetMAJ-INGER KLINGVALL (s):</note>
           <u xml:id="i-3d21e6756d249933-252" who="Q4959140" next="i-3d21e6756d249933-253">
             <seg xml:id="i-L96EPtuNF1G7Qg3mvVEJ6j">
               Fru talman! De åtgärder som jag tänker på är naturligtvis - lite

corpus/protocols/200102/prot-200102--027.xml

Diff starting from line 3753

@@ -3778,12 +3753,7 @@
               i dag. Jag vet inte om det finns någon som vill svara på den frågan.
             </seg>
           </u>
-          <note xml:id="i-7rR6ZsdeM8Fb2jstG7c3PS" type="speaker">
-            Anf. 63 Kulturminister MARITA ULV-
-          </note>
-          <note xml:id="i-VrTHNkLme7dW2rqGxLYMB6" type="speaker">
-            SKOG (s):
-          </note>
+          <note xml:id="i-7rR6ZsdeM8Fb2jstG7c3PS" type="speaker">Anf. 63 KulturministerMARITA ULVSKOG (s):</note>
           <u xml:id="i-ad407083a2ccaf6c-404" who="Q3115681" next="i-ad407083a2ccaf6c-405">
             <seg xml:id="i-5BYtqoMfMQTd1ftUoPJYhr">
               Fru talman! Än så länge hanterar vi inte vapenfrågor och vapenembargofrågor

corpus/protocols/200102/prot-200102--037.xml

Diff starting from line 8021

@@ -8131,12 +8021,7 @@
               kontrollera det.
             </seg>
           </u>
-          <note xml:id="i-3VBLCjL97wpP5m9MeY2T8o" type="speaker">
-            Anf. 138 Vice statsminister LENA HJELM-
-          </note>
-          <note type="speaker" xml:id="i-HjAcMmM4zZudws9aF3GpHb">
-            WALLÉN (s):
-          </note>
+          <note xml:id="i-3VBLCjL97wpP5m9MeY2T8o" type="speaker">Anf. 138 Vice statsministerLENA HJELM-WALLÉN (s):</note>
           <u xml:id="i-baff156abac29f68-927" who="unknown" next="i-baff156abac29f68-928">
             <seg xml:id="i-TpST3UUjDwpvy26ZzW6rjH">
               Herr talman! Jag började med att tala om arbetsrättsliga frågor,

corpus/protocols/200102/prot-200102--045.xml

Diff starting from line 10101

@@ -10211,12 +10101,7 @@
               att kunna hantera det som verkligheten ser ut i dag för bolaget.
             </seg>
           </u>
-          <note xml:id="i-V5EUjQVs4CA1bqBB2GVRx3" type="speaker">
-            Anf. 111 NILS-GÖRAN HOLMQVIST (s)
-          </note>
-          <note xml:id="i-QQGWNfTNLygr79dzLsN88n">
-            replik:
-          </note>
+          <note xml:id="i-V5EUjQVs4CA1bqBB2GVRx3" type="speaker">Anf. 111 NILS-GÖRAN HOLMQVIST (s) replik:</note>
           <u xml:id="i-43b2f9ab71baab5b-1067" who="Q5811951" next="i-43b2f9ab71baab5b-1068">
             <seg xml:id="i-PpkKjr94gXkciDj6XpPw4s">
               Fru talman! När det gäller skattepolitiken har jag ett antal

corpus/protocols/200102/prot-200102--078.xml

Diff starting from line 10215

@@ -10405,12 +10215,7 @@
               brottsligheten, hallicken som tjänar pengar på dem och ur narkotikaträsket.
             </seg>
           </u>
-          <note xml:id="i-95TBsCZhNU84bcVKckuuaM" type="speaker">
-            Anf. 145 MORGAN JOHANSSON (s)
-          </note>
-          <note xml:id="i-U9ifDyn3ZmWsosdRyY5KeZ">
-            replik:
-          </note>
+          <note xml:id="i-95TBsCZhNU84bcVKckuuaM" type="speaker">Anf. 145 MORGAN JOHANSSON (s) replik:</note>
           <u xml:id="i-b82f32ce4c94a5c5-1082" who="Q5887217" next="i-b82f32ce4c94a5c5-1083">
             <seg xml:id="i-5h5yYWHuti39TnRFKNYhkE">
               Fru talman! Ingemar Vänerlöv har ett problem här. Det han föreslår

corpus/protocols/200102/prot-200102--111.xml

Diff starting from line 4398

@@ -4478,12 +4398,7 @@
               i Luleå i slutet av juni.
             </seg>
           </u>
-          <note xml:id="i-Gh9AJhZbjp8K7pJNod6JNC" type="speaker">
-            Anf. 68 Statsminister GÖRAN PERSSON
-          </note>
-          <note xml:id="i-DYe7rqnhgxeGNf9ido9MLD" type="speaker">
-            (s):
-          </note>
+          <note xml:id="i-Gh9AJhZbjp8K7pJNod6JNC" type="speaker">Anf. 68 Statsminister GÖRAN PERSSON (s):</note>
           <pb facs="http://data.riksdagen.se/fil/9A809D77-24EA-4AF3-811A-2AB3B1AA6535#page=42"/>
           <u xml:id="i-eb5be07a694731fa-504" who="Q53747" next="i-eb5be07a694731fa-505">
             <seg xml:id="i-7iti1mhsgSvahGHyQtZsy4">

corpus/protocols/200102/prot-200102--114.xml

Diff starting from line 3431

@@ -3466,12 +3431,7 @@
           <note xml:id="i-EL43h56qfVhZWCDgVyje8r">
             8 § Svar på interpellation 2001/02:427 om barn i fattiga familjer
           </note>
-          <note xml:id="i-V3wHC1ChtQiKGPLGdtMin4" type="speaker">
-            Anf. 42 Socialminister LARS ENGQVIST
-          </note>
-          <note xml:id="i-87gvMvS1UobcDMU7e8pfAo" type="speaker">
-            (s):
-          </note>
+          <note xml:id="i-V3wHC1ChtQiKGPLGdtMin4" type="speaker">Anf. 42 Socialminister LARS ENGQVIST (s):</note>
           <u xml:id="i-1862d41dd875f3f7-378" who="Q5854875" next="i-1862d41dd875f3f7-379">
             <seg xml:id="i-Gn7wQr8tvspFwAWTaqP4Fo">
               Fru talman! Lars Elinderson har frågat mig vad jag avser att

Diff starting from line 5029

@@ -5114,12 +5029,7 @@
               och friskvård som en samhällets uppgift på samma sätt som sjukvården?
             </seg>
           </u>
-          <note xml:id="i-4A1Dy4HBiU77PG1JJNeWpg" type="speaker">
-            Anf. 60 Socialminister LARS ENGQVIST
-          </note>
-          <note xml:id="i-4ciVdbr4bw9LiKYGLzCibB" type="speaker">
-            (s):
-          </note>
+          <note xml:id="i-4A1Dy4HBiU77PG1JJNeWpg" type="speaker">Anf. 60 Socialminister LARS ENGQVIST (s):</note>
           <u xml:id="i-1862d41dd875f3f7-560" who="Q5854875" next="i-1862d41dd875f3f7-561">
             <seg xml:id="i-WreLwGu2cq6yyV7BscU136">
               Fru talman! Jag berättade i mitt interpellationssvar att regeringen

corpus/protocols/200102/prot-200102--116.xml

Diff starting from line 3870

@@ -3915,12 +3870,7 @@
               kulturpolitiken.
             </seg>
           </u>
-          <note xml:id="i-TY4HMaq9FEN8GXEJDsyve" type="speaker">
-            Anf. 48 Kulturminister MARITA ULVSKOG
-          </note>
-          <note xml:id="i-FMUTP5qjQ5Et2sUh5hN2KZ" type="speaker">
-            (s):
-          </note>
+          <note xml:id="i-TY4HMaq9FEN8GXEJDsyve" type="speaker">Anf. 48 Kulturminister MARITA ULVSKOG (s):</note>
           <u xml:id="i-7f8043f4552fbf3d-396" who="Q3115681" next="i-7f8043f4552fbf3d-397">
             <seg xml:id="i-XX5V8qCW3KuQA8oDVZrBnB">
               Fru talman! Kulturpolitiken är nu gudskelov inte bara hänvisad

corpus/protocols/200203/prot-200203--050.xml

Diff starting from line 7442

@@ -7512,12 +7442,7 @@
             10 § Svar på interpellation 2002/03:132 om regler rörande kyrkligt
             ägd fast egendom
           </note>
-          <note xml:id="i-RKqVww7fv58er1DrxhSVGV" type="speaker">
-            Anf. 84 Justitieminister THOMAS BOD-
-          </note>
-          <note type="speaker" xml:id="i-WEmjYhbZKJECcLm9YWo6Rw">
-            STRÖM (s):
-          </note>
+          <note xml:id="i-RKqVww7fv58er1DrxhSVGV" type="speaker">Anf. 84 JustitieministerTHOMAS BODSTRÖM (s):</note>
           <u xml:id="i-fe4578ede1afc266-833" who="Q3141607" next="i-9GnQoj7bFzRdumxzYEbxyq">
             <seg xml:id="i-LtchecQff3iLACLjcYSnDo">
               Fru talman! Lena Ek har frågat mig vilka åtgärder jag avser att

corpus/protocols/200203/prot-200203--061.xml

Diff starting from line 8295

@@ -8405,12 +8295,7 @@
               ingenting har hänt. Någonting har hänt.
             </seg>
           </u>
-          <note xml:id="i-E2hfXMKbC4vfXf2Wf5cdws" type="speaker">
-            Anf. 122 CHRISTINA AXELSSON (s)
-          </note>
-          <note xml:id="i-8azk4z5FEAP2aAM1TaN5FN">
-            replik:
-          </note>
+          <note xml:id="i-E2hfXMKbC4vfXf2Wf5cdws" type="speaker">Anf. 122 CHRISTINA AXELSSON (s) replik:</note>
           <u xml:id="i-0d9b120e61664057-880" who="Q4937060" next="i-0d9b120e61664057-881">
             <seg xml:id="i-2LgxVha1NcUrGnBArZskZ7">
               Fru talman! Jag pratade personligen med SSI så sent som i går

corpus/protocols/200203/prot-200203--092.xml

Diff starting from line 2241

@@ -2246,12 +2241,7 @@
               nu.
             </seg>
           </u>
-          <note xml:id="i-G99wk9RyzASaGWz5384daG" type="speaker">
-            Anf. 22 Justitieminister THOMAS BOD-
-          </note>
-          <note type="speaker" xml:id="i-QucPeWCs63Q5pRPeaf2FoX">
-            STRÖM (s) replik:
-          </note>
+          <note xml:id="i-G99wk9RyzASaGWz5384daG" type="speaker">Anf. 22 JustitieministerTHOMAS BODSTRÖM (s) replik:</note>
           <u xml:id="i-0fc8b58bffe3b560-144" who="Q3141607">
             <seg xml:id="i-5wJXrFUK6Y1wNXoc7saVBa">
               Herr talman! Det var roligt att Beatrice Ask vaknade till. Man

corpus/protocols/200203/prot-200203--112.xml

Diff starting from line 7185

@@ -7240,12 +7185,7 @@
               göra när det gäller detta viktiga område, äldre och undernäring?
             </seg>
           </u>
-          <note xml:id="i-TxxAduEjQXG2cCCpZPDJb4" type="speaker">
-            Anf. 82 Statsrådet MORGAN JOHANS-
-          </note>
-          <note xml:id="i-SdkLsbHTdQ1jx8TpgsUn5w" type="speaker">
-            SON (s):
-          </note>
+          <note xml:id="i-TxxAduEjQXG2cCCpZPDJb4" type="speaker">Anf. 82 StatsrådetMORGAN JOHANSSON (s):</note>
           <u xml:id="i-f2637128384ec1ce-761" who="Q5887217" next="i-f2637128384ec1ce-762">
             <seg xml:id="i-Xe7mPx4fnfMECQCBA6YQbg">
               Fru talman! Jag tycker kanske ändå att Cristina
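
The merges sampled above each fold a speaker introduction that had been split across two `<note>` elements back into the first one. A minimal sketch of that kind of join follows, using only the standard library. It is not taken from the repository's script: the function name `merge_split_intro`, the `drop_hyphen` flag, and the `<div>` wrapper are hypothetical. Whether a trailing hyphen is a line-break hyphen (RING- / HOLM) or part of the name (HJELM- / WALLÉN) cannot be decided from the text alone, so the sketch leaves that choice to the caller.

```python
import xml.etree.ElementTree as ET


def merge_split_intro(parent, first, second, drop_hyphen=False):
    """Fold the text of `second` into `first` and remove `second` from `parent`.

    Hypothetical helper for illustration; `drop_hyphen` is an assumed flag.
    """
    # Normalise whitespace in both fragments.
    head = " ".join((first.text or "").split())
    tail = " ".join((second.text or "").split())
    if head.endswith("-"):
        # The first fragment broke at a hyphen: join without a space,
        # and drop the hyphen only if the caller says it was a line-break hyphen.
        if drop_hyphen:
            head = head[:-1]
        merged = head + tail
    else:
        # Ordinary split between words: join with a single space.
        merged = head + " " + tail
    first.text = merged
    parent.remove(second)
    return first


# Demo on one of the sampled introductions (the <div> wrapper is an assumption).
div = ET.fromstring(
    "<div>"
    '<note xml:id="i-69RghLHY26jCXwhVxegDm4" type="speaker">Anf. 28 Finansminister BOSSE RING-</note>'
    '<note xml:id="i-PaferMJif9P9DpxvfNKefd" type="speaker">HOLM (s):</note>'
    "</div>"
)
first, second = list(div)
merge_split_intro(div, first, second, drop_hyphen=True)
print(first.text)
```

The surviving note keeps the `xml:id` of the first fragment, which matches the sampled diffs; the second note and its id are discarded.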

BobBorges commented 8 months ago

The mp unit test fails because this branch has new metadata, but redetect edits only 50 files. The failure is expected and does not block merging into the query-metadata branch.

BobBorges commented 8 months ago

The classify_join_intros.py script generates some files: some are already tracked, and some are newly added because of my edits to the script. Do we want to track such files? I wouldn't track them (they are generated by the script). If not, I think we should remove those that are tracked and ignore all of them.

ninpnin commented 8 months ago

@BobBorges I agree, I think we don't want to track such files.

Also, the merged intros look fine, but the text should be formatted on a separate line, like so:

<note xml:id="i-TxxAduEjQXG2cCCpZPDJb4" type="speaker">
    Anf. 82 StatsrådetMORGAN JOHANSSON (s):
</note>

There should be a script for doing this.

ninpnin commented 8 months ago

This Python snippet should format a protocol so that the text is on a separate line:

from pyparlaclarin.refine import format_texts
# [...]
root = format_texts(root)
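
To illustrate what "text on a separate line" means for a single element, here is a minimal standard-library sketch. It is not the implementation of `format_texts` from pyparlaclarin, whose internals are not shown in this thread; the function name `text_on_own_line` and the two-space indentation step are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def text_on_own_line(elem, depth=0, step="  "):
    """Put the text of every leaf element on its own indented line.

    Hypothetical stand-in for illustration only.
    """
    for child in elem:
        text_on_own_line(child, depth + 1, step)
    if len(elem) == 0 and elem.text and elem.text.strip():
        # Collapse internal whitespace, then surround the text with
        # a newline plus indentation before and after.
        body = " ".join(elem.text.split())
        elem.text = "\n" + step * (depth + 1) + body + "\n" + step * depth
    return elem


note = ET.fromstring(
    '<note type="speaker">Anf. 68 Statsminister GÖRAN PERSSON (s):</note>'
)
formatted = ET.tostring(text_on_own_line(note), encoding="unicode")
print(formatted)
```

Only leaf elements are touched, so container elements such as `<u>` keep their existing layout while the `<note>` and `<seg>` text moves onto its own line.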

BobBorges commented 8 months ago

Sampled changes

corpus/protocols/199091/prot-199091--005.xml

Diff starting from line 3182

@@ -3251,10 +3182,7 @@
             </seg>
           </u>
           <note xml:id="i-MCos8uDQAyo7qjV5gACLbx" type="speaker">
-            Anf. 69 Civilminister BENGT K Å
-          </note>
-          <note xml:id="i-6jTckTotpCcTPS79gGGipz" type="speaker">
-            JOHANSSON:
+            Anf. 69 Civilminister BENGT K Å JOHANSSON:
           </note>
           <u xml:id="i-dd79e44da648f34f-369" who="Q5885510" next="i-dd79e44da648f34f-370">
             <seg xml:id="i-XioitYEThqFciCJ7iX1zHo">

corpus/protocols/199091/prot-199091--038.xml

Diff starting from line 967

@@ -982,10 +967,7 @@
             </seg>
           </u>
           <note xml:id="i-3ThXKDSop3Psjem5414tXE" type="speaker">
-            Anf. 25 Statsrådet LENA HJELM-WALLÉN
-          </note>
-          <note xml:id="i-RXwaK17W3WUQ2MQS1qGQXn" type="speaker">
-            (s):
+            Anf. 25 Statsrådet LENA HJELM-WALLÉN (s):
           </note>
           <u xml:id="i-bdb40db96dd6947e-95" who="Q460919" next="i-bdb40db96dd6947e-96">
             <seg xml:id="i-17tzrGamDxMgAPXrRdcVKT">

corpus/protocols/199091/prot-199091--046.xml

Diff starting from line 8773

@@ -8785,10 +8773,7 @@
             </seg>
           </u>
           <note type="speaker" xml:id="i-LcyZE6KxtjBGpGu2VZBMX5">
-            Anf. 54 MARGÓ INGVARDSSON (v)
-          </note>
-          <note xml:id="i-JtkJeRmFkLdVJTZSSMEpFS">
-            replik:
+            Anf. 54 MARGÓ INGVARDSSON (v) replik:
           </note>
           <u xml:id="i-bfa1343abd46653c-998" who="Q4955783" next="i-bfa1343abd46653c-999">
             <seg xml:id="i-E8WDaMP4Th7qNJJuWf6JYV">

corpus/protocols/199091/prot-199091--049.xml

Diff starting from line 3030

@@ -3102,10 +3030,7 @@
             </seg>
           </u>
           <note xml:id="i-3Kvobihh1Mzh1YgojLGmjp" type="speaker">
-            Anf. 68 Socialminister INGELA THALÉN
-          </note>
-          <note xml:id="i-H99iic5n2pKW97VbdgyxXE" type="speaker">
-            (s):
+            Anf. 68 Socialminister INGELA THALÉN (s):
           </note>
           <u xml:id="i-8de83981ccaa5eee-347" who="unknown">
             <seg xml:id="i-D1cwDAxBeZsv1P9o5fAP6K">

corpus/protocols/199091/prot-199091--061.xml

Diff starting from line 4700

@@ -4838,10 +4700,7 @@
             </seg>
           </u>
           <note xml:id="i-TkkdAEQKwEcGkDC6zBoUGB" type="speaker">
-            Anf. 123 Kommunikationsminister GEORG
-          </note>
-          <note type="speaker" xml:id="i-4qqRFAb1R2AKikDzEoS5Tg">
-            ANDERSSON (s):
+            Anf. 123 Kommunikationsminister GEORG ANDERSSON (s):
           </note>
           <u xml:id="i-bc46c100e6649aa9-524" who="Q5554537" next="i-H2UVm2yCsbFCTWJM9v8WCm">
             <seg xml:id="i-StJBeuiPjY5rcs3MwXDurz">

corpus/protocols/199091/prot-199091--082.xml

Diff starting from line 5243

@@ -5252,10 +5243,7 @@
             </seg>
           </u>
           <note type="speaker" xml:id="i-UDLSSsWcdbKtqtnK8ssUgs">
-            Anf. 38 INGRID HEMMINGSSON (m)
-          </note>
-          <note xml:id="i-7YBVSRL59ymVYkGFKCvoNq">
-            replik:
+            Anf. 38 INGRID HEMMINGSSON (m) replik:
           </note>
           <u xml:id="i-616ea1ec335247e8-583" who="Q4953552" next="i-616ea1ec335247e8-584">
             <seg xml:id="i-2GxjSyXUqJufCcpcbu2bQ2">

corpus/protocols/199192/prot-199192--025.xml

Diff starting from line 5489

@@ -5528,10 +5489,7 @@
             </seg>
           </u>
           <note xml:id="i-MzyJtqxxUyk9bBP53TjH68" type="speaker">
-            Anf. 74 Statsrådet REIDUNN LAURÉN
-          </note>
-          <note xml:id="i-8bPuJCyYf2VwABPT2NBMWG">
-            (--):
+            Anf. 74 Statsrådet REIDUNN LAURÉN (--):
           </note>
           <u xml:id="i-ce61a3fdfd51a9f4-652" who="Q4961261" next="i-ce61a3fdfd51a9f4-653">
             <seg xml:id="i-KXxUMShspaMGrbaFEfpH6m">

corpus/protocols/199192/prot-199192--027.xml

Diff starting from line 2776

@@ -2809,10 +2776,7 @@
             16 § Svar på fråga 1991/92:100 om pensionsåldern för yrkesofficerare
           </note>
           <note xml:id="i-Q88dauuuMz1Kc8TVthnVbz" type="speaker">
-            Anf. 72 Försvarsminister ANDERS BJÖRCK
-          </note>
-          <note xml:id="i-Mdf1Gt4uixZn4TXXUVztFX" type="speaker">
-            (m):
+            Anf. 72 Försvarsminister ANDERS BJÖRCK (m):
           </note>
           <u xml:id="i-642e2532fc5a2f10-298" who="Q490785" next="i-642e2532fc5a2f10-299">
             <seg xml:id="i-8TbW9itq7UTYzC57VF37zf">

corpus/protocols/199192/prot-199192--037.xml

Diff starting from line 4246

@@ -4408,10 +4246,7 @@
             </seg>
           </u>
           <note xml:id="i-83wjur3nauMuEEMNDpZjeT" type="speaker">
-            Anf. 112 Arbetsmarknadsminister BÖRJE
-          </note>
-          <note type="speaker" xml:id="i-L4XQ9U7BJNNJ6qXtMYua7A">
-            HÖRNLUND (c):
+            Anf. 112 Arbetsmarknadsminister BÖRJE HÖRNLUND (c):
           </note>
           <u xml:id="i-02adb08626132d2e-492" who="Q5005171" next="i-02adb08626132d2e-493">
             <seg xml:id="i-NsxhzkcPKdNtGC56d1Wjup">

corpus/protocols/199192/prot-199192--044.xml

Diff starting from line 11196

@@ -11313,10 +11196,7 @@
             </seg>
           </u>
           <note xml:id="i-Ajvq3dpb2XesEfH3GcxYEj" type="speaker">
-            Anf. 178 HOLGER GUSTAFSSON (kds)
-          </note>
-          <note xml:id="i-3zUHK7Vqwxuhb7yXrZJ7gC">
-            replik:
+            Anf. 178 HOLGER GUSTAFSSON (kds) replik:
           </note>
           <u xml:id="i-c526af2de47342e8-1265" who="Q5777750" next="i-c526af2de47342e8-1266">
             <seg xml:id="i-SitaSipX5aaC5wExR4Q3iR">

corpus/protocols/199192/prot-199192--049.xml

Diff starting from line 7118

@@ -7142,10 +7118,7 @@
             </seg>
           </u>
           <note xml:id="i-KCK4r7RLkzjUjtZwCZbCWS" type="speaker">
-            Anf. 68 Kulturminister BIRGIT FRIGGEBO
-          </note>
-          <note xml:id="i-3igg25GLary2XwBGMvrnHe" type="speaker">
-            (fp):
+            Anf. 68 Kulturminister BIRGIT FRIGGEBO (fp):
           </note>
           <u xml:id="i-cdb56f17e800ec07-822" who="Q4916267" next="i-cdb56f17e800ec07-823">
             <seg xml:id="i-EFzDnp4YtK648VEMzUK8A1">

corpus/protocols/199192/prot-199192--054.xml

Diff starting from line 2312

@@ -2333,10 +2312,7 @@
             </seg>
           </u>
           <note xml:id="i-Ag5J8svN5Mq2MvxWLcCD3U" type="speaker">
-            Anf. 42 Arbetsmarknadsminister BÖRJE
-          </note>
-          <note type="speaker" xml:id="i-VGVQR2634A7tDLePotJkYc">
-            HÖRNLUND (c):
+            Anf. 42 Arbetsmarknadsminister BÖRJE HÖRNLUND (c):
           </note>
           <u xml:id="i-6850a760cde7e2fe-247" who="Q5005171">
             <seg xml:id="i-Q3sAVbrvyfh8uc9QDwL9v8">

corpus/protocols/199293/prot-199293--025.xml

Diff starting from line 11932

@@ -12286,10 +11932,7 @@
             </seg>
           </u>
           <note xml:id="i-Xxej9XxrnvvANvt68xWvs4" type="speaker">
-            Anf. 325 Försvarsminister ANDERS BJÖRCK
-          </note>
-          <note xml:id="i-VgNQB9ANbP8s4z2SRjoqXm" type="speaker">
-            (m):
+            Anf. 325 Försvarsminister ANDERS BJÖRCK (m):
           </note>
           <u xml:id="i-491450da720c9f9c-1343" who="Q490785" next="i-491450da720c9f9c-1344">
             <seg xml:id="i-M5PjFKJ2wo7Bnj3X44S5g5">

corpus/protocols/199293/prot-199293--035.xml

Diff starting from line 5416

@@ -5497,10 +5416,7 @@
             </seg>
           </u>
           <note xml:id="i-Vkkq3Y94RCtNiP9CGYxyVB" type="speaker">
-            Anf. 124 Statsrådet REIDUNN LAURÉN (-
-          </note>
-          <note xml:id="i-DgWRUYysnMXQoMZKgVkrY4">
-            ):
+            Anf. 124 Statsrådet REIDUNN LAURÉN (-):
           </note>
           <u xml:id="i-7ebf2647590d4903-614" who="Q4961261" next="i-7ebf2647590d4903-615">
             <seg xml:id="i-KVhjZKx2YVmbK77zBTuXfN">

corpus/protocols/199293/prot-199293--096.xml

Diff starting from line 11824

@@ -11872,10 +11824,7 @@
             </seg>
           </u>
           <note type="speaker" xml:id="i-Cf6rbW3JRNY9XgSWyKxZ6j">
-            Anf. 126 CHRISTEL ANDERBERG (m)
-          </note>
-          <note xml:id="i-QxnoN5pVUoZBdW9vUJNnQc">
-            replik:
+            Anf. 126 CHRISTEL ANDERBERG (m) replik:
           </note>
           <u xml:id="i-7803e5b350c80f73-1385" who="Q4935587" next="i-7803e5b350c80f73-1386">
             <seg xml:id="i-D4nrHu9UPv7c411EoKzaZq">

corpus/protocols/199293/prot-199293--115.xml

Diff starting from line 2459

@@ -2501,10 +2459,7 @@
             </seg>
           </u>
           <note xml:id="i-XRsQPriSu5b44sJ65Curxe" type="speaker">
-            Anf. 57 Statsrådet REIDUNN LAURÉN
-          </note>
-          <note xml:id="i-QSkDJMx2Rc2dNbHHdau2BX">
-            (-):
+            Anf. 57 Statsrådet REIDUNN LAURÉN (-):
           </note>
           <u xml:id="i-cd9d93bbf176642f-264" who="Q4961261" next="i-cd9d93bbf176642f-265">
             <seg xml:id="i-GbbquBRKhDiSwwc4cVVr6Q">

corpus/protocols/199394/prot-199394--024.xml

Diff starting from line 2907

@@ -2991,10 +2907,7 @@
             14 § Svar på fråga 1993/94:131 om de funktionshindrade och lönebidragen
           </note>
           <note xml:id="i-7wWFM7qbXVqh8VmWpsP14B" type="speaker">
-            Anf. 60 Arbetsmarknadsminister BÖRJE
-          </note>
-          <note type="speaker" xml:id="i-Kzckj3kktb7227LwdJSyxc">
-            HÖRNLUND (c):
+            Anf. 60 Arbetsmarknadsminister BÖRJE HÖRNLUND (c):
           </note>
           <u xml:id="i-464604d7a9edea4d-333" who="Q5005171" next="i-464604d7a9edea4d-334">
             <seg xml:id="i-PoZP9RAWBo6L3rqJKfzTqR">

Diff starting from line 444

@@ -453,10 +444,7 @@
             Turkiet m.m.
           </note>
           <note xml:id="i-D8sShZrzYB8bcDb3JSxEqD" type="speaker">
-            Anf. 7 Utrikesminister MARGARETHA AF
-          </note>
-          <note type="speaker" xml:id="i-Q7ZgoeqtN7zTbaaPFQncGA">
-            UGGLAS (m):
+            Anf. 7 Utrikesminister MARGARETHA AF UGGLAS (m):
           </note>
           <u xml:id="i-464604d7a9edea4d-41" who="Q455820" next="i-464604d7a9edea4d-42">
             <seg xml:id="i-Dbx9fKvZcdNEMnnWt4Z3cM">

corpus/protocols/199394/prot-199394--046.xml

Diff starting from line 1859

@@ -1874,10 +1859,7 @@
             (Applåder)
           </note>
           <note xml:id="i-L7N8xSPXUY2KqM9ksuN4KH" type="speaker">
-            Anf. 16 IAN WACHTMEISTER (nyd)
-          </note>
-          <note xml:id="i-HwsnrNhPKdF1yjdUe4K2J3">
-            replik:
+            Anf. 16 IAN WACHTMEISTER (nyd) replik:
           </note>
           <u xml:id="i-1b935445cfa07fb5-237" who="Q5983177" next="i-1b935445cfa07fb5-238">
             <seg xml:id="i-HQnvULjcfnhNpDZdfsep8C">

corpus/protocols/199394/prot-199394--059.xml

Diff starting from line 3085

@@ -3115,10 +3085,7 @@
             </seg>
           </u>
           <note xml:id="i-4xaL9GVHNk1HgoMfqLHReC" type="speaker">
-            ANf. 35 MARGARETA WINBERG (s)
-          </note>
-          <note xml:id="i-LSm5J1wJqgNRf6453YzcWd">
-            replik:
+            ANf. 35 MARGARETA WINBERG (s) replik:
           </note>
           <u xml:id="i-1f62ca39fc15af58-399" who="Q3430022" next="i-1f62ca39fc15af58-400">
             <seg xml:id="i-PvSzYZF4cHAgaRC1dfw5RT">

corpus/protocols/199394/prot-199394--073.xml

Diff starting from line 9690

@@ -9717,10 +9690,7 @@
             </seg>
           </u>
           <note type="speaker" xml:id="i-P1rZMFrrcZ5UzZrBMyPnrV">
-            Anf. 100 IAN WACHTMEISTER (nyd)
-          </note>
-          <note xml:id="i-Dy4GjNMJnZq3zMAKLYoZT2">
-            replik:
+            Anf. 100 IAN WACHTMEISTER (nyd) replik:
           </note>
           <u xml:id="i-854eb82c676d5b32-1026" who="Q5983177" next="i-854eb82c676d5b32-1027">
             <seg xml:id="i-PpeL1oSzQBhWqWzFiLjnFX">

corpus/protocols/199394/prot-199394--086.xml

Diff starting from line 5355

@@ -5490,10 +5355,7 @@
             </seg>
           </u>
           <note xml:id="i-X9LrvQHh5cQP249aQ2sZDC" type="speaker">
-            Anf. 119 Utbildningsminister PER UNCKEL
-          </note>
-          <note xml:id="i-3erHixQe3SkyUSUj2g7bkJ" type="speaker">
-            (m):
+            Anf. 119 Utbildningsminister PER UNCKEL (m):
           </note>
           <u xml:id="i-1a1ccafdd155ad66-610" who="Q1830351" next="i-1a1ccafdd155ad66-611">
             <seg xml:id="i-VsiL1SspYjx4rKNAmaNHRW">

corpus/protocols/199394/prot-199394--114.xml

Diff starting from line 2832

@@ -2913,10 +2832,7 @@
             10 § Svar på frågorna 1993/94:530 och 562 om förvaring av kärnbränsleavfall
           </note>
           <note xml:id="i-W6q3NR9rF9vhAGMRRWgp7E" type="speaker">
-            Anf. 61 Miljöminister OLOF JOHANSSON
-          </note>
-          <note xml:id="i-FQvq62qJ8TsAEcU2NXauQh" type="speaker">
-            (c):
+            Anf. 61 Miljöminister OLOF JOHANSSON (c):
           </note>
           <u xml:id="i-a54247adccbbcccc-326" who="Q2021126" next="i-a54247adccbbcccc-327">
             <seg xml:id="i-5HGeHZ9PZ8sigyGJKQ9Rmg">

corpus/protocols/199394/prot-199394--120.xml

Diff starting from line 10849

@@ -10900,10 +10849,7 @@
             </seg>
           </u>
           <note type="speaker" xml:id="i-R8YM3t6AL7yNXv21qx2yQR">
-            Anf. 91 LENNART HEDQUIST (m)
-          </note>
-          <note xml:id="i-7PLhpjDbgxhgWeBqEsE36J">
-            replik:
+            Anf. 91 LENNART HEDQUIST (m) replik:
           </note>
           <u xml:id="i-436916f7ef21a445-1189" who="Q5796375" next="i-436916f7ef21a445-1190">
             <seg xml:id="i-8L1HRgZxVwL94RUUsFtfKU">

corpus/protocols/199495/prot-199495--024.xml

Diff starting from line 4514

@@ -4517,10 +4514,7 @@
             </seg>
           </u>
           <note xml:id="i-VJXKgiyeBKXVQZvQCYdijV" type="speaker">
-            Anf. 64 Näringsminister STEN HECKSCHER
-          </note>
-          <note xml:id="i-6ZDfREyPYFhsCM4CFTobef" type="speaker">
-            (s):
+            Anf. 64 Näringsminister STEN HECKSCHER (s):
           </note>
           <pb facs="http://data.riksdagen.se/fil/8782DFC3-2BA2-4F70-AB58-DDCBBE5C9398#page=44"/>
           <u xml:id="i-8fed722b0750662f-513" who="Q4126210" next="i-8fed722b0750662f-514">

corpus/protocols/199495/prot-199495--046.xml

Diff starting from line 18898

@@ -18952,10 +18898,7 @@
             </seg>
           </u>
           <note type="speaker" xml:id="i-3B6YiWcHpXksLtxunu5v96">
-            Anf. 166 MICHAEL STJERNSTRÖM (kds)
-          </note>
-          <note xml:id="i-P6Ys8gLfGY3rzy3PmBQ2m4">
-            replik:
+            Anf. 166 MICHAEL STJERNSTRÖM (kds) replik:
           </note>
           <u xml:id="i-c6ffa3b287edec46-1913" who="Q6190751" next="i-c6ffa3b287edec46-1914">
             <seg xml:id="i-V5vSUZoDPGuBHM2coAD174">

corpus/protocols/199495/prot-199495--070.xml

Diff starting from line 2732

@@ -2750,10 +2732,7 @@
             </seg>
           </u>
           <note xml:id="i-7YGzfLA8hPaFrWSZHksH43" type="speaker">
-            Anf. 23 Utrikesminister LENA HJELM-
-          </note>
-          <note xml:id="i-HisNUioMEBoTDNLgiVtj4E" type="speaker">
-            WALLÉN (s)
+            Anf. 23 Utrikesminister LENA HJELM-WALLÉN (s)
           </note>
           <u xml:id="i-22d503e0a0859781-296" who="unknown" next="i-22d503e0a0859781-297">
             <seg xml:id="i-vM9PiyVeS6AGwFcXJ7NzX">

corpus/protocols/199495/prot-199495--082.xml

Diff starting from line 4284

@@ -4287,10 +4284,7 @@
             </seg>
           </u>
           <note xml:id="i-QND7WBNVmczbAM4NdhfX1v" type="speaker">
-            Anf. 46 Statsminister INGVAR
-          </note>
-          <note xml:id="i-ubEEXfbbf1j8E9G6d8w92" type="speaker">
-            CARLSSON (s)
+            Anf. 46 Statsminister INGVAR CARLSSON (s)
           </note>
           <u xml:id="i-519b2b123837f0d8-393" who="Q53740" next="i-519b2b123837f0d8-394">
             <seg xml:id="i-Hk58ELB4GtwmvEkDt4NuBF">

corpus/protocols/199495/prot-199495--111.xml

Diff starting from line 718

@@ -748,10 +718,7 @@
             4 § Svar på fråga 1994/95:515 om farliga vägstolpar
           </note>
           <note xml:id="i-HhhGrGg2o3y3SVKm9A4sRq" type="speaker">
-            Anf. 19 Kommunikationsminister INES
-          </note>
-          <note xml:id="i-W8yCwgSKHB7NsooavsmufM" type="speaker">
-            UUSMANN (s)
+            Anf. 19 Kommunikationsminister INES UUSMANN (s)
           </note>
           <u xml:id="i-a93c4ea845db2ec3-62" who="Q4984124" next="i-a93c4ea845db2ec3-63">
             <seg xml:id="i-C7LeRxHiGQH16zn7AFJwsU">

corpus/protocols/199495/prot-199495--113.xml

Diff starting from line 7337

@@ -7442,10 +7337,7 @@
             </seg>
           </u>
           <note xml:id="i-7UZ95RTYPgYnuixCQjc8Eq" type="speaker">
-            Anf. 97 Utrikesminister LENA HJELM-
-          </note>
-          <note xml:id="i-DCW1KugutH2EZE1ULBWwH7" type="speaker">
-            WALLÉN (s)
+            Anf. 97 Utrikesminister LENA HJELM-WALLÉN (s)
           </note>
           <pb facs="http://data.riksdagen.se/fil/E69C8960-86E0-4028-A871-BFE394D6D225#page=71"/>
           <u xml:id="i-099555037568b315-740" who="unknown" next="i-099555037568b315-741">

corpus/protocols/199495/prot-199495--115.xml

Diff starting from line 17722

@@ -17833,10 +17722,7 @@
             </seg>
           </u>
           <note xml:id="i-KSonqF2xizGVkXcJfvLKEd" type="speaker">
-            Anf. 186 BRITT-MARIE DANESTIG-
-          </note>
-          <note xml:id="i-SiStejWTQZbb7u1sgVRgmc" type="speaker">
-            OLOFSSON (v) replik
+            Anf. 186 BRITT-MARIE DANESTIG-OLOFSSON (v) replik
           </note>
           <u xml:id="i-6ae134e993875446-1764" who="unknown" next="i-6ae134e993875446-1765">
             <seg xml:id="i-Y4GAUaGiiDeiQppyrKARmk">

corpus/protocols/199596/prot-199596--027.xml

Diff starting from line 12413

@@ -12428,10 +12413,7 @@
             </seg>
           </u>
           <note xml:id="i-FwZb43G696mjNAUHxh1C8d" type="speaker">
-            Anf. 169 BRITT-MARIE DANESTIG-
-          </note>
-          <note xml:id="i-2Ei1pAuFkL8VsA1mR7sxRB" type="speaker">
-            OLOFSSON (v) replik
+            Anf. 169 BRITT-MARIE DANESTIG-OLOFSSON (v) replik
           </note>
           <u xml:id="i-6ae12adf3bf71d7c-1245" who="unknown" next="i-6ae12adf3bf71d7c-1246">
             <seg xml:id="i-2NtNtscShx9hC9X5eFLEBJ">

corpus/protocols/199596/prot-199596--115.xml

Diff starting from line 3890

@@ -3896,10 +3890,7 @@
             </seg>
           </u>
           <note xml:id="i-LCDPJMQ5njuy8bNmzSEFDt" type="speaker">
-            Anf. 53 Finansminister ERIK ÅSBRINK (s)
-          </note>
-          <note xml:id="i-UZ4r6h4rJKv58VVo9sr4pS">
-            replik
+            Anf. 53 Finansminister ERIK ÅSBRINK (s) replik
           </note>
           <u xml:id="i-f9ccb2b908c984cb-445" who="Q5388933" next="i-f9ccb2b908c984cb-446">
             <seg xml:id="i-4VPid76rKTUEMSnJ1h7che">

corpus/protocols/199798/prot-199798--029.xml

Diff starting from line 3175

@@ -3187,10 +3175,7 @@
             31 om studiemedel för studier utomlands
           </note>
           <note xml:id="i-CNFSXK1bNWLhcbzkoyBP1v" type="speaker">
-            Anf. 39 Utbildningsminister CARL
-          </note>
-          <note xml:id="i-MYBFbYP7aZGUnnXBU7za8r" type="speaker">
-            THAM (s):
+            Anf. 39 Utbildningsminister CARL THAM (s):
           </note>
           <u xml:id="i-5ca5387a9130b39d-319" who="Q6206776" next="i-5ca5387a9130b39d-320">
             <seg xml:id="i-NdyxrGc1Pd8PGFVCi1MLMM">

corpus/protocols/199798/prot-199798--046.xml

Diff starting from line 12176

@@ -12233,10 +12176,7 @@
             </seg>
           </u>
           <note xml:id="i-E19ry8Cs2eH2999Z3vXv7v" type="speaker">
-            Anf. 135 BRITT-MARIE DANESTIG (v)
-          </note>
-          <note xml:id="i-43CYuZMNU5rmLfWg1Bzpi3">
-            replik:
+            Anf. 135 BRITT-MARIE DANESTIG (v) replik:
           </note>
           <u xml:id="i-855ce8045999c393-1308" who="Q4944157" next="i-855ce8045999c393-1309">
             <seg xml:id="i-AHCNkefF5GygqrkwvxSyMU">

corpus/protocols/199798/prot-199798--107.xml

Diff starting from line 15615

@@ -15642,10 +15615,7 @@
             </seg>
           </u>
           <note type="speaker" xml:id="i-CRYB9Rj16mU7utacC3Mcup">
-            Anf. 196 MARIANNE ANDERSSON (c)
-          </note>
-          <note xml:id="i-3Zus61w4svaVDNTW4ag1Wa">
-            replik:
+            Anf. 196 MARIANNE ANDERSSON (c) replik:
           </note>
           <pb facs="http://data.riksdagen.se/fil/5B39D5AF-5369-476E-82EC-90A94641BF34#page=147"/>
           <u xml:id="i-a314fe48c3b4efa7-1560" who="Q4935892">

corpus/protocols/199899/prot-199899--101.xml

Diff starting from line 2962

@@ -2965,10 +2962,7 @@
             </seg>
           </u>
           <note xml:id="i-69RghLHY26jCXwhVxegDm4" type="speaker">
-            Anf. 28 Finansminister BOSSE RING-
-          </note>
-          <note xml:id="i-PaferMJif9P9DpxvfNKefd" type="speaker">
-            HOLM (s):
+            Anf. 28 Finansminister BOSSE RINGHOLM (s):
           </note>
           <u xml:id="i-fd1c663eb7b07f04-293" who="Q321595" next="i-fd1c663eb7b07f04-294">
             <seg xml:id="i-THNsXzdQWGGTjbaJ74NgmJ">

corpus/protocols/19992000/prot-19992000--025.xml

Diff starting from line 4336

@@ -4366,10 +4336,7 @@
             </seg>
           </u>
           <note xml:id="i-Pe4hLBz76ifSKYTLKST5f1" type="speaker">
-            Anf. 69 Statsminister GÖRAN PERS-
-          </note>
-          <note xml:id="i-9MRoRtcp8vnJaWn7iuoqHH" type="speaker">
-            SON (s):
+            Anf. 69 Statsminister GÖRAN PERSSON (s):
           </note>
           <u xml:id="i-c0739e3cd802cabb-452" who="Q53747" next="i-c0739e3cd802cabb-453">
             <seg xml:id="i-TeQUaBF6zL8b7mrqMLkK9u">

corpus/protocols/19992000/prot-19992000--044.xml

Diff starting from line 9854

@@ -9854,10 +9854,7 @@
             </seg>
           </u>
           <note xml:id="i-DakAjDB5rRxXUqw1RUJyYc" type="speaker">
-            Anf. 123 ESTER LINDSTEDT-STAAF (kd)
-          </note>
-          <note xml:id="i-9CrWMMj98v6T8jaYoh5NXR">
-            replik:
+            Anf. 123 ESTER LINDSTEDT-STAAF (kd) replik:
           </note>
           <u xml:id="i-42e01a4d28b9d50d-1008" who="Q4962937" next="i-42e01a4d28b9d50d-1009">
             <seg xml:id="i-E1B2cdSPzoAUr4W27H1YkD">

corpus/protocols/200001/prot-200001--035.xml

Diff starting from line 6335

@@ -6350,10 +6335,7 @@
             och jämställdhet
           </note>
           <note xml:id="i-Hbbwsh2m281UDz9wWToysS" type="speaker">
-            Anf. 81 Näringsminister BJÖRN ROSEN-
-          </note>
-          <note xml:id="i-FjBoSb6qHAzbmr7ZTw28xG" type="speaker">
-            GREN (s):
+            Anf. 81 Näringsminister BJÖRN ROSENGREN (s):
           </note>
           <u xml:id="i-1fc242e94e785fc3-683" who="Q3374466" next="i-1fc242e94e785fc3-685">
             <seg xml:id="i-BcEDsC5K9ZGKPhmbuhnaXJ">

corpus/protocols/200001/prot-200001--042.xml

Diff starting from line 16149

@@ -16320,10 +16149,7 @@
             </seg>
           </u>
           <note xml:id="i-71eQh71kMRQcHwFKjkx6r5" type="speaker">
-            Anf. 237 ULLA-BRITT HAGSTRÖM (kd)
-          </note>
-          <note xml:id="i-7XikfvrUwddcETTcsydkre">
-            replik:
+            Anf. 237 ULLA-BRITT HAGSTRÖM (kd) replik:
           </note>
           <u xml:id="i-03786a9fea02e888-1751" who="Q4952016" next="i-03786a9fea02e888-1752">
             <seg xml:id="i-SmpPrZ27U1kXw5vBwNB3pD">

corpus/protocols/200001/prot-200001--074.xml

Diff starting from line 9980

@@ -10073,10 +9980,7 @@
             organisationer
           </note>
           <note xml:id="i-7oB9L7ea5H6YQqxdQuBZ9Z" type="speaker">
-            Anf. 125 Finansminister BOSSE RING-
-          </note>
-          <note xml:id="i-Mt4hZb1LpxyQCWpaeTQCBA" type="speaker">
-            HOLM (s):
+            Anf. 125 Finansminister BOSSE RINGHOLM (s):
           </note>
           <pb facs="http://data.riksdagen.se/fil/1E52728F-24A1-4D91-A3F0-826D56D73AB3#page=97"/>
           <u xml:id="i-3d21e6756d249933-1039" who="Q321595" next="i-3d21e6756d249933-1040">

corpus/protocols/200001/prot-200001--110.xml

Diff starting from line 3358

@@ -3406,10 +3358,7 @@
             </seg>
           </u>
           <note xml:id="i-oecEapAnwBEZxC2gAnxdx" type="speaker">
-            Anf. 39 Näringsminister BJÖRN ROSEN-
-          </note>
-          <note xml:id="i-4FZw1QgqZfTNjRqjimFzBr" type="speaker">
-            GREN (s):
+            Anf. 39 Näringsminister BJÖRN ROSENGREN (s):
           </note>
           <pb facs="http://data.riksdagen.se/fil/F258E0BF-B765-4461-ACD3-85F0AD65D04E#page=30"/>
           <u xml:id="i-10df0e1581037889-367" who="Q3374466" next="i-10df0e1581037889-368">

corpus/protocols/200001/prot-200001--116.xml

Diff starting from line 10288

@@ -10381,10 +10288,7 @@
             </seg>
           </u>
           <note xml:id="i-L6P1msgfhh1cRjhB9rGc5g" type="speaker">
-            Anf. 116 Finansminister BOSSE RING-
-          </note>
-          <note xml:id="i-9jMSbrEnkxErzKjxspu3XQ" type="speaker">
-            HOLM (s):
+            Anf. 116 Finansminister BOSSE RINGHOLM (s):
           </note>
           <u xml:id="i-3a62f0c3eb7f6e30-1115" who="Q321595" next="i-3a62f0c3eb7f6e30-1116">
             <seg xml:id="i-VGbhJjGNNsj5w4e6pVPK8r">

corpus/protocols/200102/prot-200102--081.xml

Diff starting from line 16755

@@ -16857,10 +16755,7 @@
             </seg>
           </u>
           <note xml:id="i-HcPE4u34TF8vW3UxSW7q8G" type="speaker">
-            Anf. 205 Jordbruksminister MARGARETA
-          </note>
-          <note type="speaker" xml:id="i-9AHjWkzVpPdxRCuaN4FT3m">
-            WINBERG (s) replik:
+            Anf. 205 Jordbruksminister MARGARETA WINBERG (s) replik:
           </note>
           <u xml:id="i-59bf090eedaa9225-1702" who="unknown" next="i-59bf090eedaa9225-1703">
             <seg xml:id="i-A5dsXcsbcG4DxiqwZMXVv5">

corpus/protocols/200102/prot-200102--103.xml

Diff starting from line 2220

@@ -2244,10 +2220,7 @@
             </seg>
           </u>
           <note xml:id="i-3XR7dtrt2pZjsKqjhP4Kkd" type="speaker">
-            Anf. 25 RUNAR PATRIKSSON (fp)
-          </note>
-          <note xml:id="i-WhM1tBkJKUqJakf45JgMDr">
-            replik:
+            Anf. 25 RUNAR PATRIKSSON (fp) replik:
           </note>
           <u xml:id="i-dfa73534e1f7095a-239" who="Q6037205" next="i-dfa73534e1f7095a-240">
             <seg xml:id="i-9ceTRwZ5ys3gGH38bejdec">

corpus/protocols/200102/prot-200102--120.xml

Diff starting from line 16363

@@ -16402,10 +16363,7 @@
             </seg>
           </u>
           <note xml:id="i-2xZfQfZqRM8EfjabwW1xdt" type="speaker">
-            Anf. 165 Statsrådet INGELA THALÉN (s)
-          </note>
-          <note xml:id="i-DDw3cBFn7zoj2ZXMauvJcn">
-            replik:
+            Anf. 165 Statsrådet INGELA THALÉN (s) replik:
           </note>
           <u xml:id="i-92fb7d85905d7a7a-1731" who="Q4982419" next="i-92fb7d85905d7a7a-1732">
             <seg xml:id="i-2brHxVCPB3rNHKqoBcqHv2">

corpus/protocols/200203/prot-200203--109.xml

Diff starting from line 4390

@@ -4411,10 +4390,7 @@
             </seg>
           </u>
           <note xml:id="i-McPu1hSjWJ4WFnQfzUoLVg" type="speaker">
-            Anf. 63 Jordbruksminister ANN-CHRISTIN
-          </note>
-          <note type="speaker" xml:id="i-GAX6bciNbbBnyux7yakhDc">
-            NYKVIST (s):
+            Anf. 63 Jordbruksminister ANN-CHRISTIN NYKVIST (s):
           </note>
           <u xml:id="i-cca020918ee0d32e-454" who="Q547580" next="i-cca020918ee0d32e-455">
             <seg xml:id="i-CEa8gHpKzNmj5jwrVVB2uM">

corpus/protocols/200203/prot-200203--111.xml

Diff starting from line 515

@@ -527,10 +515,7 @@
             5 § Svar på interpellation 2002/03:391 om hantverkare inom kulturmiljöområdet
           </note>
           <note xml:id="i-Fa8ADW5JS3WBim2vWkFCL8" type="speaker">
-            Anf. 8 Kulturminister MARITA ULVS-
-          </note>
-          <note xml:id="i-9SCgXCXPyVhe6FPiQfva49" type="speaker">
-            KOG (s):
+            Anf. 8 Kulturminister MARITA ULVSKOG (s):
           </note>
           <u xml:id="i-366f00f18dffb575-46" who="Q3115681" next="i-366f00f18dffb575-47">
             <seg xml:id="i-LZAkjkcxgJUomJQWACvhyN">

corpus/protocols/200203/prot-200203--116.xml

Diff starting from line 12728

@@ -12812,10 +12728,7 @@
             </seg>
           </u>
           <note xml:id="i-NRvqWxmpvfeKfaUY3nMqBk" type="speaker">
-            Anf. 176 CATHARINA BRÅKENHI-
-          </note>
-          <note xml:id="i-Hj98bTjQTUmk7VyDuYCCgq" type="speaker">
-            ELM (s):
+            Anf. 176 CATHARINA BRÅKENHIELM (s):
           </note>
           <u xml:id="i-1a9ea5d8a15f1190-1400" who="Q3363818" next="i-1a9ea5d8a15f1190-1401">
             <seg xml:id="i-LYqfCEYmX28bz81R63tCFa">

ninpnin commented 8 months ago

load_metadata should be refactored not to use input/segmentation/join_intros.csv though.

BobBorges commented 8 months ago

load_metadata should be refactored not to use input/segmentation/join_intros.csv though.

@ninpnin Then we also have to remove some things from detect_mps() in pyriksdagen.refine(). It seems redetect was computing this intro merging on the fly in that function. After we changed the file names, it could no longer handle it, and that is the reason for our drop in quality. @MansMeg
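
For illustration, here is a minimal Python sketch of the kind of intro merging shown in the sampled diffs: a speaker note that was split in two is joined back into a single note. This is not the detect_mps() implementation. The function names merge_split_intros and join_intro_text are hypothetical, the sketch ignores the XML namespace used in the corpus files, and the heuristic (a speaker note starting with "Anf." and not ending in ":" continues in the next note) is an assumption drawn only from the examples above.

```python
import xml.etree.ElementTree as ET


def join_intro_text(first: str, second: str) -> str:
    """Join two intro fragments into one line of text."""
    first = " ".join(first.split())
    second = " ".join(second.split())
    if first.endswith("-"):
        # Assumption: a trailing hyphen marks a word broken across lines
        # (e.g. "ROSEN-" + "GREN"), so drop it and join without a space.
        return first[:-1] + second
    return f"{first} {second}"


def merge_split_intros(parent: ET.Element) -> int:
    """Merge split speaker notes among the children of `parent`.

    Returns the number of merges performed. Hypothetical heuristic: a
    speaker note that starts with "Anf." and does not end with ":" is
    assumed to continue in the note element that directly follows it.
    """
    merged = 0
    children = list(parent)
    i = 0
    while i < len(children) - 1:
        first, second = children[i], children[i + 1]
        first_text = (first.text or "").strip()
        if (
            first.tag == "note"
            and second.tag == "note"
            and first.get("type") == "speaker"
            and first_text.startswith("Anf.")
            and not first_text.endswith(":")
        ):
            first.text = join_intro_text(first_text, second.text or "")
            first.tail = second.tail  # keep the whitespace after the pair
            parent.remove(second)
            children.pop(i + 1)
            merged += 1
            # Stay on the same note in case it was split in three.
        else:
            i += 1
    return merged
```

The merged note keeps the xml:id and attributes of the first fragment, which matches what the sampled diffs show. A word whose real hyphen falls at the line break would be joined incorrectly by this rule, so a lookup table of known cases may still be needed.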

BobBorges commented 8 months ago

The sample is OK, so let's merge this into the query-metadata branch. I will fix these issues with load_metadata and detect_mps() in that branch, since I'll actually be running things there. We know the unit test will fail here, and that is OK/expected.

MansMeg commented 8 months ago

All correct? I think we should merge this into dev right away instead. I think we should have that as the principle. Otherwise it will become a mess with a risk that we need to check these edits again when the query branch creates a PR to dev.

If you merge this with dev, then you can merge dev into the query branch. It gives the same results - but we keep the process simpler.

I.e., only do sample QC on merges to dev. What do you think?

BobBorges commented 8 months ago

Don't merge to dev; everything will fail.

BobBorges commented 8 months ago

This branch has new metadata, but the redetect hasn't been correctly applied.

BobBorges commented 8 months ago

Merge to the query-metadata branch, redetect until we're satisfied, then merge that branch to dev.

BobBorges commented 8 months ago

The query branch PR sample will only show edits to the who attribute.

MansMeg commented 8 months ago

OK! Then we need to handle the edits we are checking now when we merge to dev.

BobBorges commented 8 months ago

These edits are in a commit. When we merge to the query branch and then redetect, the sample for merging that branch into dev will only contain edits to the who attributes.
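
One way to sanity-check that claim before sampling is to compare a file before and after the redetect and confirm that nothing but who attributes differs. The following is a rough sketch under stated assumptions: only_who_changed is a hypothetical helper, not part of pyriksdagen, and it compares two XML strings element by element in document order.

```python
import xml.etree.ElementTree as ET


def only_who_changed(before: str, after: str) -> bool:
    """Return True if two XML strings differ at most in `who` attributes."""
    old_elems = list(ET.fromstring(before).iter())
    new_elems = list(ET.fromstring(after).iter())
    if len(old_elems) != len(new_elems):
        # An element was added or removed, so more than `who` changed.
        return False
    for old, new in zip(old_elems, new_elems):
        old_attrs = {k: v for k, v in old.attrib.items() if k != "who"}
        new_attrs = {k: v for k, v in new.attrib.items() if k != "who"}
        if (old.tag, old.text, old.tail, old_attrs) != (
            new.tag,
            new.text,
            new.tail,
            new_attrs,
        ):
            return False
    return True
```

The comparison is strict about whitespace, so a file that was also reindented would be reported as changed.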