openeventdata / UniversalPetrarch

Language-agnostic political event coding using universal dependencies
MIT License
18 stars 9 forks

Weird hardcoded 010 event #22

Closed ahalterman closed 6 years ago

ahalterman commented 6 years ago

@khaledJabr found a weird hardcoded 010 event coding in UniversalPetrarch that is also in Petrarch2. If we're reading it right, it assigns events a 010 code if some event info is missing.

Here's the code in UniversalPetrarch and here it is in Petrarch2. Could this explain the explosion in 010 events once we switched to Petrarch2? Anyone know what's going on here? cc @philip-schrodt

philip-schrodt commented 6 years ago

FWIW, it's my bad: this got added to PETR-2 when I was frantically trying to do a comparison of PETR-1 and PETR-2 using the Gigaword corpus for a paper I presented at PRIO three years ago. Note that it is tactfully commented as "a very crude hack". In the original context, it applied only in some rare situations involving multi-word verbs (in English), but it seems to be getting called more frequently now, probably through some sort of more general failure in the pattern matching.

That said, I should have wrapped the problem code in a try/except and done nothing when the exception was hit, rather than assigning a hardcoded 010.
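For anyone following along, here is a minimal sketch of the "do nothing on failure" approach, as opposed to falling back to a hardcoded 010. This is not the actual PETRARCH2 or UniversalPetrarch code: `IncompleteEventError`, `code_event`, `code_all`, and the dictionary layout are all hypothetical names made up for illustration.

```python
# Hypothetical sketch: none of these names come from PETRARCH2 or UniversalPetrarch.
class IncompleteEventError(Exception):
    """Raised when a candidate event lacks information needed to code it."""


def code_event(candidate):
    """Return a (source, target, code) triple, or raise if a field is missing."""
    try:
        return (candidate["source"], candidate["target"], candidate["code"])
    except KeyError as err:
        raise IncompleteEventError(f"missing field: {err}") from err


def code_all(candidates):
    """Code every candidate, skipping incomplete ones instead of emitting a default 010."""
    events, skipped = [], 0
    for candidate in candidates:
        try:
            events.append(code_event(candidate))
        except IncompleteEventError:
            skipped += 1  # count the failure, but assign no fallback code
    return events, skipped


candidates = [
    {"source": "USA", "target": "RUS", "code": "042"},
    {"source": "USA", "target": "RUS"},  # no event code could be matched
]
events, skipped = code_all(candidates)
print(events, skipped)
```

Counting the skipped candidates, rather than silently dropping them, would also make a jump in pattern-matching failures visible instead of hiding it inside the 010 totals.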

Good to know the code controlling your nearby nuclear reactor probably contains similar little tweaks. In MOSTech 6502 assembly language. As does Facebook's code for protecting your private information, except it's written in Java by the shop in Russia that submitted the lowest bid.

philip-schrodt commented 6 years ago

Resolved in commit db1baef