UniversalDependencies / tools

Various utilities for processing the data.
GNU General Public License v2.0

Validation for the "Copula is not AUX" constraint #16

Closed: prokopidis closed this issue 7 years ago

prokopidis commented 7 years ago

The "Copula is not AUX" section of the syntax validation page (http://universaldependencies.org/svalidation.html#copula-is-not-aux) reports that UD_Greek has 437 hits.

Yet when one follows the "Go to search" link http://bionlp-www.utu.fi/dep_search/query?search=%28%21%28AUX%7CPRON%29%29+%3Ccop+_&db=UD_Greek-dev, every sentence in the results contains an AUX-tagged copula (with typical verb features such as VerbForm). Is this a bug?

Thanks.
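
For readers following along, the search link above is percent-encoded, which hides the actual query. The sketch below only decodes the URL quoted in this comment using the Python standard library. The helper name `decode_search_link` is made up for illustration and is not part of the dep_search tooling.

```python
from urllib.parse import urlsplit, parse_qs

# The "Go to search" link exactly as quoted in the comment above.
url = ("http://bionlp-www.utu.fi/dep_search/query"
       "?search=%28%21%28AUX%7CPRON%29%29+%3Ccop+_&db=UD_Greek-dev")


def decode_search_link(link):
    """Return the decoded query-string parameters of a search link as a dict.

    Hypothetical helper for illustration only.
    """
    params = parse_qs(urlsplit(link).query)
    # parse_qs yields a list of values per key; each key occurs once here.
    return {key: values[0] for key, values in params.items()}


decoded = decode_search_link(url)
print(decoded["search"])  # (!(AUX|PRON)) <cop _
print(decoded["db"])      # UD_Greek-dev
```

Decoded, the query text is `(!(AUX|PRON)) <cop _` and it runs against the `UD_Greek-dev` database, so hits whose copula is tagged AUX are exactly what the query is supposed to exclude.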

fginter commented 7 years ago

Thanks for the report. I'm looking into this.

jnivre commented 7 years ago

Looks like what happened temporarily for Swedish, when the search looked at XPOSTAG instead of UPOSTAG.
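
A minimal sketch of the kind of mix-up described above, assuming the standard 10-column CoNLL-U layout where UPOSTAG is the fourth column and XPOSTAG the fifth. This is not the dep_search implementation. The function `flagged_copulas` and the sample tokens are invented for illustration, and only the bare `cop` relation is matched (no subtypes).

```python
# Zero-based CoNLL-U column indices: FORM, UPOSTAG, XPOSTAG, DEPREL.
FORM, UPOS, XPOS, DEPREL = 1, 3, 4, 7

# Invented two-token fragment, not taken from any treebank.
# The copula is AUX in the universal column and VBZ in the
# language-specific column.
SAMPLE = (
    "1\tis\tbe\tAUX\tVBZ\tVerbForm=Fin\t2\tcop\t_\t_\n"
    "2\tgood\tgood\tADJ\tJJ\t_\t0\troot\t_\t_\n"
)


def flagged_copulas(conllu, tag_column):
    """Return forms attached via `cop` whose tag in `tag_column`
    is neither AUX nor PRON (hypothetical check, for illustration)."""
    hits = []
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        if cols[DEPREL] == "cop" and cols[tag_column] not in {"AUX", "PRON"}:
            hits.append(cols[FORM])
    return hits


print(flagged_copulas(SAMPLE, UPOS))  # [] : the copula is AUX, no hit
print(flagged_copulas(SAMPLE, XPOS))  # ['is'] : wrong column, false hit
```

Reading the language-specific column makes every copula look like a violation, because treebank-specific tags are rarely spelled `AUX` or `PRON`.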

daghaug commented 7 years ago

There is something wrong with this query in UD_Latin-PROIEL too. It reports 3468 hits, but when I click "Go to search", it says "No matches found".

fginter commented 7 years ago

I know or think I know what's wrong. Will be ready soon.

fginter commented 7 years ago

Hi. Cached compiled queries are the culprit. I wiped them. I think this works now.

daghaug commented 7 years ago

I get the same behaviour on UD_Latin-PROIEL and UD_Ancient_Greek-PROIEL. Thousands of hits on the overview page, but when I open the query, no matches are found.

fginter commented 7 years ago

Immediate cause:

Root cause:

Fixed now and I will rerun svalidation.html next. These kinds of weird errors should now (finally) disappear. This has been haunting me for weeks.

fginter commented 7 years ago

Just to be clear: regenerating the whole of svalidation.html takes an hour or so. It is running now.