Closed by arildm 5 months ago
Discussed with Martin. It looks like the `[]{0,}` should only be added if none of the reduced attributes is positional. If any of them is positional, the subquery will already contain the right number of tokens, and adding `<match> ... </match>` is sufficient.
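A minimal sketch of that rule, for illustration only. The `ReducedAttr` type and the `wrapSubquery` function are hypothetical names, not taken from the Korp code base, and placing the `[]{0,}` padding inside the `<match>` wrapper is an assumption about how the two pieces combine:

```typescript
// Hypothetical shape for an attribute the statistics are reduced (grouped) by.
// Not Korp's actual type.
interface ReducedAttr {
  name: string;
  // true for token-level attributes such as word or pos,
  // false for structural attributes such as text_title
  positional: boolean;
}

// Sketch of the rule: always wrap the subquery in <match> ... </match>,
// and pad with []{0,} only when no reduced attribute is positional.
// Assumption: the padding goes inside the <match> wrapper.
function wrapSubquery(subquery: string, attrs: ReducedAttr[]): string {
  const anyPositional = attrs.some((attr) => attr.positional);
  const body = anyPositional ? subquery : `${subquery} []{0,}`;
  return `<match> ${body} </match>`;
}
```

With a positional attribute such as `word`, the subquery is only wrapped. With purely structural attributes, the padding lets the subquery span however many tokens the original match had.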
I previously "fixed" #289 by adding a `[]*` to the subquery, but that seems to have introduced new problems.

This query for the NPEGL mode https://spraakbanken.gu.se/korplabb/?mode=npegl#?cqp=%3Cnp%3E%20%5B%5D%7B0,10%7D%20%5Bword%20%3D%20%22b%C3%A6%C3%B0i%22%5D%20%5B%5D%7B0,10%7D%20%3C%2Fnp%3E&corpus=npegl-ice&search_tab=1&within=text&show_stats&result_tab=2&search=cqp shows a statistics row with 6 hits for "bæði", but clicking it yields 14 hits, namely all hits that begin with "bæði".
~An example from an "ordinary" corpus is here https://spraakbanken.gu.se/korplabb/#?cqp=%5Bpos%20%3D%20%22MID%7CMAD%7CPAD%22%5D%20%5B%5D%7B0,1%7D%20%3C%2Fsentence%3E&corpus=attasidor&search_tab=1&show_stats&result_tab=2&search=cqp where the value ")" is reported 8 times, but the link shows 17 hits, including some ") ."~ (This example is not needed now that NPEGL is public.)