spraakbanken / korp-frontend

Frontend for Korp, a tool using the IMS Open Corpus Workbench (CWB).
https://spraakbanken.gu.se/en/tools/korp
MIT License

Nicer display of "transformer neighbor", a ranked set attribute from KB-BERT #340

Closed arildm closed 6 months ago

arildm commented 7 months ago

Markus wrote:

One would like to handle this analysis in the same way as all the other ranked analyses with associated values, for example the WSD (sense).

To give a concrete example, I tried to pull out the prediction for the first word in the sentence: https://spraakbanken.gu.se/korp/#?cqp= []&corpus=suc3&stats_reduce=transformer-neighbour&show_stats&search_tab=1&search=cqp&result_tab=2

Which I now realize sort of works, since it always seems to be the same value for a predicted token, but it gives a messy picture. Compare with senses: https://spraakbanken.gu.se/korp/#?cqp=%3Csentence%3E%20%5B%5D&corpus=suc3&stats_reduce=sense&show_stats&search_tab=1&result_tab=2&search=cqp

This goes for the statistics table as well as the KWIC sidebar.

Perhaps related to:

arildm commented 6 months ago

For the sidebar, this addition to the attribute config should be enough @MartinHammarstedt:

"sidebar_component": "expandList",

The display and row-merging in the statistics table look like they are hard-coded in statistics_config.js. I'll look into that.
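
To illustrate what row-merging for a ranked set attribute could look like, here is a minimal sketch. It is not the code in statistics_config.js. It assumes that a ranked set value is serialized as "|value:score|value:score|", and the names StatsRow, stripScores and mergeRows are hypothetical. The idea is that rows differing only in their scores get the same key, so their hit counts can be summed.

```typescript
// Hypothetical row shape, not Korp's actual statistics data structure.
interface StatsRow {
  value: string; // raw ranked set, assumed format "|value:score|value:score|"
  hits: number;
}

// Drop the ":score" suffix from each item, so that rows differing
// only in score end up with the same key.
function stripScores(raw: string): string {
  return raw
    .split("|")
    .filter((item) => item !== "")
    .map((item) => {
      const i = item.lastIndexOf(":");
      return i === -1 ? item : item.slice(0, i);
    })
    .join("|");
}

// Sum the hit counts of all rows that share a score-less key.
function mergeRows(rows: StatsRow[]): Map<string, number> {
  const merged = new Map<string, number>();
  for (const row of rows) {
    const key = stripScores(row.value);
    merged.set(key, (merged.get(key) ?? 0) + row.hits);
  }
  return merged;
}
```

Whether the merge key should be the whole score-less set, as here, or only the top-ranked value is a design choice that this sketch does not settle.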

arildm commented 6 months ago

I fixed some stuff. Now we can use this instead of the config in the comment above @MartinHammarstedt:

"sidebar_component": {
  "name": "expandList",
  "options": {
    "internal_search": true,
    "op": "highest_rank"
  }
},
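
As a rough illustration of what a list display driven by rank might involve, the sketch below parses a ranked set and separates the highest-ranked item from the rest. This is not the expandList implementation. The serialization "|value:score|value:score|" is an assumption, and RankedItem, parseRankedSet and splitForDisplay are hypothetical names.

```typescript
// Hypothetical item shape for one entry of a ranked set.
interface RankedItem {
  value: string;
  score: number;
}

// Parse an assumed "|value:score|value:score|" string into items,
// sorted by descending score. Items without a score get score 0.
function parseRankedSet(raw: string): RankedItem[] {
  return raw
    .split("|")
    .filter((item) => item !== "")
    .map((item) => {
      const i = item.lastIndexOf(":");
      if (i === -1) return { value: item, score: 0 };
      return { value: item.slice(0, i), score: Number(item.slice(i + 1)) };
    })
    .sort((a, b) => b.score - a.score);
}

// Split into the highest-ranked item (shown first) and the remaining
// items (which a sidebar could hide behind an "expand" control).
function splitForDisplay(raw: string): { top: RankedItem | undefined; rest: RankedItem[] } {
  const [top, ...rest] = parseRankedSet(raw);
  return { top, rest };
}
```

For an empty set, top is undefined and rest is an empty array, so a caller can skip rendering the attribute.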

arildm commented 6 months ago

Looks like it works as expected now. It is currently deployed to korplabb.

Aside from the corpus-config edit mentioned above, I did these:

And in the korp-frontend-sb repo:

In Markus' example link, the sub-queries generated when clicking a word in the statistics table are too long for the top words, so an error occurs. But scroll down to "Dessutom" with 264 hits and it works fine. Created a follow-up issue for that here: