Open arildm opened 2 months ago
The backend /loglike response doesn't distinguish a multi-word value from multiple tokens. Compare these calls:
"han"+verb vs. "hon"+verb by sense: Space in string separates tokens
{ "loglike": {
"hon..1:-1.000 vara..1:-1.000": 2375.04,
"han..1:-1.000 vara..1:-1.000": -1774.16,
"hon..1:-1.000 skola..4:-1.000": 1062.87,
// ...
"frihet" vs. "jämlikhet" by party: Space in string does not separate tokens
{ "loglike": {
"Feministiskt initiativ": 78.12,
"V\u00e4nsterpartiet": 74.7,
"Moderaterna": -73.75,
// ...
Perhaps we can interpret the string value as one or more tokens depending on the input queries (set1_cqp and set2_cqp)? But changing the response format would probably be a more robust approach.
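To make the "change the response format" idea concrete, here is a rough sketch of what an unambiguous shape might look like. The `LoglikeEntry` and `LoglikeResponse` types and the `tokens` field are hypothetical, not something the backend provides today; the values are taken from the two examples above.

```typescript
// Hypothetical response shape, NOT the current /loglike format:
// each entry carries its tokens as an array instead of one space-joined string.
interface LoglikeEntry {
  tokens: string[]; // one element per token; spaces inside a value are preserved
  loglike: number;
}

interface LoglikeResponse {
  loglike: LoglikeEntry[];
}

// "han"+verb vs. "hon"+verb by sense: two tokens per entry.
const bySense: LoglikeResponse = {
  loglike: [
    { tokens: ["hon..1:-1.000", "vara..1:-1.000"], loglike: 2375.04 },
    { tokens: ["han..1:-1.000", "vara..1:-1.000"], loglike: -1774.16 },
  ],
};

// "frihet" vs. "jämlikhet" by party: one token whose value contains a space.
const byParty: LoglikeResponse = {
  loglike: [
    { tokens: ["Feministiskt initiativ"], loglike: 78.12 },
    { tokens: ["Moderaterna"], loglike: -73.75 },
  ],
};

// The client no longer has to guess: the token count is explicit.
function tokenCount(entry: LoglikeEntry): number {
  return entry.tokens.length;
}

console.log(tokenCount(bySense.loglike[0]), tokenCount(byParty.loglike[0]));
```

With a shape like this the frontend would not need to inspect set1_cqp and set2_cqp at all, at the cost of a breaking change for existing API clients.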
This is where the string in the response is split on whitespace: https://github.com/spraakbanken/korp-frontend/blob/38534b82a7902cc5e56a67844485505ffe0f767e/app/scripts/services/backend.ts#L125
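A minimal illustration of the ambiguity, assuming the frontend does something equivalent to a plain split on spaces. `splitLoglikeKey` is a hypothetical stand-in, not the actual code at the linked line.

```typescript
// Hypothetical stand-in for the whitespace split; not copied from backend.ts.
function splitLoglikeKey(key: string): string[] {
  return key.split(" ");
}

// Two tokens, each a sense value: splitting gives the intended result.
const senseParts = splitLoglikeKey("hon..1:-1.000 vara..1:-1.000");

// One party name that happens to contain a space: splitting breaks it in two.
const partyParts = splitLoglikeKey("Feministiskt initiativ");

// Both come back with two parts, so the two cases are indistinguishable.
console.log(senseParts.length, partyParts.length);
```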
With the vivill corpus selected, save two searches for comparison.
Apparently, the API request has
cqp2=[_.text_party_name = "Folkpartiet"] [_.text_party_name = "liberalerna"]