xbmc / xbmc

Kodi is an award-winning free and open source home theater/media center software and entertainment hub for digital media. With its beautiful interface and powerful skinning engine, it's available for Android, BSD, Linux, macOS, iOS, tvOS and Windows.
https://kodi.tv/

Kodi API VideoLibrary.GetMovies: problem with case-insensitive search for umlaut characters #18039

Open GregorHerten opened 4 years ago

GregorHerten commented 4 years ago

Bug report

Describe the bug

Normally, the Kodi API method VideoLibrary.GetMovies allows filtering movies, for example by title. This filter is case-insensitive, but not when the filter string contains an umlaut, e.g. Äquator: the filter string äquator does not find the movie titled Äquator.

Expected Behavior

One would expect the filter to use a function like string.lower(), which converts every occurrence of Ä to ä. This behaviour is indeed observed for other, non-special characters.
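To illustrate the expectation, here is a minimal Python sketch of a Unicode-aware, case-insensitive substring test. The helper name title_contains is hypothetical and is not part of Kodi; it only shows the semantics one would expect from the filter.

```python
def title_contains(title: str, needle: str) -> bool:
    """Unicode-aware, case-insensitive substring test (hypothetical helper)."""
    # casefold() is the Unicode-aware way to normalise case for comparisons.
    return needle.casefold() in title.casefold()


print("Ä".lower())                            # ä
print(title_contains("Äquator", "äquator"))   # True
```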


DaVukovic commented 4 years ago

I can confirm this issue. Tried the following:

curl -s -u username:password -X POST http://127.0.0.1:8080/jsonrpc -H 'Content-Type: application/json' --data '{"jsonrpc": "2.0", "method": "VideoLibrary.GetMovies", "params": {"filter": {"operator": "contains", "field": "title", "value": "Taschengeld"}}, "id":1}'

and

curl -s -u username:password -X POST http://127.0.0.1:8080/jsonrpc -H 'Content-Type: application/json' --data '{"jsonrpc": "2.0", "method": "VideoLibrary.GetMovies", "params": {"filter": {"operator": "contains", "field": "title", "value": "taschengeld"}}, "id":1}'

which both get me:

{
  "id": 1,
  "jsonrpc": "2.0",
  "result": {
    "limits": {
      "end": 1,
      "start": 0,
      "total": 1
    },
    "movies": [
      {
        "label": "The Babysitters – Für Taschengeld mache ich alles",
        "movieid": 794
      }
    ]
  }
}

It doesn't matter whether the filter value starts with a lowercase "t" or an uppercase "T".

While trying:

curl -s -u username:password -X POST http://127.0.0.1:8080/jsonrpc -H 'Content-Type: application/json' --data '{"jsonrpc": "2.0", "method": "VideoLibrary.GetMovies", "params": {"filter": {"operator": "contains", "field": "title", "value": "Ü"}}, "id":1}'

gets me:

{
  "id": 1,
  "jsonrpc": "2.0",
  "result": {
    "limits": {
      "end": 1,
      "start": 0,
      "total": 1
    },
    "movies": [
      {
        "label": "Riddick - Überleben ist seine Rache",
        "movieid": 900
      }
    ]
  }
}

whereas:

curl -s -u username:password -X POST http://127.0.0.1:8080/jsonrpc -H 'Content-Type: application/json' --data '{"jsonrpc": "2.0", "method": "VideoLibrary.GetMovies", "params": {"filter": {"operator": "contains", "field": "title", "value": "ü"}}, "id":1}'

gets me a bunch of movies that contain a lowercase "ü", but not the "Riddick" one, which contains an uppercase "Ü".
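The comparison above can be scripted. The following is a rough Python sketch that builds the same JSON-RPC payload as the curl calls and reports which movie IDs the uppercase query finds but the lowercase one misses. The function names (build_request, movie_ids, missed_by_lowercase, make_http_sender) are made up for this example; the URL and credentials are the placeholders from the curl commands.

```python
import base64
import json
import urllib.request
from typing import Callable


def build_request(value: str, request_id: int = 1) -> dict:
    """Same payload as the curl commands, with the filter value swapped in."""
    return {
        "jsonrpc": "2.0",
        "method": "VideoLibrary.GetMovies",
        "params": {"filter": {"operator": "contains", "field": "title", "value": value}},
        "id": request_id,
    }


def movie_ids(response: dict) -> set:
    """Collect movie IDs; tolerate a response without a "movies" list."""
    return {m["movieid"] for m in response.get("result", {}).get("movies", [])}


def missed_by_lowercase(send: Callable[[dict], dict], value: str) -> set:
    """IDs returned for value.upper() but not for value.lower()."""
    upper = movie_ids(send(build_request(value.upper())))
    lower = movie_ids(send(build_request(value.lower())))
    return upper - lower


def make_http_sender(url: str, username: str, password: str) -> Callable[[dict], dict]:
    """Return a send() function that POSTs a payload using HTTP basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")

    def send(payload: dict) -> dict:
        request = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return send
```

Against a running instance, something like missed_by_lowercase(make_http_sender("http://127.0.0.1:8080/jsonrpc", "username", "password"), "ü") would be expected to return a non-empty set while the bug is present, and an empty set for an ASCII value such as "t".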

KirstenZa commented 3 years ago

Of course, this problem is not limited to VideoLibrary.GetMovies. The problem also exists for AudioLibrary.GetSongs, AudioLibrary.GetArtists etc.

Seems to be a general problem with JSON-RPC and UTF-8?

Because of that, searching in Kodi (Global Search) is currently not usable for me if special characters are used. And German titles contain special characters pretty often.

DaveTBlake commented 3 years ago

> Seems to be a general problem with JSON-RPC and UTF-8?

No, it is a limitation in the underlying SQL and collation. SQLite only understands upper/lower case for ASCII characters by default; we would need to override the default implementation of the LIKE function to handle accented characters.
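For illustration only, here is a small Python sketch of what overriding LIKE can look like, using the standard sqlite3 module. This is not Kodi's code: the table movie, the column title and the function unicode_like are invented for the example. SQLite evaluates X LIKE Y by calling like(Y, X), so registering a two-argument like function replaces the built-in matching.

```python
import re
import sqlite3


def unicode_like(pattern, text):
    """Replacement for SQLite's like(pattern, text) using Unicode case folding."""
    if pattern is None or text is None:
        return None  # SQL NULL propagates
    parts = []
    for ch in pattern.casefold():
        if ch == "%":
            parts.append(".*")           # % matches any run of characters
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    regex = "".join(parts)
    return 1 if re.fullmatch(regex, str(text).casefold(), re.DOTALL) else 0


def count_matches(conn, needle):
    """Count rows whose title contains needle, via the LIKE operator."""
    row = conn.execute(
        "SELECT COUNT(*) FROM movie WHERE title LIKE ?", (f"%{needle}%",)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (title TEXT)")  # hypothetical schema
conn.executemany(
    "INSERT INTO movie VALUES (?)",
    [("Äquator",), ("Riddick - Überleben ist seine Rache",)],
)

# With the built-in LIKE, this is typically 0 unless SQLite was built with ICU.
default_count = count_matches(conn, "äquator")

# Register the override; LIKE now goes through unicode_like.
conn.create_function("like", 2, unicode_like)
overridden_count = count_matches(conn, "äquator")

print(default_count, overridden_count)
```

This sketch ignores the ESCAPE clause, which would need a three-argument variant, and a Python callback is slower than the built-in function. Any real fix in Kodi would be written in C++ against the SQLite C API.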

github-actions[bot] commented 2 months ago

This issue is now marked stale because it has been open over a year without activity. Remove the stale label or add a comment to reset the stale state.