teamdigitale / dati-semantic-backend

Backend for the NDC semantic repository
GNU Affero General Public License v3.0

Multilingual search indexing limitation: inability to retrieve Assets in languages other than Italian #91

Open FrankMaverick opened 6 months ago

FrankMaverick commented 6 months ago

Issue

Currently, indexing is limited to the Italian language, so keyword searches in other languages such as English fail. For instance, searching for "learning" or "role" yields no results, even though assets such as "Ontologia della Formazione" and "Ontologia dei Ruoli" contain these words in both Italian and English.

Expected Behavior

The search functionality should be able to index and retrieve assets regardless of the language they are in. Users should be able to search for keywords in any supported language and get relevant results.

Current Behavior

Only Italian-language content is indexed, so assets cannot be retrieved through keywords in other languages.

Steps to Reproduce

  1. Search for a keyword in a language other than Italian, e.g. "learning" or "role".
  2. Observe that results are missing or incomplete.

Possible Solutions

Context

This limitation reduces the usability of the platform for users who work with assets in languages other than Italian, because search cannot retrieve assets by keywords in those languages.

ndc-dxc commented 6 months ago

@Clou-dia the software currently behaves as follows: it indexes label@it if present; otherwise it indexes label@en. If both are missing, it indexes the default label (i.e. the one with no @countryCode).
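
The fallback described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names, the dictionary representation of labels (language tag mapped to text, with `None` standing for the untagged default), and the English label text are all assumptions made for the example.

```python
from typing import Dict, Optional

# Hypothetical representation: language tag -> label text.
# The key None stands for the default label (the one with no language tag).
Labels = Dict[Optional[str], str]


def select_indexed_label(labels: Labels) -> Optional[str]:
    """Pick the single label to index: Italian, else English, else untagged."""
    for lang in ("it", "en", None):
        if lang in labels:
            return labels[lang]
    return None  # the asset has no label at all


def matches(query: str, labels: Labels) -> bool:
    """Naive substring search over the one label that was indexed."""
    indexed = select_indexed_label(labels)
    return indexed is not None and query.lower() in indexed.lower()


# Illustrative asset; the English text is made up for this example.
roles = {"it": "Ontologia dei Ruoli", "en": "Role Ontology"}

print(matches("ruoli", roles))  # True: the Italian label is the one indexed
print(matches("role", roles))   # False: the English label is never indexed
```

Under these assumptions, an asset that has both an Italian and an English label is only ever findable through its Italian text, which would be consistent with "role" and "learning" returning no results.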