OpenRefine / OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it
https://openrefine.org/
BSD 3-Clause "New" or "Revised" License

Make the Wikidata reconciliation language changeable from the UI #2333

Open VladimirAlexiev opened 4 years ago

VladimirAlexiev commented 4 years ago

OR has a "language" option when creating a project. It has maybe 20 languages. Bulgarian is missing.

When I work on Bulgarian data, it's essential that WD recon brings back Bulgarian labels.

So please add another field "Reconciliation language" and allow ANY lang tag to be entered there, not just the few languages in which the OR UI is available.

wetneb commented 4 years ago

At the moment this can already be done by adding a reconciliation service manually, with the following URL: https://tools.wmflabs.org/openrefine-wikidata/bg/api (where bg is the Wikimedia language code for Bulgarian).
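For illustration only, here is a minimal Python sketch of how that URL varies with the language code. The base URL pattern comes from the comment above; the helper name `reconciliation_endpoint` is hypothetical, and the sketch assumes the service accepts any Wikimedia language code in that path position.

```python
from urllib.parse import quote

# URL pattern quoted in the comment above; only the language segment changes.
BASE = "https://tools.wmflabs.org/openrefine-wikidata/{lang}/api"


def reconciliation_endpoint(lang: str) -> str:
    """Build the service URL for a Wikimedia language code (hypothetical helper)."""
    code = lang.strip()
    if not code:
        raise ValueError("language code must not be empty")
    # Percent-encode anything unexpected, but keep hyphens used in some codes.
    return BASE.format(lang=quote(code, safe="-"))


if __name__ == "__main__":
    for code in ("bg", "fr", "de"):
        print(code, reconciliation_endpoint(code))
```

The resulting URL is what you would paste into OpenRefine when adding a reconciliation service manually.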

However this is not easy to discover from the UI so there could indeed be a dedicated UI for this. We already have a select widget for all Wikimedia languages in the Wikidata extension, so that could potentially be reused.

Lydiaofficial commented 1 year ago

However this is not easy to discover from the UI so there could indeed be a dedicated UI for this. We already have a select widget for all Wikimedia languages in the Wikidata extension, so that could potentially be reused.

I'm curious about this: I'm trying to find the select widget mentioned, but I have no idea how or where to find it.

wetneb commented 1 year ago

It is in the Wikibase schema tab: for instance, when adding a label, description or alias to an item in the Wikibase schema, you will be prompted for a language there. This is an auto-complete enabled field for a language code.

wetneb commented 1 year ago

Over at the W3C community group that is working on a new version of the protocol, we added support for language selection in multiple ways:

This means that as a user, you could potentially configure the reconciliation language in multiple places. For instance, say you are reconciling a dataset of Swiss towns. You speak French, but your dataset has two columns: one for the name of the city in German and another for the name of the city in Italian. You would be able to tell the reconciliation service the language of each of the columns you are passing and ask to have the results returned in French, so that you can review them more easily.

That's a summary of what the next version of the protocol will support: it does not mean that OpenRefine has to be as flexible as the protocol.

Lydiaofficial commented 12 months ago

@wetneb What does this mean for my task of proposing a design for this issue? Should I go ahead or pause?

wetneb commented 12 months ago

My message above was meant to provide you with some information about what sort of language selection will be available in the upcoming version of the protocol, so that this can inform your design.

Lydiaofficial commented 12 months ago

Thanks for the information. Could you please provide a link to the W3C discussion page?

wetneb commented 12 months ago

The discussion has been happening in those places:

Lydiaofficial commented 11 months ago

[Three mockup images]

In the mockups above, I tried to stay close to what's currently available in OpenRefine's language settings, while letting users enter their preferred language tags for languages not available in OR, by adding a "set language" option to the reconciliation dialog.

This could also go in the language settings, but I personally think it makes more sense for it to live in the reconciliation dialog.

wetneb commented 11 months ago

That makes sense to me! For this to be implemented and functional, it will require adaptations in the services too, so we have a bit of a chicken and egg problem (services won't implement it before we offer it in the UI, and our UI won't be functional until services implement the feature).

tfmorris commented 8 months ago

so we have a bit of a chicken and egg problem (services won't implement it before we offer it in the UI, and our UI won't be functional until services implement the feature)

Since this issue was originally about Wikidata, I think it makes sense to partner with the developers of the reconciliation service to get both the front end and back end implemented, tested, and released in parallel.