Tatoeba / tatoeba2

Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations.
https://tatoeba.org
GNU Affero General Public License v3.0

Follow up on jiru’s offer in order to prevent hijacking the other thread... #2684

Open mramosch opened 3 years ago

mramosch commented 3 years ago

@jiru: While I do think we need to solve this issue, I’d rather start implementing a solution that answers an actual real-world use case instead of just thinking about "some third-party application" that "might need" this or that. Otherwise I am afraid we are going to implement something that works but will be little use in practice. In other words, start simple and don’t think too big.

Well, I’ll most definitely take my chances on this offer ;-)

——————————————————— SETTING THE SCENE: ———————————————————

My third-party application is (amongst other filtering options like language grouping etc.) also able to filter the list of translations shown on a sentence page (or a search results page) by the skill levels of the users in the language of their contributed translations. It also displays that value along with every translation. So only translations of a certain ‘quality level’ will be displayed. Furthermore, any combination of the five language groups A-E can be switched on/off independently.

In addition, the App has an opaque three-tier evaluation facility that extends that basic ‘Tatoeba-user self-identified value (0-5)’ indicator for every translation with ‘curation levels’ (6-9 and eventually fully curated), where every translation relies on a confirmation of correctness by several natives, and where the App forces each user who wants to contribute to this extended evaluation system to specify their native language in advance. This can only be done once and cannot be changed afterwards (except manually by an admin, server-side).

The correctness-confirmation checkbox only appears along with source sentences in the user’s native language, and only if there aren’t sufficient votes yet for a 100% curation level and the user has permission to evaluate (tier 1/2/3).
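
To make that rule concrete, here is a minimal sketch in Swift; all type and property names are mine, purely for illustration, not the App’s actual code:

```swift
// Hypothetical sketch of the checkbox rule described above.
struct Sentence {
    let language: String
    let curationPercent: Int      // 0...100, the fine-grained indicator
}

struct AppUser {
    let nativeLanguage: String?   // declared once during onboarding, never changed
    let evaluationTier: Int?      // 1, 2 or 3; nil = no permission (e.g. silently blocked)
}

/// The correctness checkbox is shown only for source sentences in the user's
/// native language that are not yet 100% curated, and only if the user is
/// allowed to evaluate at all.
func showsCorrectnessCheckbox(for sentence: Sentence, user: AppUser) -> Bool {
    guard let native = user.nativeLanguage, let tier = user.evaluationTier else {
        return false
    }
    return sentence.language == native
        && sentence.curationPercent < 100
        && (1...3).contains(tier)
}
```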

——————————————————— THE TIER STATUS MECHANISM: ———————————————————

Everybody can vote from the get go!

It takes up to 11 native evaluations per sentence (depending on the opaque trusted status of the evaluators); the minimum is 5 evaluations from users with ‘trusted tier 3’ status.

‘Trusted tier 3’ users (equivalent to Corpus Maintainers) have the additional opportunity to open up a list that contains only sentences that have already passed the lower tiers, in order to focus their valuable effort on bringing already sufficiently evaluated sentences as quickly as possible to a fully 100% curation level. Of course, if any other sentence is evaluated by a ‘trusted tier 3’ user, it will immediately be promoted to a tier 3 curation level but would still need four more ‘trusted tier 3’ votes to reach the ‘fully curated’ status.

To get a sentence to ‘tier 3’ it takes
• either 6 ‘standard tier 1’ votes
• or 3 ‘advanced tier 2’ votes (each simply counts as 2 votes)
• or any combination that sums up to 6 votes
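
A rough sketch of how such a tally could be computed (again, the names are mine and purely illustrative; the App’s real implementation may differ):

```swift
// Hypothetical tally of the voting rules described above. Tier-1 votes count
// as 1 point, tier-2 votes as 2 points; 6 points promote a sentence to tier 3,
// and 5 'trusted tier 3' votes mark it fully curated.
struct VoteTally {
    var tier1Votes = 0   // 'standard tier 1' evaluators
    var tier2Votes = 0   // 'advanced tier 2' evaluators (count double)
    var tier3Votes = 0   // 'trusted tier 3' evaluators (Corpus-Maintainer level)

    /// Weighted points towards the tier-3 promotion threshold.
    var promotionPoints: Int { tier1Votes + 2 * tier2Votes }

    /// A single tier-3 vote also promotes the sentence immediately.
    var hasReachedTier3: Bool { promotionPoints >= 6 || tier3Votes >= 1 }

    /// Fully curated once five trusted tier-3 natives have confirmed correctness.
    var isFullyCurated: Bool { tier3Votes >= 5 }
}

// Worst case: 6 tier-1 votes to reach tier 3, then 5 tier-3 votes = 11 evaluations.
// Best case: 5 tier-3 votes straight away.
let worstCase = VoteTally(tier1Votes: 6, tier3Votes: 5)
let bestCase  = VoteTally(tier3Votes: 5)
print(worstCase.isFullyCurated, bestCase.isFullyCurated)   // true true
```

That matches the numbers above: between 5 and 11 native evaluations per sentence, depending on who casts them.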

——————————————————— USER INTERFACE: ———————————————————

For the source sentence on a sentence page you can see a fine-grained 0-100% evaluation indicator; for every translation you will see its coarse current status (the simple number mentioned above: 0-9 or fully curated).

The tier system is totally opaque to the contributors. A visible checkbox ‘asks’ for their help in the evaluation process; no checkbox simply means ‘Thank you! Everything you could possibly have done to help is done.’

Misuse can lead to being silently blocked from this participation (no more checkboxes will appear ;-)

——————————————————— HOW TATOEBA COULD HELP: ———————————————————

• For the simplest case:

What the App would need, in order to only have to maintain/download the 300-kilobyte user-language-skill file, is one additional field for every translation in the JSON object of the sentence page’s HTTP response HTML, containing the user name of the respective translation’s owner (just like the user-name field for the source sentence). A sketch of such a response follows after the three cases below.

• For the simple case:

In order to completely avoid all that download-file and synchronization business, again only one additional field for every translation and for the source sentence would be required in the JSON object of the sentence page’s HTTP response HTML, containing the owner’s skill level for the language of the respective sentence/translation.

• For the optimal case:

The App could POST all internally collected extended-sentence-evaluation data to the Tatoeba server, so maybe one day you guys can take advantage of those data points too and reflect them in your interface as well.

In that case TRUTH is always kept on the server, and the additional fields in the HTTP response HTML’s JSON object would reflect the owner’s ‘extended’ skill level for the language of the respective source sentence/translation.
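
Just to make the data shapes tangible, here is a minimal Swift sketch of what an augmented sentence-page JSON object could look like from the App’s point of view. All field names (`username`, `skillLevel`, `lang`, ...) are placeholders of mine, not Tatoeba’s actual response format:

```swift
// Hypothetical shape of the sentence-page JSON with the requested additions.
// Everything here is an assumption for illustration only.
struct TranslationStub: Decodable {
    let id: Int
    let text: String
    let lang: String
    let username: String?      // simplest case: the translation owner's user name
    let skillLevel: Int8?      // simple/optimal case: the owner's (extended) skill level
}

struct SentencePageStub: Decodable {
    let id: Int
    let text: String
    let lang: String
    let username: String       // already present today for the source sentence
    let skillLevel: Int8?
    let translations: [TranslationStub]
}

// The ~300 kB user-language-skill file, kept locally and loaded once,
// e.g. ["alice": ["deu": 5, "eng": 3]].
typealias SkillTable = [String: [String: Int8]]

/// Resolve a translation's skill level: prefer the server-provided field,
/// otherwise fall back to the local table keyed by user name and language.
func skillLevel(of translation: TranslationStub, localSkills: SkillTable) -> Int8? {
    if let level = translation.skillLevel { return level }
    guard let name = translation.username else { return nil }
    return localSkills[name]?[translation.lang]
}
```

In the simplest case only `username` would be present and the skill level comes from the locally maintained file; in the simple and optimal cases the server would fill in `skillLevel` directly and the local file (and its synchronization) disappears entirely.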

———————————————————

I am not sure whether JSON key:value pairs are always typed as strings, but in case they allow for other types, a single signed Int8 for the extended-skill-level value would be enough to return all the information - i.e. one additional byte per translation!!!

If a JSON value always has to be a string, well then it would be max. 4 characters: the sign (+/-) plus 3 digits (0-127)...
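
Either encoding would be trivial to consume on the App side. A small sketch, again assuming a made-up field rather than any actual API, that accepts the value both as a JSON number and as a string:

```swift
// Accepts the extended skill level whether the server sends it as a JSON
// number (e.g. 7) or as a short string (e.g. "-3" or "+7").
struct ExtendedSkillLevel: Decodable {
    let value: Int8

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        if let number = try? container.decode(Int8.self) {
            value = number
        } else {
            let text = try container.decode(String.self)
            guard let parsed = Int8(text) else {
                throw DecodingError.dataCorruptedError(
                    in: container,
                    debugDescription: "Not a valid skill level: \(text)")
            }
            value = parsed
        }
    }
}
```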

——————————————————— CONCLUSION: ———————————————————

In my case I wouldn’t have to sync different data sources; all necessary information would come in at the same time and in one single JSON object, without having to hit the server with additional requests to gather all the data needed.

If I don’t want to maintain an internal ‘duplicate’ of the Tatoeba DB, I would have to hit the server with a new request for every single translation of a sentence, just to get the translation owner’s userID, and would then still have to correlate the (potentially dozens of) responses with the users’ language-skill-level file...

——————————————————— SCREENSHOT: ———————————————————

IMG_2908

jiru commented 3 years ago

Thank you for the detailed explanation! :+1:

I have a few questions.

Your app reminds me of the "review" feature of Tatoeba. This feature needs to be enabled in the settings. Are you aware of its existence?

What’s the goal of your curation system? Who uses it for what purposes?

How are the evaluators’ accounts of your app related to tatoeba.org accounts? Does your app use its own user list, separate from tatoeba.org? Do I need an account on tatoeba.org to join your curation effort?

If one of the evaluators finds a mistake in a sentence that could be easily fixed by editing, will they just mark the sentence as incorrect without leaving a comment?

I understand that the evaluators evaluate sentences, but what about translations? For example, if two linked sentences are perfectly correct in their respective language, but have different meaning (so they should be unlinked), what are the evaluators supposed to do?

I am not sure whether JASON key:value pairs

Are you talking about JSON?

mramosch commented 3 years ago

Your app reminds me of the "review" feature of Tatoeba. This feature needs to be enabled in the settings. Are you aware of its existence?

Of course I am ;-)

———————————————————

What’s the goal of your curation system? Who uses it for what purposes?

As you can see in the screenshot, the user can assign any language to one of the groups (A-E) and can then filter the viewport with any combination of groups. Group A is always your confirmed native language, and the other groups could be:

• your working languages (which you know well), for tasks like ‘linking’ or ‘translating’
• the languages you study, which shouldn’t be visible when working on ‘linking’ or ‘translating’
• the languages you are interested in but don’t necessarily want to see all the time
• etc...

The ‘...’ button next to the A-E buttons is for the remaining languages that are not assigned to any group.

There are 2 sets of filter-settings you can switch back and forth between with the SHIFT button (arrow-up) in the middle of the toolbar. A double tap on the SHIFT button brings you to a third advanced set that will always show all translations, but will move the languages from activated groups to the top of the list.

Now you can jump between 3 completely different presets with the help of one SHIFT button only...

This will be even more useful with the multi-linking feature, where you can link more than 2 languages at the same time on one sentence page and all link combinations (pairs) will be created automatically for all participating languages, without having to go to each of the respective translations’ sentence pages and having to remember what you linked, why and where. Here, e.g., activating only your working-languages group will prevent you from accidentally hitting wrong translations...

The same goes for the buttons 1-5, N, Curated etc.! You can filter the list of translations to see, e.g.:

• only the ‘good’ ones from Tatoeba (maybe 4-5) -> for practicing (in case you trust the self-evaluation of the sentence owner)
• only the lower-rated ones -> for boosting their curation level (which essentially is a fine-grained, automated ‘needs-native-check’ tag)
• only the ‘advanced’ curated ones
• only the ‘fully’ curated ones that were acknowledged by at least 5-11 confirmed natives.

Native speakers of minority languages (languages that do not have good visibility) can add translations into a language they know ‘quite’ well, knowing that native speakers of those target languages have an explicit facility to easily filter for and display only ‘sub-optimal’ sentences (maybe 1-3 or even 1-4) and can do a good deed whenever they are in the mood to correct such contributions.

———————————————————

How are the evaluators’ accounts of your app related to tatoeba.org accounts?

Well, that depends on the level of cooperation I will receive from Tatoeba, if any at all.

Everything not related to Tatoeba can be used without having to log in to Tatoeba, although right from the onboarding view users are strongly encouraged to create a Tatoeba account.

• For the simplest case:
• For the simple case:

I will maintain a completely separate user base.

• For the optimal case:

Some data fields would be required on the Tatoeba server (in exchange for providing the collected evaluation data to Tatoeba, as it were ;-)

———————————————————

Does your app use its own user list, separate from tatoeba.org? Do I need an account on tatoeba.org to join your curation effort?

Essentially NO for the evaluation, but you can only participate with at least an in-app account, because you HAVE to specify your native language in advance, and that has to be checked by some authority.

For easier correspondence I am thinking of automatically taking the name of logged-in Tatoeba users as the user name in the app, but that obviously requires the user to be logged in to Tatoeba beforehand.

But that all really depends on whether I have to correlate all the Tatoeba skills data myself or not. From the moment the first in-app native speaker casts a ‘correctness’ evaluation, the Tatoeba skill level is history anyway and is never displayed again. So I could build an initial database from the download files that reflects the state at that time, and then drive the traffic from the in-app users towards curating the newest contributions on Tatoeba, not caring about a handful of sentence owners updating their language-skills profile after I took the snapshot.

Sentences by Tatoeba users whom the ‘trusted crew’ has explicitly verified for their real-life native status will automatically count as in-app trusted curators’ sentences, so they will in any case start off at the first curation level from the get-go. The more information we get about real native users, the more sentences will already start above Tatoeba territory...

———————————————————

If one of the evaluators finds a mistake in a sentence that could be easily fixed by editing, will they just mark the sentence as incorrect without leaving a comment?

Never! You can only confirm the ‘correctness’ of a sentence (with a single checkbox). You can neither downvote nor vote with an arbitrary value.

That is why I was so keen on getting all this cookie stuff from our other conversation straight, because without working CSRF tokens and hashes I would have been forced to, e.g., get rid of the comment box on every sentence page, and my main goal is that the app user feels like being on the Tatoeba site without losing any existing functionality whatsoever. So commenting is strongly encouraged in order to get as quickly as possible to a state where a user feels comfortable checking the correctness box...

———————————————————

I understand that the evaluators evaluate sentences, but what about translations? For example, if two linked sentences are perfectly correct in their respective language, but have different meaning (so they should be unlinked), what are the evaluators supposed to do?

They do whatever they are already doing on the website: they unlink if they see fit, and in addition they vote if they see fit. Nothing really changes! The only difference is that in order to reach full curation status it takes up to 11 correctness confirmations, and only real natives can cast them...

In addition, the ‘base’ translation of the source sentence (the linked translation that the source sentence was originally translated from) is color-coded, provided it was really translated and not just linked afterwards. So the user doesn’t even have to bother tracing back the inheritance hierarchy via the protocol logs in the right pane.

You only see checkboxes next to source sentences, which have the main focus, including a fine-grained percentage indicator next to the checkbox. Translations only show the short value (1-9, unspecified, or fully curated). To evaluate a translation you have to click the respective info button in order to go to its own sentence page, where you can cast your vote.

———————————————————

I am not sure whether JASON key:value pairs

Are you talking about JSON?

Yes, just like a few lines further down ;-)

EDIT: I corrected it 😇

Essentially it’s just a technicality whether only ONE single additional byte (Int8) gets added to the JSON payload or a couple of bytes (for encoding the Int8 into a string). Just to show how insignificant the increase in size would be for the amount of value I could get out of this information...

——————————————————— SCREENSHOT: ———————————————————

IMG_2600

jiru commented 3 years ago

What’s the goal of your curation system? Who uses it for what purposes?

As you can see in the screenshot, [...]

Sorry, but this doesn’t really answer my question. You are describing to me what can be done. I am asking who uses this and why. As far as I understand it, you are selecting your own subset of the corpus based on your own curation system with your own team of evaluators. Why are you doing this? (Note that I am not questioning your project, just trying to understand it.) For example, as an existing contributor of Tatoeba, why would I use your app instead of my browser?