Watts-Lab / commonsense-platform

Commonsense platform
https://commonsense.seas.upenn.edu

Language support for statements #146

Open markwhiting opened 3 weeks ago

markwhiting commented 3 weeks ago

As pointed out in #145, language support involves a few main components:

  1. setting up language support on the experiment website
  2. setting up language support for the statements including back end and analysis
  3. getting translations of content

This issue focuses on 2 and 3 (for the statements).

This is finished when statements can be served in the UI language and the back end tracks both language and meaning appropriately: each statement's language is recorded, and statements can be grouped across languages, e.g., by using a common statement ID and unique language IDs, or something like that. Further, we should initially support every language that we support in #145, and we can extend the automations and tests introduced in that issue.
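
As a rough illustration of the "common statement ID and unique language IDs" idea, here is a minimal TypeScript sketch. The `StatementTranslation` shape, its field names, the helper functions, and the fallback to English are all assumptions made for illustration; they are not the platform's actual schema or API.

```typescript
// Hypothetical row shape: one row per (statement, language) pair.
// Field names are assumptions, not the platform's real schema.
interface StatementTranslation {
  statementId: number; // shared by every translation of the same statement
  language: string;    // language code, e.g. "en" or "bn"
  text: string;        // the statement text in that language
}

// Group rows by the shared statement ID, then by language code.
function groupByStatement(
  rows: StatementTranslation[],
): Map<number, Map<string, string>> {
  const groups = new Map<number, Map<string, string>>();
  for (const row of rows) {
    let byLanguage = groups.get(row.statementId);
    if (!byLanguage) {
      byLanguage = new Map<string, string>();
      groups.set(row.statementId, byLanguage);
    }
    byLanguage.set(row.language, row.text);
  }
  return groups;
}

// Look up a statement in the UI language. Falling back to another
// language when a translation is missing is an assumed policy.
function statementText(
  groups: Map<number, Map<string, string>>,
  statementId: number,
  uiLanguage: string,
  fallbackLanguage: string = "en",
): string | undefined {
  const byLanguage = groups.get(statementId);
  return byLanguage?.get(uiLanguage) ?? byLanguage?.get(fallbackLanguage);
}
```

With this layout, analysis can group responses by `statementId` regardless of language, while `language` still records which version a participant saw.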

markwhiting commented 1 week ago

As mentioned today, it's possible that statement translation should actually happen in the statements repo. But things like surveys and back end language support should happen in this repo.

dankim444 commented 1 day ago

Statements can be rendered in the UI language. I created a new table that includes the first 20 rows of the original statements table and populated it with the corresponding translations in the 10 languages. I then created an endpoint called api/experiment/test and called it in Layout.tsx to test that the statements show up properly. I've included some scripts in the server folder to help with setting up the table on your local machine.

Here's what the UI looks like when the language is set to Bengali:

[image: screenshot of the UI]

dankim444 commented 18 hours ago

duplicate_statements.zip

These are all the duplicate statements I was able to find in the commonsense-statements repo that are preventing me from merging. I realized this is mostly caused by pairs of statements that are semantically very similar in English and differ by only a couple of words, so their translations fail to capture the nuance in meaning in the respective languages.

@amirrr @markwhiting Any suggestions on how I should approach addressing this?
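
One possible way to surface these collisions automatically is to flag cases where different statement IDs share the same translated text within one language. The sketch below is only an illustration: the `TranslationRow` shape, the function name, and the normalization rules (Unicode NFC, trimming, whitespace collapsing, lowercasing) are assumptions, not something that exists in either repo.

```typescript
// Hypothetical row shape; field names are assumptions.
interface TranslationRow {
  statementId: number;
  language: string;
  text: string;
}

interface DuplicateGroup {
  language: string;
  text: string;           // normalized text shared by the statements
  statementIds: number[]; // distinct statement IDs with that text
}

// Report every (language, text) pair that is shared by more than
// one distinct statement ID.
function findDuplicateTranslations(rows: TranslationRow[]): DuplicateGroup[] {
  const seen = new Map<string, DuplicateGroup>();
  for (const row of rows) {
    // Assumed normalization: ignore case and whitespace differences.
    const normalized = row.text
      .normalize("NFC")
      .trim()
      .replace(/\s+/g, " ")
      .toLowerCase();
    const key = row.language + "\u0000" + normalized;
    let group = seen.get(key);
    if (!group) {
      group = { language: row.language, text: normalized, statementIds: [] };
      seen.set(key, group);
    }
    if (group.statementIds.indexOf(row.statementId) === -1) {
      group.statementIds.push(row.statementId);
    }
  }
  const duplicates: DuplicateGroup[] = [];
  seen.forEach((group) => {
    if (group.statementIds.length > 1) {
      duplicates.push(group);
    }
  });
  return duplicates;
}
```

Each reported group points at English statements whose translations may need a manual pass to restore the distinction, or that may be candidates for merging under one statement ID.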