opengovsg / FormSG

Form builder for the Singapore Government
https://form.gov.sg

[React] Add i18n support for form fields #1906

Closed karrui closed 1 year ago

karrui commented 3 years ago

This should not be blocked by our components if they are built correctly, since i18n frameworks provide a text string that is then passed into our components.

karrui commented 3 years ago

Placeholder issue for all things i18n

frankchn commented 2 years ago

After thinking through this a bit, I think we should add special markdown tags that render to CSS class names or lang attributes, as part of FormSG's markdown support.

Specifically, I am thinking of something along the lines of one of these options outlined in the answers to this StackOverflow question about adding attributes/classes to Markdown.

After Markdown renders the content with the language attribute tags, there can be a separate pass to figure out if any language tags are added, and then show specific buttons at the top of the form to toggle between the languages within the field.

If any content is missing a language, we can display a warning in preview mode, and show all the remaining languages when the content is in production.
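
A minimal sketch of that fallback rule, assuming the per-language blocks have already been extracted into a plain object. The function name `pickBlocksToShow`, the `mode` values ("preview" and "production"), and the return shape are all hypothetical and not taken from FormSG.

```javascript
// Hypothetical helper: decide which language blocks to render for one field.
// `blocksByLang` maps a language code to its text, e.g. { en: "...", zh: "..." }.
function pickBlocksToShow(blocksByLang, selectedLang, mode) {
  if (Object.prototype.hasOwnProperty.call(blocksByLang, selectedLang)) {
    return { blocks: [blocksByLang[selectedLang]], warning: null }
  }
  // The selected language is missing: fall back to every language we do have.
  const blocks = Object.values(blocksByLang)
  // Only form authors (preview mode) get a warning; respondents just see the fallback.
  const warning =
    mode === "preview" ? `Missing content for language "${selectedLang}"` : null
  return { blocks, warning }
}
```

Returning the warning as data rather than rendering it directly would leave it up to the caller to decide how the preview UI surfaces it.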

This is probably easier than changing any schema to accommodate multiple languages, and probably has the advantage of being able to be implemented purely on the client-side.

CC @karrui

karrui commented 2 years ago

We currently use react-markdown with all its plugins to render markdown, so it is probably possible to do what you are suggesting!

Example of our Banner component

EDIT: That requires the raw HTML variant of markdown, which I am slightly wary of doing; hmm...

EDIT2: Nevertheless, we probably can use rehype-slug plugin to inject some id (perhaps <fieldId>-<lang>) into the markdown that we can use to filter the language!

EDIT3: or even rehype-attr, remark-directive

frankchn commented 2 years ago

We can also repurpose an existing block that we don't think people will use (or only when it carries special language identifiers), like so: https://codesandbox.io/s/react-markdown-playground-forked-lvn18?file=/src/App.js

We can talk through this a bit more tomorrow too.

frankchn commented 2 years ago

The other stuff is good too, but I wonder if it might be a bit syntax heavy for users haha

frankchn commented 2 years ago

After syncing, I will take a look at making the codesandbox better (instead of relying on janky hax) -- i.e. maybe write a parser extension, or even generate the syntax with rehype-attr or remark-directive if we are generating Markdown instead of asking users to write it themselves.

A few other things to think about:

1) User interface for entering multiple languages
2) What if a form creator misses a language out in certain fields or instructions, etc...
3) How do we detect if this feature is enabled on a form

frankchn commented 2 years ago

Thinking about this some more, I suspect language detection is probably the better idea still. This would also not clutter up any interfaces.

We can just require line breaks between languages and optionally show a language filter on forms that have enabled it. If we can't detect a language reliably (e.g. the string is too short), the text will show up under all languages just in case.
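
A rough sketch of that filtering rule, not the linked prototype. It assumes paragraphs are separated by blank lines and only tells "en" from "zh" by checking for Han or Latin characters, which is far cruder than a real language detector. The names `detectLanguage`, `filterByLanguage`, and the `MIN_DETECTABLE_LENGTH` threshold are hypothetical.

```javascript
// Assumption: paragraphs shorter than this cannot be classified reliably.
const MIN_DETECTABLE_LENGTH = 4

// Very rough detector: any Han character means "zh", otherwise Latin letters mean "en".
// Returns null when the text is too short or contains neither script.
function detectLanguage(text) {
  const trimmed = text.trim()
  if (trimmed.length < MIN_DETECTABLE_LENGTH) return null
  if (/\p{Script=Han}/u.test(trimmed)) return "zh"
  if (/\p{Script=Latin}/u.test(trimmed)) return "en"
  return null
}

// Split on blank lines and keep paragraphs that match the selected language,
// plus any paragraph whose language could not be detected.
function filterByLanguage(content, selectedLang) {
  return content
    .split(/\n\s*\n/)
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => paragraph.length > 0)
    .filter((paragraph) => {
      const lang = detectLanguage(paragraph)
      return lang === null || lang === selectedLang
    })
}
```

With this rule, a short paragraph such as "OK" is never hidden, which matches the "show up on all languages just in case" behaviour described above.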

Here is a prototype: https://codesandbox.io/s/react-markdown-playground-forked-nc1cv?file=/src/App.jsx

karrui commented 2 years ago

We can always add a global public form context that allows us to change the i18n selected and propagate it through the form, similar to how many websites in the EU operate. (Screenshot, 2022-02-14)

frankchn commented 2 years ago

Throwing out another idea for the backend here. What if we store each piece of text (e.g. each option) using something like XML?

For instance, for each option in a checkbox, we will store something like <multilanguage><en>Singapore Citizens</en><zh>新加坡公民</zh></multilanguage>, <multilanguage><en>Permanent Residents</en><zh>永久居民</zh></multilanguage>, etc...

In the frontend when we display these, we can check to see whether the <multilanguage> tag exists by running it through a DOMParser, and if it does, we will extract the strings for each language.

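// Parse a stored option string (browser-only: DOMParser is a DOM API)
const parser = new DOMParser()
const doc = parser.parseFromString("<multilanguage><en>Singapore Citizens</en><zh>新加坡公民</zh></multilanguage>", "text/xml")

if (!doc.querySelector("multilanguage")) {
  // string does not have multi-language support; render it as-is
}

// Optional chaining leaves these undefined instead of throwing when an element is missing
const en = doc.querySelector("en")?.textContent
const zh = doc.querySelector("zh")?.textContent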

Similarly, we can also consistently construct the content with the DOM APIs

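// Build the elements in an XML document so the serialized string
// does not pick up an XHTML xmlns attribute from the host page
const xmlDoc = document.implementation.createDocument(null, "multilanguage")
const mle = xmlDoc.documentElement
const en = xmlDoc.createElement("en")
en.textContent = "Permanent Residents"
const zh = xmlDoc.createElement("zh")
zh.textContent = "永久居民"
mle.append(en, zh)

const s = new XMLSerializer()
const serializedOption = s.serializeToString(mle)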

This also interacts well with any markdown processing that might happen, since this would be done before any markdown processing happens.

We can also easily check whether all the languages for the form are present on each field, and alert the form author in the interface itself if any are missing.
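
A sketch of what that per-field check could look like. To keep it runnable outside a browser it swaps the DOMParser step for a simple regex, which assumes flat language elements with no attributes or nesting and does not decode XML entities. The field shape `{ id, title }` and both function names are hypothetical.

```javascript
// Regex stand-in for the DOMParser step so the sketch runs outside a browser.
// Assumption: language elements are flat (no nesting, no attributes, no entity decoding).
function extractLanguages(stored) {
  const wrapper = /^<multilanguage>([\s\S]*)<\/multilanguage>$/.exec(stored.trim())
  if (!wrapper) return null // plain string without multi-language support
  const byLang = {}
  for (const [, lang, text] of wrapper[1].matchAll(/<([a-z-]+)>([\s\S]*?)<\/\1>/g)) {
    byLang[lang] = text
  }
  return byLang
}

// Hypothetical field shape: { id: string, title: string }.
// Returns one entry per multi-language field that lacks a required language.
function findMissingLanguages(fields, requiredLangs) {
  const problems = []
  for (const field of fields) {
    const byLang = extractLanguages(field.title)
    if (byLang === null) continue // single-language field, nothing to check
    const missing = requiredLangs.filter(
      (lang) => !byLang[lang] || byLang[lang].trim() === ""
    )
    if (missing.length > 0) problems.push({ fieldId: field.id, missing })
  }
  return problems
}
```

Whether a plain single-language string should count as "missing every other language" is a product decision; this sketch simply skips such fields.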

frankchn commented 2 years ago

So I've played around with remark-directive and here is some syntax I've come up with:

:::language{lang=en}
Permanent **Residents**
:::

:::language{lang=zh}
永久**居民**
:::

This would make it marginally harder to parse out the language tags to display them properly in an editor, but with a bit of regex it won't be too bad.
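
For illustration, a regex-based sketch of pulling the language blocks back out for an editor, and of writing them again. It assumes blocks are not nested, `lang` is the only attribute, and line endings are `\n`. The names `LANGUAGE_BLOCK`, `parseLanguageBlocks`, and `serializeLanguageBlocks` are hypothetical; this does not use remark-directive itself.

```javascript
// Matches one container directive of the form:
//   :::language{lang=xx}
//   ...content...
//   :::
// Assumptions: no nested blocks, `lang` is the only attribute, "\n" line endings.
const LANGUAGE_BLOCK = /^:::language\{lang=([A-Za-z-]+)\}\n([\s\S]*?)\n:::$/gm

// Returns an object mapping each language code to its raw markdown body.
function parseLanguageBlocks(markdown) {
  const byLang = {}
  for (const [, lang, body] of markdown.matchAll(LANGUAGE_BLOCK)) {
    byLang[lang] = body
  }
  return byLang
}

// Inverse operation, for an editor that writes the syntax on the author's behalf.
function serializeLanguageBlocks(byLang) {
  return Object.entries(byLang)
    .map(([lang, body]) => `:::language{lang=${lang}}\n${body}\n:::`)
    .join("\n\n")
}
```

If the editor always generates the syntax through something like `serializeLanguageBlocks`, form authors would never need to type the directive markers themselves.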

Here is a demo with a sample remark-directive: https://codesandbox.io/s/custom-markdown-rendering-forked-hqxvti

@karrui Let me know what you think :)

karrui commented 2 years ago

This seems perfect, and it allows us to continue to store text as a string (safely-ish) in the database. Do you think this is a better alternative to XML?

From your demo, we can also explore rendering our own components whilst looking at the node with our useMdComponents hook

@frankchn

frankchn commented 2 years ago

Ah, the useMdComponents thing might be cool. Can you point me to where the user's markdown view of the forms is rendered right now? (i.e. where do you call the markdown renderer :p).

karrui commented 2 years ago

Right here for form labels, which is where we expect the usage.

karrui commented 2 years ago

https://markdoc.io/ @frankchn something cool from stripe!

wanlingt commented 1 year ago

Migrated to Linear FRM-1402