Closed karrui closed 1 year ago
Placeholder issue for all things i18n
After thinking through this a bit, I think we should add special markdown tags that render to CSS class names or lang attributes as part of FormSG's support of markdown.
Specifically, I am thinking of something along the lines of one of these options outlined in the answers to this StackOverflow question about adding attributes/classes to Markdown.
After Markdown renders the content with the language attribute tags, there can be a separate pass to figure out if any language tags are added, and then show specific buttons at the top of the form to toggle between the languages within the field.
If any content is missing a language, we can also display a warning in preview mode, and show all the remaining languages when the content is in production.
This is probably easier than changing any schema to accommodate multiple languages, and probably has the advantage of being able to be implemented purely on the client-side.
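A minimal sketch of the fallback rule described above, assuming the language-tagged content of one field has already been collected into a Map from language code to text. The helper name `selectBlocks` and the shape of its return value are hypothetical, not anything that exists in FormSG.

```javascript
// Hypothetical helper: decide which language blocks to show for one field.
// `blocks` is a Map from language code to that language's content.
function selectBlocks(blocks, selectedLang, isPreview) {
  if (blocks.has(selectedLang)) {
    return { shown: [blocks.get(selectedLang)], warning: null };
  }
  // The selected language is missing: show every language that is present,
  // and only surface a warning to the form author in preview mode.
  return {
    shown: [...blocks.values()],
    warning: isPreview ? `Missing translation for "${selectedLang}"` : null,
  };
}
```

All of this runs on already-rendered content, so it fits the client-side-only approach.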
CC @karrui
We currently use react-markdown with all its plugins to render markdown, so it is probably possible to do what you are suggesting!
Example of our Banner component
EDIT: That requires the raw HTML variant of markdown, which I am slightly wary of doing; hmm...
EDIT2: Nevertheless, we can probably use the rehype-slug plugin to inject some id (perhaps <fieldId>-<lang>) into the markdown that we can use to filter the language!
EDIT3: or even rehype-attr, remark-directive
We can also repurpose an existing block that we don't think people will use (or will only use with special language identifiers) like so: https://codesandbox.io/s/react-markdown-playground-forked-lvn18?file=/src/App.js
We can talk through this a bit more tomorrow too.
The other stuff is good too, but I wonder if it might be a bit syntax heavy for users haha
After syncing, I will take a look at making the codesandbox better (instead of relying on janky hax) -- i.e. maybe write a parser extension, or even generate the syntax with rehype-attr or remark-directive if we are generating Markdown instead of asking users to write it themselves.
A few other things to think about:
1) User interface for entering multiple languages
2) What if a form creator misses a language out in certain fields or instructions, etc...
3) How do we detect if this feature is enabled on a form
Thinking about this some more, I suspect language detection is probably the better idea still. This would also not clutter up any interfaces.
We can just require line breaks between languages and optionally show a language filter on forms that have enabled it. If we can't detect a language reliably (e.g. the string is too short), it will show up under all languages just in case.
Here is a prototype: https://codesandbox.io/s/react-markdown-playground-forked-nc1cv?file=/src/App.jsx
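Separately from the linked prototype, here is a rough sketch of the idea using Unicode script matching as a stand-in for real language detection. This is an assumption made for illustration: it can only tell Han text from Latin text, so it would not separate, say, English from Malay. The names `detectLang`, `filterByLang` and the `MIN_LETTERS` threshold are hypothetical.

```javascript
// Assumed threshold: below this many letters we do not trust the guess.
const MIN_LETTERS = 3;

// Guess "zh" or "en" by counting Han vs Latin characters.
// Returns null when the line is too short to classify.
function detectLang(line) {
  const han = (line.match(/\p{Script=Han}/gu) || []).length;
  const latin = (line.match(/\p{Script=Latin}/gu) || []).length;
  if (han + latin < MIN_LETTERS) return null;
  return han >= latin ? "zh" : "en";
}

// Keep the lines in the selected language, plus any line whose language
// could not be detected (those show up under all languages).
function filterByLang(text, selectedLang) {
  return text
    .split(/\n+/) // languages are separated by line breaks
    .filter((line) => {
      const lang = detectLang(line);
      return lang === null || lang === selectedLang;
    })
    .join("\n");
}
```

For example, `filterByLang("Singapore Citizens\n新加坡公民\nOK", "zh")` keeps the Chinese line and the short `OK` line, since `OK` is too short to classify.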
We can always add a global public form context that allows us to change the i18n selected and propagate it through the form, similar to how many websites in the EU operate
Throwing out another idea for the backend here. What if we store each text (e.g. each option) using something like XML?
For instance, for each option in a checkbox, we will store something like <multilanguage><en>Singapore Citizens</en><zh>新加坡公民</zh></multilanguage>, <multilanguage><en>Permanent Residents</en><zh>永久居民</zh></multilanguage>, etc...
In the frontend when we display these, we can check whether the <multilanguage> tag exists by running the string through a DOMParser, and if it does, we will extract the strings for each language.
const parser = new DOMParser()
const doc = parser.parseFromString("<multilanguage><en>Singapore Citizens</en><zh>新加坡公民</zh></multilanguage>", "text/xml")
if (!doc.querySelector("multilanguage")) {
  // string does not have multi-language support
}
// Optional chaining avoids a TypeError when a language element is absent
const en = doc.querySelector("en")?.textContent
const zh = doc.querySelector("zh")?.textContent
Similarly, we can also consistently construct the content with the DOM APIs:
const mle = document.createElement("multilanguage")
const en = document.createElement("en")
en.textContent = "Permanent Residents"
const zh = document.createElement("zh")
zh.textContent = "永久居民"
mle.append(en, zh)
const s = new XMLSerializer()
const serializedOption = s.serializeToString(mle)
This also interacts well with any markdown processing that might happen, since this would be done before any markdown processing happens.
We can also easily check whether all the languages for the form are present on each field, and alert the form author in the interface itself if they are not.
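As a rough sketch of that per-field check, assuming the language codes have already been extracted from each field's <multilanguage> element. The `{ id, langs }` field shape and the name `findMissingLanguages` are hypothetical and only for illustration.

```javascript
// Hypothetical: `fields` is a list of { id, langs }, where `langs` holds the
// language codes found in that field's <multilanguage> element.
// Returns one entry per field that lacks at least one required language.
function findMissingLanguages(fields, requiredLangs) {
  const report = [];
  for (const { id, langs } of fields) {
    const present = new Set(langs);
    const missing = requiredLangs.filter((lang) => !present.has(lang));
    if (missing.length > 0) report.push({ id, missing });
  }
  return report;
}
```

In the browser, `langs` could probably be read from the parsed document with something like `Array.from(root.children, (el) => el.tagName)`, where `root` is the <multilanguage> element.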
So I've played around with remark-directive and here is some syntax I've come up with:
:::language{lang=en}
Permanent **Residents**
:::
:::language{lang=zh}
永久**居民**
:::
This would make it marginally harder to parse out the language tags to display them properly in an editor, but with a bit of regex it won't be too bad.
Here is a demo with a sample remark-directive: https://codesandbox.io/s/custom-markdown-rendering-forked-hqxvti
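To give a feel for the "bit of regex" an editor would need, here is a minimal sketch that pulls the per-language bodies out of the syntax above. It is independent of the linked demo and of remark-directive itself, and it assumes unquoted `lang=xx` attributes, non-empty bodies, and no nested directives. The names `LANGUAGE_BLOCK` and `splitByLanguage` are hypothetical.

```javascript
// Matches :::language{lang=xx} ... ::: container blocks (no nesting support).
// Capture 1 is the language code, capture 2 is the raw markdown body.
const LANGUAGE_BLOCK = /^:::language\{lang=([A-Za-z-]+)\}\n([\s\S]*?)\n:::$/gm;

// Returns a Map from language code to the markdown inside that block.
function splitByLanguage(markdown) {
  const result = new Map();
  for (const [, lang, body] of markdown.matchAll(LANGUAGE_BLOCK)) {
    result.set(lang, body);
  }
  return result;
}
```

The bodies come back as plain markdown strings, so an editor could show one text box per language and re-serialize them into the same directive syntax when saving.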
@karrui Let me know what you think :)
This seems perfect, and it allows us to continue to store text as a string (safely-ish) in the database. Do you think this is a better alternative to XML?
From your demo, we can also explore rendering our own components whilst looking at the node with our useMdComponents hook
@frankchn
Ah the useMdComponents thing might be cool. Can you point me to where the user's markdown view of the Forms is rendered right now? (i.e. where do you call the markdown renderer :p).
Right here for form labels, which is where we expect the usage.
https://markdoc.io/ @frankchn something cool from stripe!
Migrated to Linear FRM-1402
Should not be blocked by our components if built correctly, since i18n frameworks provide a text string which is then passed into our components