newrelic / docs-website

Source code for @newrelic docs. We welcome pull requests and questions on our docs!
https://docs.newrelic.com

[Machine Translation] Modify script to add files to be translated to the queue #2536

Closed moonlight-komorebi closed 2 years ago

moonlight-komorebi commented 3 years ago

Summary

We will need to update the scripts to enable Machine Translation (MT). The way we interact with the Smartling API stays the same as it is now; we just need some tweaks to the existing scripts along with some additional functionality.

<!> We should test changes to the scripts using the Test RDS Environment updated here #1538

Project Logic

In add-files-to-translation-queue.js we will need to pick up both project IDs as environment variables (these will eventually be read from the updated workflow).

The relevant project_id should be added to the new column in the translations table. In the workflow, both MACHINE_TRANSLATION_PROJECT_ID and HUMAN_TRANSLATION_PROJECT_ID will be set. The logic below determines which ID should be entered into the table:

>a value or set of values for the translate frontmatter field indicates which languages are sent for human translation. the difference between the set of all languages we support, and the set in the translate frontmatter, is what we request for machine translation.

For example, in the mdx frontmatter of a file, if we have...

    translate:
      - jp

...and the defined set of languages we support is [jp, kr], then:
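Following the quoted rule, jp would go to the human translation project and kr to the machine translation project. A minimal sketch of that selection is shown here. The function name `getProjectAssignments` and the shape of the returned rows are hypothetical and not taken from the existing script; only the two environment variable names come from this issue.

```javascript
// Hypothetical helper: decides which Smartling project each locale belongs to.
// `translate` is the value of the `translate` frontmatter field (a single
// value or a list); `supportedLocales` is the full set of languages we support.
const getProjectAssignments = (translate, supportedLocales, env = process.env) => {
  // Normalize a single value, a list, or a missing field into a Set.
  const humanLocales = new Set([].concat(translate || []));

  return supportedLocales.map((locale) => ({
    locale,
    // Locales listed in the frontmatter go to human translation;
    // every other supported locale goes to machine translation.
    project_id: humanLocales.has(locale)
      ? env.HUMAN_TRANSLATION_PROJECT_ID
      : env.MACHINE_TRANSLATION_PROJECT_ID,
  }));
};
```

Each returned row carries the project_id that would be written to the new column; how the rows are actually stored is left to the existing database code.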

🛑 Testing Scripts

You can use the MT Project ID for this, but only test with a one-word change, [see here](https://github.com/newrelic/docs-website/blob/develop/scripts/actions/translation_workflow/testing/README.md#make-a-change-to-translate). The reason is that we have a `2 million` word limit per year specifically for Machine Translation:

>The average is about 850 words per document, and the total is about 1.6 million. For MT, a total of 2 million words can be translated over a year. We have approximately 1800 pages x 850 (avg word count per page) = 1.6 million.

Acceptance Criteria

Useful links

Relates to: #2341

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be automatically closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 3 years ago

This issue has been automatically closed because it was a stale issue that had no recent activity. Thank you for your contributions.

jmiraNR commented 3 years ago

Hi. Regarding this point: "making sure protected text is ignored from translation. implementing placeholder directives if our current process doesn't work." Clarification: the text to be protected sits within a sentence, for example "New Relic", or "Add more data" when referring to a button: Please click Add more data. On these pages:

there is some information about the directives. I tested PLACEHOLDER_FORMAT_CUSTOM on an HTML file to protect some random text and it worked. I was able to put some of our keywords between {} and they were not translated. I'd appreciate discussing this topic in more detail when we start to work on it, as I want to make sure we can automate/simplify this process as much as possible. Thx
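One way the brace-wrapping step could be automated is sketched below. It only wraps known terms in {} before a file is sent; it assumes a custom placeholder format that treats {...} as protected is configured separately on the Smartling side, as in the test described above. The term list and the function name `wrapProtectedTerms` are hypothetical.

```javascript
// Hypothetical list of protected terms; the real list is not given in this thread.
const PROTECTED_TERMS = ['New Relic', 'Add more data'];

// Escape characters that have special meaning in a regular expression.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

const wrapProtectedTerms = (text, terms = PROTECTED_TERMS) => {
  // Try longer terms first so an overlapping shorter term does not win.
  const pattern = [...terms]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join('|');
  if (!pattern) return text;

  // The lookbehind and lookahead skip terms that are already wrapped in braces.
  const regex = new RegExp(`(?<!\\{)(${pattern})(?!\\})`, 'g');
  return text.replace(regex, '{$1}');
};
```

For example, `wrapProtectedTerms('Please click Add more data.')` yields `Please click {Add more data}.`, and running it a second time leaves the text unchanged.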

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be automatically closed if no further activity occurs. Thank you for your contributions.

jpvajda commented 2 years ago

We want to spend more time on specifying the approach and consider breaking this up into separate tickets.

jpvajda commented 2 years ago

We discussed breaking this ticket up differently...

jpvajda commented 2 years ago

@rudouglas

moonlight-komorebi commented 2 years ago

I think we should break this up, just for clarity and to organize the complexity and the different pieces. We estimated this at 5-8. I don't think splitting the ticket changes the overall effort; it is just to have a better mental map of what needs to get done and how things relate.

This is just my opinion, so don't feel obligated:

ticket 1 - add translations workflow

ticket 2 - send translations workflow