newrelic / docs-website

Source code for @newrelic docs. We welcome pull requests and questions on our docs!
https://docs.newrelic.com

[Machine Translation] Create project_id column in Test Postgres tables #2538

Closed moonlight-komorebi closed 2 years ago

moonlight-komorebi commented 3 years ago

Background

<!> This should all be done in the testing database environment. Once this is working together with the updated scripts, we have a separate ticket to actually create those changes in the production database.

Follow the instructions in the Readme to set up your testing environment. This sets up local docker containers for a Postgres Database and PGAdmin instance to manipulate the DB.

<!> Please use the https://github.com/newrelic/docs-website/tree/feature/machine-translation feature branch

In order to reduce code duplication and effort, we will be adding a new column called project_id to a couple of existing tables. This is so we don't need separate tables for each type of translation, and can re-use the same scripts with a little tweaking.

Edit Tables

Adding project_id to the following tables lets us know whether a page needs to be sent to the Machine or the Human translation project in Smartling, and lets us identify how it was translated when we download it.

We can do that in the PGAdmin console via the Query Tool:

```sql
ALTER TABLE IF EXISTS public.jobs ADD COLUMN project_id TEXT;
ALTER TABLE IF EXISTS public.translations ADD COLUMN project_id TEXT;
```
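As a rough illustration of why a single project_id column is enough to share scripts between translation types, here is a minimal sketch in plain JavaScript. The project ID values, the `slug` field, and the helper names `groupByProject` and `translationType` are all hypothetical; the real IDs come from Smartling and the real scripts live on the feature branch.

```javascript
// Hypothetical project IDs; the real values come from Smartling
// and are not shown in this issue.
const PROJECT_IDS = {
  machine: 'mt-project-id-placeholder',
  human: 'ht-project-id-placeholder',
};

// Group rows (e.g. from the translations table) by their project_id,
// so one script can process both translation types.
const groupByProject = (rows) =>
  rows.reduce((groups, row) => {
    const key = row.project_id;
    return { ...groups, [key]: [...(groups[key] || []), row] };
  }, {});

// Look up the translation type for a project_id (undefined if unknown).
const translationType = (projectId) =>
  Object.keys(PROJECT_IDS).find((type) => PROJECT_IDS[type] === projectId);

// Example rows; `slug` is an illustrative column name only.
const rows = [
  { slug: 'docs/a', project_id: PROJECT_IDS.machine },
  { slug: 'docs/b', project_id: PROJECT_IDS.human },
  { slug: 'docs/c', project_id: PROJECT_IDS.machine },
];

const grouped = groupByProject(rows);

Object.entries(grouped).forEach(([projectId, projectRows]) => {
  console.log(`${translationType(projectId)}: ${projectRows.length} page(s)`);
});
```

Because the IDs are treated as opaque strings here, nothing in the sketch depends on them being numeric.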

Edit Models

We will also need to alter the model files, jobs.js and translations.js, so that they reflect the new table structure:

```js
project_id: {
  // TEXT to match the column added by the ALTER TABLE statements above;
  // the project ID value is not a number.
  type: DataTypes.TEXT,
  allowNull: false,
  references: {
    model: Projects(sequelize, DataTypes),
    key: 'projects',
    deferrable: Deferrable.INITIALLY_IMMEDIATE,
  },
},
```
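For orientation, a model file exposing a factory with the same `(sequelize, DataTypes)` shape as the `Projects(sequelize, DataTypes)` call above might look roughly like the sketch below. This is an assumption about the file layout, not the actual contents of jobs.js: the other columns are omitted because they are not shown in this issue, the foreign-key reference is left out for brevity, and the column is declared as text to match the ALTER TABLE statements.

```javascript
// Hypothetical sketch of a model factory; not the real jobs.js.
// It only shows where the project_id attribute would sit.
const Jobs = (sequelize, DataTypes) =>
  sequelize.define('jobs', {
    // Assumption: stored as text, matching `ADD COLUMN project_id TEXT`.
    project_id: {
      type: DataTypes.TEXT,
    },
    // ...existing columns of the jobs table would go here...
  });
```

The factory takes `sequelize` and `DataTypes` as arguments, so it can be called from wherever the connection is created.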

Edit Test Creation Code

We will also want to alter the query code for the testing docker container in creation_and_cleanup.sql, adding the following column to both the CREATE TABLE translations and CREATE TABLE jobs statements:

- `project_id TEXT`

Testing Existing Scripts

🛑 You can use the MT Project ID for this, but only test with a one-word change, see here

The reason for this is that we have a limit of 2 million words per year specifically for Machine Translation.

We have approximately 1,800 pages at an average of about 850 words per page, which comes to roughly 1.5 million words (1,800 x 850 = 1,530,000). A single full pass over the docs would therefore use most of the yearly allowance.
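The budget arithmetic can be checked with a few lines of JavaScript. This is only a back-of-the-envelope sketch using the approximate figures quoted in this issue; the constant names are made up for illustration.

```javascript
// Approximate figures quoted in this issue.
const PAGES = 1800;
const AVG_WORDS_PER_PAGE = 850;
const YEARLY_MT_WORD_LIMIT = 2000000;

// Words consumed by sending every page through Machine Translation once.
const fullPassWords = PAGES * AVG_WORDS_PER_PAGE; // 1,530,000

// Words left in the yearly allowance after one full pass.
const remainingAfterFullPass = YEARLY_MT_WORD_LIMIT - fullPassWords; // 470,000

// Fraction of the yearly allowance used by one full pass (about 0.765).
const shareUsed = fullPassWords / YEARLY_MT_WORD_LIMIT;

console.log(`Full pass: ${fullPassWords} words`);
console.log(`Remaining: ${remainingAfterFullPass} words`);
console.log(`Share of yearly limit used: ${(shareUsed * 100).toFixed(1)}%`);
```

Multiplying the quoted figures out gives 1,530,000 words, in line with the rough total quoted above, which leaves well under half a million words of headroom for testing and re-translation.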

Acceptance Criteria

note: All Deven team members should be given access to AWS. - @zstix

Useful Links

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be automatically closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 3 years ago

This issue has been automatically closed because it was a stale issue that had no recent activity. Thank you for your contributions.

jpvajda commented 2 years ago

@rudouglas will work on AC...

moonlight-komorebi commented 2 years ago

The project_id column should be a string. I don't know for sure that the vendor uses non-integer IDs, but string is a more flexible / forgiving type just in case.

rudouglas commented 2 years ago

@nr-kkenney is there a STRING type? I thought the main string types were VARCHAR and TEXT which are essentially the same: https://www.postgresqltutorial.com/postgresql-char-varchar-text/

moonlight-komorebi commented 2 years ago

@rudouglas that is what I meant 🙈 didn't know the DB types off the top of my head, but you got to the right place 👍

at a quick glance, it looks like the project_id value is not a number, so some text type is probably what we want.

rudouglas commented 2 years ago

ah cool that makes sense 🍾