1. Summary
Align the Chinese texts with the Tibetan version of the Ratnakuta-sutra.
2. Keyword definitions
Ratnakuta-sutra: Ratnakuta-sutra (short for Mahāratnakūṭa Sūtra) is the original Sanskrit title of the scripture. In this project it also refers to the corpus of its Tibetan and Chinese canonical translations, specifically the [Tibetan version] and Taisho 310 大寶積經.
Text alignment: In this project, “text alignment” refers to matching Chinese text to the corresponding Tibetan text, normally at the sentence level; where necessary, any two corresponding semantic blocks in the two languages may be matched instead.
Translation glossary: An index of specific terminology with approved translations in the target languages, agreed on and used among translators, here those engaged in Tibetan-to-Chinese Buddhist translation. Translation glossaries help translators ensure that each time a defined term appears, in any language, it is used correctly and consistently.
Translation memory: In this project, translation memory refers to high-quality data consisting of well-aligned Tibetan-Chinese Buddhist text segments (at the level of the sentence, paragraph, or sentence-like unit), used to produce translation glossaries that aid new (human) Tibetan-to-Chinese translations.
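To make the "translation memory" definition concrete, here is a minimal sketch of how one aligned segment pair could be stored and exported. The record layout (`TMSegment`, `segment_id`) and the tab-separated export are assumptions for illustration only; the project does not prescribe a file format here, and the sample pair is a generic term pair, not text from Chapter 2.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class TMSegment:
    """One aligned Tibetan-Chinese pair (hypothetical record layout)."""
    segment_id: str  # hypothetical identifier chosen by the team
    tibetan: str     # Tibetan segment (sentence or sentence-like unit)
    chinese: str     # corresponding Chinese segment


def to_tsv(segments):
    """Serialise aligned segments as tab-separated text, one pair per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["segment_id", "tibetan", "chinese"])
    for seg in segments:
        writer.writerow([seg.segment_id, seg.tibetan, seg.chinese])
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative pair only (the common terms for "Buddha").
    demo = [TMSegment("demo-1", "སངས་རྒྱས།", "佛")]
    print(to_tsv(demo))
```

A flat, one-pair-per-row layout like this is easy to review in a pull request and to feed into glossary-extraction tools later, which is why it is used in the sketch.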
3. Detailed approach
Project preparation and training
GitHub tutorials and guidelines for alignment: in both English and Chinese versions (prepared by Stephanie)
Online project setup and folder naming: for repositories, files, and member access.
Defining team members: team (alignment) members, group leaders, project managers.
GitHub training for group leaders.
Inter-text training for the team members: see milestones.
Introduction of the 2nd chapter of Mahāratnakūṭasūtra by Prof. Sherab Chen.
Demonstrate how a good-quality translation memory can benefit the actual translation process in the future.
Textual alignment (followed by bi-weekly project meetings, evaluation, and technical Q&A sessions)
In progress: team members align the texts and get help from the group leader when needed.
The group leader prepares and uploads the 1st draft to GitHub.
Review by Prof. Chen.
Revise the alignment as a group.
Final approval by Prof. Chen.
Upload the final alignment to the TM repository.
4. Constraints
Group members all have limited time to work on the project.
5. Other options
Refer to RFJ 7.
6. Risks and risk reduction
What could go wrong with your plan?
Answer: Unable to retain the group members.
What can you do to reduce the risks you described?
Answer: Ensure we have healthy communication and steady progress.
7. Open questions
Who will be available to provide technical support for GitHub and Inter-text if technical questions arise during training and implementation? Answer: GitHub: Ives; Inter-text: Grace and Sherab Chen
How do we evaluate and quantify the contributions of the translation projects? Are there any evaluation parameters, such as OKRs or KPIs?
Answer: This is a pilot project in which we are still experimenting with different methodologies, so we should stay adaptive.
Answer: Project Goal: Complete the 2nd chapter by the end of January 2023.
Answer: Training: Send a self-assessment form to all group members to see whether they feel their reading ability has improved through this work.
How much can the data from this alignment project improve AI performance?
Answer: Test the performance of the self-alignment model, and test Sketch Engine glossary extraction performance.
Are there existing successful examples of translation glossary production in other languages that we can use as a reference?
Answer: For modern languages, we can refer to deepl.com. For Buddhist translation, we can refer to the 84000 translator resource platform.
8. Resources
Refer to item 7 (Open questions).
9. Definition of acceptable end product
The project quality standard may need to depend on project evaluation parameters, which should be set before the project starts, considered thoroughly, and built into the project's day-to-day progress.
We may start defining the project evaluation parameters (OKRs or KPIs) by considering the following aspects (for detailed discussion, refer to item 7 above):
Refer to Item 7 Q2.
10. Milestones
Each chapter's translation needs to go through 4 stages:
Project Preparation 項目準備
In Progress-Draft-對讀-初稿
Review-對讀-檢查
Final-終審完成
We will start with Chapter 2 (第二會 - 無邊莊嚴會 - T46 བམ་པོ་དང་པོ། ), which will be divided into 4 parts, each led by different translators. Each of the 49 chapters will go through the above 4 stages.
Below are the draft project milestones for Chapter 2.
26th to 30th September, 2022 (Setting up GitHub)
[x] The-Kumarajiva-Project/BO-to-ZH-TM#12
[x] The-Kumarajiva-Project/BO-to-ZH-TM#17
[x] The-Kumarajiva-Project/BO-to-ZH-TM#13
[x] The-Kumarajiva-Project/BO-to-ZH-TM#16
[x] The-Kumarajiva-Project/BO-to-ZH-TM#15
3rd to 7th October, 2022 (Setting up GitHub)
[ ] The-Kumarajiva-Project/BO-to-ZH-TM#18
[ ] #19
10th to 14th October, 2022 (Workshop preparations)
[ ] #20
[ ] #21
17th to 28th October, 2022 (Training preparations)
[ ] #22
[ ] Include review of comments by the team leader.
[ ] Assign the training schedule and trainers.
[ ] Create & finalise a GitHub repository and readme file.
[ ] Create & finalise a GitHub project for jobs, work stages, and plans.
[ ] Prepare for the 1st Nov, 2022 kick-off meeting.
31st Oct to 4th November, 2022 (Kick-off meeting, preparation workshop, trainings and Q&A)
[ ] Conduct Inter-text training for all the group members on 1st Nov, 2022
[ ] Teaching on Mahāratnakūṭasūtra chapter 2 before the actual text alignment starts (1st week of November).
[ ] Evaluation of training and technical Q&A.
[ ] Preparation of Inter-text training for team members on 8th Nov, 2022
1st Nov, kick-off meeting, including the following
[ ] Project Introduction and Goals
[ ] Introduce the content of the 2nd chapter and the next text
[ ] Introduce the team structure, working hours and workflow
[ ] Introduce Inter-text training
7th Nov to 18th Nov, 2022 (Inter-text trainings & practice week)
[ ] Inter-text training for all team members on 8th Nov, 2022.
[ ] Practice of GitHub and Inter-text, including a Q&A workshop. Covers creating issues, assigning issues, assigning milestones, creating pull requests, and merging pull requests.
21st Nov to 25th Nov
[ ] Prepare GitHub training for December, just before the group leaders plan to upload the reviewed files.
[ ] Test the AI model / Sketch Engine / CAT tool.
[ ] Expect to finish Chapter 2 by the end of the semester, January 2023.
Regular Project Stages and Tasks:
Stage 1. Project Preparation 項目準備
[ ] Download and install the Inter-text program.
[ ] Team members download the txt files (Tibetan & Chinese).
[ ] Project lead divides the Tibetan text into sentences or short paragraphs.
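As a rough illustration of the last preparation step, the sketch below pre-splits Tibetan text at the shad mark (།, U+0F0D) so that the project lead starts from candidate segments rather than one long block. This is an assumption about how the division could be assisted, not the project's prescribed method: a shad also closes clauses and list items, so the output still needs manual merging or splitting. The function name `split_on_shad` is hypothetical.

```python
import re

SHAD = "\u0f0d"  # TIBETAN MARK SHAD

# A candidate segment is a run of non-shad characters followed by one or
# more shads (optionally separated by spaces), or a trailing run of text
# that has no closing shad.
_SEGMENT = re.compile(rf"[^{SHAD}]+(?:{SHAD}(?:\s*{SHAD})*|$)")


def split_on_shad(text):
    """Return candidate segments, each keeping its closing shad(s)."""
    segments = []
    for match in _SEGMENT.finditer(text):
        segment = match.group().strip()
        if segment:  # skip whitespace-only leftovers
            segments.append(segment)
    return segments


if __name__ == "__main__":
    # Illustrative sample (two short homage phrases), not text from Chapter 2.
    sample = "སངས་རྒྱས་ལ་ཕྱག་འཚལ་ལོ། །ཆོས་ལ་ཕྱག་འཚལ་ལོ། །"
    for line in split_on_shad(sample):
        print(line)
```

Printing one candidate segment per line gives a plain-text file that can be checked by eye before alignment begins.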
Stage 2. In Progress-Draft-對讀-初稿
[ ] Align the Chinese text into the same paragraphs as the Tibetan text in the Inter-text program.
[ ] Followed by bi-weekly project meetings, evaluation, and technical Q&A sessions.
[ ] Discuss and coordinate the draft alignment work with the team leader.
[ ] Consolidate the work into one draft alignment and send it to the team leader.
Stage 3. Review-對讀-檢查
[ ] Review by group leader.
[ ] Comment on the draft alignment work and send it back to the team member.
[ ] Group leader reviews and finalises the alignment work and sends it to the Consultants (Karma Palden, Sherab Chen)
Stage 4. Final-終審完成
[ ] Final confirmation & review by the Consultants, then upload to GitHub - Project Final.
RFC title: RFC001 Ratnakuta-sutra Chp-2 Tib-Chi text alignment
RFJ link: https://github.com/The-Kumarajiva-Project/Admin/issues/2
Job manager: Stephanie
Collaborator(s):

| No. | Team Member (Alignment) | Group Leaders | Consultants |
| --- | --- | --- | --- |
| 1 | 郭文宇、楊希 | Grace | Sherab Chen 陳老師 |
| 2 | 陳詩彤、Stephanie | Grace | Sherab Chen 陳老師 |
| 3 | 史靜、小莉 | 可書 | Sherab Chen 陳老師 |
| 4 | 怡靜、Heidi | 可書 | Sherab Chen 陳老師 |