OpenPecha / Requests

RFWs and RFCs for all OpenPecha repositories

[RFC0147] BO-EN Aligner refactor #428

Open tenzin3 opened 5 months ago


RFC0147: BO-EN Aligner refactor

Named Concepts

aligner: the aligner referred to here is a pipeline that aligns Tibetan sentences with their equivalent English sentences

Summary

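We are modifying the existing aligner pipeline to fix several issues. These include the lack of a robust logging system, failures caused by GitHub API rate limits, an input format that is hard to read, and the inability to run multiple aligners simultaneously.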

Dependencies

Infrastructures

Design Illustrations

Diagram Server Aligner logs

Justification

The current aligner pipeline has four key issues. It lacks a robust logging system, which makes failures difficult to diagnose. It relies on the GitHub API, so runs frequently fail when rate limits are hit. Its input format needs refinement for better readability. It cannot run multiple aligners simultaneously, which slows down the alignment process. Addressing these issues is essential for improving the pipeline's efficiency and reliability.
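To make the logging and parallelism goals concrete, here is a minimal sketch of how several alignment jobs could run concurrently while each success or failure is logged. It is not the existing pipeline: `align_pair` is a hypothetical stand-in that pairs lines by position, and `run_aligners`, the logger name, and the `max_workers` default are assumptions chosen for illustration.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("aligner")  # hypothetical logger name


def align_pair(bo_text, en_text):
    """Hypothetical stand-in for the real aligner: pairs non-empty lines by position."""
    bo_lines = [line for line in bo_text.splitlines() if line.strip()]
    en_lines = [line for line in en_text.splitlines() if line.strip()]
    return list(zip(bo_lines, en_lines))


def run_aligners(pairs, max_workers=4):
    """Align several BO-EN pairs concurrently and log the outcome of each job.

    `pairs` maps a pair id to a (bo_text, en_text) tuple.
    Returns (results, failures), both keyed by pair id.
    """
    results = {}
    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(align_pair, bo, en): pair_id
            for pair_id, (bo, en) in pairs.items()
        }
        for future in as_completed(futures):
            pair_id = futures[future]
            try:
                results[pair_id] = future.result()
                logger.info("aligned %s: %d segments", pair_id, len(results[pair_id]))
            except Exception as exc:
                # One failing pair is logged with its traceback and does not stop the others.
                logger.exception("alignment failed for %s", pair_id)
                failures[pair_id] = repr(exc)
    return results, failures
```

Recording failures per pair, rather than letting the first exception abort the run, keeps one bad input from blocking the rest of the batch and leaves a log entry that points at the pair that failed.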

Testing

Test the enhanced aligner pipeline with 10 pairs of Tibetan (BO) and English (EN) files to assess parallel processing and logging efficiency.
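As a rough sketch of how the 10-pair test input could be prepared, the helper below writes placeholder BO and EN files and then matches them into pairs. The `<name>_bo.txt` / `<name>_en.txt` naming scheme, the function names, and the file contents are all assumptions for illustration; the RFC does not specify the input layout.

```python
import tempfile
from pathlib import Path

BO_SUFFIX = "_bo.txt"  # hypothetical naming scheme
EN_SUFFIX = "_en.txt"  # hypothetical naming scheme


def make_fixture(directory, count=10):
    """Write `count` placeholder BO/EN file pairs into `directory`."""
    for i in range(count):
        (directory / f"text{i:02d}{BO_SUFFIX}").write_text(
            "བཀྲ་ཤིས་བདེ་ལེགས།\n", encoding="utf-8"
        )
        (directory / f"text{i:02d}{EN_SUFFIX}").write_text(
            "Greetings.\n", encoding="utf-8"
        )


def find_pairs(directory):
    """Return (bo_file, en_file) tuples for every BO file that has an EN counterpart."""
    pairs = []
    for bo_file in sorted(directory.glob(f"*{BO_SUFFIX}")):
        stem = bo_file.name[: -len(BO_SUFFIX)]
        en_file = bo_file.with_name(stem + EN_SUFFIX)
        if en_file.exists():
            pairs.append((bo_file, en_file))
    return pairs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_fixture(root)
        print(f"found {len(find_pairs(root))} BO-EN pairs")
```

BO files without an EN counterpart are skipped silently here; a real test run would likely want to log them as well.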

Implementation Steps

List all the steps involved during implementation.

Reviewed By

@TenzinGayche