TogetherCrew / airflow-dags

[Hivemind] Create GithubSummaryTransformer class #94

Open cyri113 opened 6 months ago

cyri113 commented 6 months ago

Part of the Github Vectorize (Summary). Please read this document before starting.

The following changes should be implemented in `dags/hivemind_etl_helpers/src/db/github/transform`.

### Tasks
- [x] Create a class `GithubSummaryTransformer`
- [x] Inherit from [SummaryTransformer](https://github.com/TogetherCrew/airflow-dags/blob/feat/hivemind-summary-transformer-abstract/dags/hivemind_etl_helpers/src/utils/summary/summary_transformer.py). See [example](https://github.com/TogetherCrew/airflow-dags/blob/feat/hivemind-summary-transformer-abstract/dags/hivemind_etl_helpers/src/db/discord/summary/summary_utils.py).
- [x] Create a public method `transform` that transforms the summary into a llama-index document and adds the required metadata. See [details](https://www.notion.so/rndadocs/Github-Vectorize-Summary-fe006094d382427eb1daf746a9055849?pvs=4#2dc0f77693eb47e5a4aaa040126b74ee).
- [x] Create the required test cases

Note: file coverage should be 100%.
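
For orientation, the tasks above could be sketched roughly as follows. This is a minimal illustration and not the repository's code. It makes three assumptions: `Document` is a small stand-in for the llama-index document type (reduced here to `text` and `metadata`), `SummaryTransformer` is a hypothetical stand-in for the linked abstract base class, and the `transform` signature and the `date` metadata key are invented for the example. The required metadata is defined in the linked details and is not reproduced here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Document:
    """Stand-in for a llama-index document: text plus a metadata dict."""

    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class SummaryTransformer(ABC):
    """Hypothetical stand-in for the abstract base class linked in the tasks."""

    @abstractmethod
    def transform(
        self, summary: str, metadata: Optional[Dict[str, Any]] = None
    ) -> Document:
        """Turn a summary string into a document."""


class GithubSummaryTransformer(SummaryTransformer):
    def transform(
        self, summary: str, metadata: Optional[Dict[str, Any]] = None
    ) -> Document:
        # Copy the metadata so the caller's dict is never mutated.
        return Document(text=summary, metadata=dict(metadata or {}))


# Usage with an assumed metadata key ("date" is illustrative only).
doc = GithubSummaryTransformer().transform(
    "3 PRs merged", {"date": "2024-01-01"}
)
```

Test cases for the 100% coverage target would need to exercise both branches of `metadata or {}`, that is, a call with metadata and a call without it.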

cyri113 commented 5 months ago

Hi @FatemehVahabi, could you please create one PR for all your changes? I'm a little confused by this.