Closed amindadgar closed 7 months ago
The latest update introduces a new task in the GitHub DAG for extracting pull requests linked to commits, enhancing the data processing workflow. Integration tests ensure data integrity, focusing on preventing duplicate entries in the Neo4j database.
File Path | Change Summary |
---|---|
dags/github.py | Added a new task extract_commit_pull_requests to extract pull requests for each commit, integrated it into the workflow, and updated the DAG to load commit pull requests alongside other data processing steps. |
dags/github/tests/integration/.../test_save_commit_relation_to_pr.py | Contains test cases for saving commit relations to pull requests in a Neo4j database, including tests for handling empty inputs and a single pull request with associated user and commit data. |
dags/github/tests/integration/.../test_save_commit_relation_to_pr_check_commit_duplicate.py | Contains a test case that ensures no duplicate nodes are created when saving commit relations to a pull request in a Neo4j database. The test validates that existing commit nodes are not duplicated during the process. |
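The duplicate check described in the table typically comes down to writing nodes with Cypher's `MERGE` clause instead of `CREATE`, so that saving the same commit twice matches the existing node. The sketch below only builds the statement and its parameters; the function name, the `Commit` and `PullRequest` labels, the `IS_ON` relationship, and the property names are all assumptions for illustration and are not taken from the repository.

```python
from typing import Any, Dict, Tuple


def build_commit_pr_merge(commit_sha: str, pr_id: int) -> Tuple[str, Dict[str, Any]]:
    """Build a Cypher statement linking a commit to a pull request.

    Hypothetical helper: labels, relationship type, and property names
    are illustrative assumptions, not the project's actual schema.
    """
    query = (
        # MERGE matches an existing node with this key, or creates it if absent.
        "MERGE (c:Commit {sha: $sha}) "
        "MERGE (pr:PullRequest {id: $pr_id}) "
        # MERGE on the relationship keeps repeated saves from adding a second edge.
        "MERGE (c)-[:IS_ON]->(pr)"
    )
    params = {"sha": commit_sha, "pr_id": pr_id}
    return query, params
```

Running the returned statement twice with the same parameters should leave a single commit node, which is the behavior the duplicate test is described as validating.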
🐇✨ In the code's burrow, deep and wide, A new task hops with joy and pride. Pull requests and commits prance, In Neo4j, they find their dance. No duplicates in this coder's guide, Tests ensure with eyes open wide. 🌟📜
When a file has no changes, its sha is null, which could cause the query to break. This PR addresses that case and adds the feature to extract the pull requests associated with each commit.
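One possible guard against the null sha case is to drop such file entries before they reach the query. This is a minimal sketch, assuming each file is a dict with `sha` and `filename` keys as in GitHub's REST API responses; the function name is hypothetical and the actual fix in this PR may differ.

```python
from typing import Any, Dict, List


def drop_files_without_sha(files: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return only the file entries that carry a non-null sha.

    Hypothetical helper: entries with a missing or None sha are skipped
    so they are never passed to the database query as a key.
    """
    return [f for f in files if f.get("sha") is not None]


files = [
    {"filename": "dags/github.py", "sha": "abc123"},
    {"filename": "README.md", "sha": None},  # unchanged file, null sha
]
safe_files = drop_files_without_sha(files)
```

Filtering up front keeps the query itself unchanged; an alternative would be to handle the null inside the query, at the cost of a more complex statement.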
Summary by CodeRabbit
New Features
Tests