Script src/github_repo_request_local.py:
Directory Structure by Domain: Changed the clone path so that each repository is cloned into a subdirectory named after its value in the 'repodomain' column of the input DataFrame.
Enhanced Error Handling: Detailed error messages are now captured during repository cloning and written, together with the repository URL, to a dedicated error log file (data/error_log.txt).
DataFrame Updates: Added a 'clone_status' column that records the success or failure of each cloning attempt, improving the script's reporting.
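For reviewers who want a picture of how the three cloning changes fit together, here is a minimal sketch. It is not the code in this PR. The 'repodomain' and 'clone_status' columns and the data/error_log.txt path come from the description above. The 'repourl' column name, the data/repos base directory, the function names, the tab-separated log format, and the "success"/"failed" status values are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def git_clone(url, dest):
    """Run `git clone`; raise RuntimeError with git's stderr if it fails."""
    result = subprocess.run(
        ["git", "clone", url, str(dest)], capture_output=True, text=True
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())


def clone_one(url, domain, base_dir, error_log, clone=git_clone):
    """Clone one repository into base_dir/<domain>/<repo>; return a status string."""
    repo_name = url.rstrip("/").split("/")[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]
    dest = Path(base_dir) / str(domain) / repo_name
    dest.parent.mkdir(parents=True, exist_ok=True)  # per-domain subdirectory
    try:
        clone(url, dest)
        return "success"
    except Exception as exc:
        # Record the URL and the error message, then let the caller continue.
        log_path = Path(error_log)
        log_path.parent.mkdir(parents=True, exist_ok=True)
        with log_path.open("a", encoding="utf-8") as fh:
            fh.write(f"{url}\t{exc}\n")
        return "failed"


def clone_all(df, base_dir="data/repos", error_log="data/error_log.txt",
              clone=git_clone):
    """Return a copy of the pandas DataFrame `df` with 'clone_status' added."""
    df = df.copy()
    df["clone_status"] = [
        clone_one(url, domain, base_dir, error_log, clone)
        # 'repourl' is an assumed column name; 'repodomain' is from the PR text.
        for url, domain in zip(df["repourl"], df["repodomain"])
    ]
    return df
```

Passing the clone function as a parameter is a choice made for this sketch: it lets the status and logging logic be exercised without network access.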
Script utils/initial_data_preparation.py:
Updated the get_base_repo_url function to better handle a variety of Git hosting platforms.
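The description does not spell out what "handling a variety of platforms" involves, so the following is only a sketch of one plausible approach, not the function in this PR. It assumes the goal is to reduce any repository URL to scheme://host/owner/repo regardless of host. The name base_repo_url_sketch and the None return value for unusable input are hypothetical.

```python
from urllib.parse import urlparse


def base_repo_url_sketch(url):
    """Reduce a repository URL to scheme://host/owner/repo, or return None."""
    if not isinstance(url, str) or not url.strip():
        return None
    url = url.strip()
    # Rewrite scp-style SSH remotes (git@host:owner/repo.git) as https URLs.
    if url.startswith("git@") and ":" in url:
        host, path = url[len("git@"):].split(":", 1)
        url = f"https://{host}/{path}"
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    # Keep only the first two path segments: owner and repository name.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"{parsed.scheme}://{parsed.netloc}/{owner}/{repo}"
```

One known limitation of this sketch: hosts that allow nested namespaces (for example GitLab subgroups) would be cut off after the second path segment, so a real implementation may need per-host rules.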
Script tests/test_github_repo_request_local.py:
Expanded test coverage for get_base_repo_url with multi-platform scenarios.
Hi Julian,
Could you please review the changes in this PR and let me know if anything needs to be changed?
Key changes are in these scripts:
src/github_repo_request_local.py
utils/initial_data_preparation.py: updated the get_base_repo_url function to better handle a variety of Git hosting platforms.
tests/test_github_repo_request_local.py

Thanks