Closed sstojak1 closed 3 months ago
To ensure that the Bitbucket Server data collection task in DevLake continues even if one pull request is corrupted, you can configure the Sync Policy to "Skip Failed Tasks." This setting allows the data collection process to skip over any failed tasks, preventing the entire pipeline from failing and ensuring that you do not lose all Bitbucket metrics for the repository.
Here is the relevant section from the document:
### Step 2.3 - Set the Sync Policy
There are three settings for Sync Policy:
- Data Time Range: You can select the time range of the data you wish to collect. The default is set to the past six months.
- Sync Frequency: You can choose how often to sync your data by selecting a sync frequency option or entering a cron expression to specify your preferred schedule.
- Skip Failed Tasks: Sometimes a few tasks fail in a long pipeline; you can choose to skip them to avoid rerunning the whole pipeline.
Additionally, the `ConvertPullRequests` function in the `pr_convertor.go` file is designed to handle errors gracefully. The `api.NewDataConverter` method in the `ConvertPullRequests` function is set up to manage errors effectively, allowing the data collection process to continue even if some pull requests are not found or are corrupted.
By enabling the "Skip Failed Tasks" option and leveraging the error handling in the `ConvertPullRequests` function, you can ensure that the data collection process continues even if some tasks fail [1][2].
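As a rough illustration of what a "Skip Failed Tasks" policy means at the pipeline level, here is a minimal sketch in Go. It is not DevLake's implementation: the `task` type, `runPipeline`, `errCorruptPR`, and the task names are all hypothetical, chosen only to show the difference between aborting on the first failure and recording the failure and moving on.

```go
package main

import (
	"errors"
	"fmt"
)

// task is a hypothetical stand-in for one unit of work in a pipeline.
type task struct {
	name string
	run  func() error
}

// errCorruptPR is a made-up error used to simulate a failing task.
var errCorruptPR = errors.New("corrupted pull request")

// runPipeline runs tasks in order. With skipFailed set, a failing task is
// recorded and the pipeline moves on; otherwise the first failure aborts
// the run and the remaining tasks never execute.
func runPipeline(tasks []task, skipFailed bool) (completed, failed []string, err error) {
	for _, t := range tasks {
		if runErr := t.run(); runErr != nil {
			failed = append(failed, t.name)
			if !skipFailed {
				return completed, failed, fmt.Errorf("task %s: %w", t.name, runErr)
			}
			continue
		}
		completed = append(completed, t.name)
	}
	return completed, failed, nil
}

func main() {
	tasks := []task{
		{name: "collect-repo", run: func() error { return nil }},
		{name: "collect-pr-activities", run: func() error { return errCorruptPR }},
		{name: "collect-commits", run: func() error { return nil }},
	}

	completed, failed, err := runPipeline(tasks, true)
	fmt.Println(completed, failed, err)

	completed, failed, err = runPipeline(tasks, false)
	fmt.Println(completed, failed, err)
}
```

In this model the skip only applies between tasks: whatever the failing task itself was collecting is still lost, because the task stops at its first unrecoverable error.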
@sstojak1 There is an option in the project's sync config; you can give it a try.
I believe it is fixed already. The fix will be available in the upcoming rc version, which should be ready in a couple of days.
@d4x1 This option is already on for all our projects. Here a single task is failing because one PR is corrupt in Bitbucket. As a result, other repository information won't be collected.
@klesh Are you referring to 7577?
7577 concerns PR commits and handles 40X error statuses. This error has a 500 status and concerns PR activities.
If that's OK, we can handle it the way @abeizn did for commits, but for a 500 status plus the body message `com.atlassian.bitbucket.scm.CommandFailedException`?
What do you think?
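To make the proposal concrete, here is a minimal sketch of such a check in Go. It is only an illustration, not DevLake code: `shouldSkipResponse` and `bitbucketErrorBody` are hypothetical names, and the struct fields are inferred from the error payload quoted in this issue. The check skips only when the status is 500 and the body names the specific exception, so a generic 500 (for example, the server being down) still fails the task.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// bitbucketErrorBody mirrors the error payload quoted in the issue report.
// Only the fields needed for the check are declared.
type bitbucketErrorBody struct {
	Errors []struct {
		Message       string `json:"message"`
		ExceptionName string `json:"exceptionName"`
	} `json:"errors"`
}

const commandFailedException = "com.atlassian.bitbucket.scm.CommandFailedException"

// shouldSkipResponse is a hypothetical helper: it reports whether a response
// is a 500 whose body names CommandFailedException. Any other status, or a
// body that cannot be parsed, is not treated as skippable.
func shouldSkipResponse(status int, body []byte) bool {
	if status != http.StatusInternalServerError {
		return false
	}
	var parsed bitbucketErrorBody
	if err := json.Unmarshal(body, &parsed); err != nil {
		return false // unknown body: do not guess, let the caller fail
	}
	for _, e := range parsed.Errors {
		if e.ExceptionName == commandFailedException {
			return true
		}
	}
	return false
}

func main() {
	body := []byte(`{"errors":[{"context":null,"message":"...","exceptionName":"com.atlassian.bitbucket.scm.CommandFailedException"}]}`)

	fmt.Println(shouldSkipResponse(500, body))                      // true
	fmt.Println(shouldSkipResponse(500, []byte(`{"errors":[]}`)))   // false
	fmt.Println(shouldSkipResponse(404, body))                      // false
}
```

Matching on `exceptionName` rather than on the free-text `message` keeps the rule narrow, but it still depends on Bitbucket Server reporting this exception only for corrupted pull requests, which this thread does not establish.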
Ahh.. 500 errors? I am not sure. A 500 represents an Internal Server Error, which might suggest that the server is corrupted or down; in that case, it is hard to say whether skipping the PR is appropriate. It makes more sense to fix the 500 errors on the Bitbucket server than to ignore them on the DevLake end.
You're correct. Deciding whether to skip something based on the message content will be challenging. Resolving the ticket...
What happened
The Bitbucket Server data collection task fails because one pull request is corrupted. The error that DevLake throws:
| Retry exceeded 3 times calling rest/api/1.0/projects/{projectKey}/repos/{repoName}/pull-requests/{pullRequestId}/activities. The last error was: Http DoAsync error calling [method:GET path:rest/api/1.0/projects/{projectKey}/repos/{repoName}/pull-requests/{pullRequestId}/activities query:map[limit:[100] state:[all]]]. Response: {"errors":[{"context":null,"message":"'git update-ref --stdin -z --no-deref' exited with code 128 saying: fatal: cannot update ref 'stash-refs/pull-requests/{pullRequestId}/from': trying to write ref 'stash-refs/pull-requests/{pullRequestId}/from' with nonexistent object {commitSHA}","exceptionName":"com.atlassian.bitbucket.scm.CommandFailedException"}]} (500)
What do you expect to happen
I think it would make sense for the data collection to continue even if one pull request is corrupted since we don't want to lose all of those Bitbucket metrics for the repository.
How to reproduce
We have this kind of a state in our env. Not sure how to reproduce.
Anything else
No response
Version
v1.0.0-beta11