Closed amokkapati closed 2 weeks ago
To modify the configuration so that it records file names that were updated during git commits and pull requests, you need to ensure that the data collection includes the necessary entities related to commits and pull requests. Specifically, you should ensure that the `Source Code Management` and `Code Review` data entities are selected, as these will include information about commits and pull requests.
Here is the relevant section from the configuration documentation that you need to focus on:
### Data Entities
Usually, you don't have to modify this part. However, if you don't want to collect certain GitHub entities, you can unselect some entities to accelerate the collection speed.
- Issue Tracking: GitHub issues, issue comments, issue labels, etc.
- Source Code Management: GitHub repos, refs, commits, etc.
- Code Review: GitHub PRs, PR comments and reviews, etc.
- CI/CD: GitHub Workflow runs, GitHub Workflow jobs, etc.
- Cross Domain: GitHub accounts, etc.
Ensure that `Source Code Management` and `Code Review` are selected to collect data about commits and pull requests, which will include the file names that were updated.
Additionally, you may need to configure the transformation rules to parse the necessary information from the collected data. Here is how you can add transformation rules:
## Step 3 - Adding Transformation Rules (Optional)
![github-add-transformation-rules-list](images/github-set-transformation1.png)
![github-add-transformation-rules](images/github-set-transformation2.png)
Without adding transformation rules, you can still view the "[GitHub Metrics](/livedemo/DataSources/GitHub)" dashboard. However, if you want to view "[Weekly Bug Retro](/livedemo/EngineeringLeads/WeeklyBugRetro)", "[Weekly Community Retro](/livedemo/OSSMaintainers/WeeklyCommunityRetro)" or other pre-built dashboards, the following transformation rules, especially "Type/Bug", should be added.
Each GitHub repo has at most ONE set of transformation rules.
### Code Review
- Type: The `type` of pull requests will be parsed from PR labels by the given regular expression. For example:
  - when your labels for PR types are like 'type/feature-development', 'type/bug-fixing' and 'type/docs', please input `type/(.*)$`
  - when your labels for PR types are like 'feature-development', 'bug-fixing' and 'docs', please input `(feature-development|bug-fixing|docs)$`
- Component: The `component` of pull requests will be parsed from PR labels by the given regular expression.
By ensuring the correct data entities are selected and configuring the transformation rules, you can collect and record the file names that were updated during git commits and pull requests [1][2].
To continue talking to Dosu, mention @dosu.
@dosu [ [ { "plugin": "org", "subtasks": [ "setProjectMapping" ], "options": { "projectMappings": [ { "projectName": "GHEDev", "scopes": [ { "rowId": "github:GithubRepo:1:7345", "table": "repos" }, { "rowId": "github:GithubRepo:1:7345", "table": "cicd_scopes" }, { "rowId": "github:GithubRepo:1:7345", "table": "boards" } ] } ] } }, { "plugin": "github", "subtasks": [ "Convert Releases", "Convert Repos", "Extract Comments", "Extract Events", "Extract Issues", "Extract Milestones", "Extract PR Review Comments", "Extract Pull Requests", "Extract Users", "Extract Workflow Runs", "Convert Issue Assignees", "Convert Issue Comments", "Convert Issue Labels", "Convert Issues", "Convert Milestones", "Convert PR Labels", "Convert Pull Requests", "Convert Workflow Runs", "Enrich PR Issues", "Convert PR Issues", "Extract Jobs", "Extract PR Commits", "Extract PR Reviews", "Convert Jobs", "Convert PR Comments", "Convert PR Commits", "Convert PR Reviews", "Extract User Org", "Convert Users" ], "options": { "connectionId": 1, "fullName": "INFServices/terraform-base", "githubId": 7345, "name": "INFServices/terraform-base" } }, { "plugin": "dora", "subtasks": [ "generateDeployments", "generateDeploymentCommits", "enrichPrevSuccessDeploymentCommits" ], "options": { "projectName": "GHEDev" } }, { "plugin": "refdiff", "subtasks": [ "calculateDeploymentCommitsDiff" ], "options": { "projectName": "GHEDev" } }, { "plugin": "dora", "subtasks": [ "calculateChangeLeadTime", "ConnectIncidentToDeployment" ], "options": { "projectName": "GHEDev" } }, { "plugin": "gitextractor", "subtasks": null, "options": { "repoId": "github:GithubRepo:7345", "url": "https://github.com/apache/incubator-devlake.git" } } ] ]
Where in here do I set the "SkipCommitFiles" section to false for gitextractor
To set the "SkipCommitFiles" option to false for the "gitextractor" plugin, you need to modify the `options` section for the `gitextractor` plugin in your configuration. Here is how you can do it:
[
  {
    "plugin": "org",
    "subtasks": [
      "setProjectMapping"
    ],
    "options": {
      "projectMappings": [
        {
          "projectName": "GHEDev",
          "scopes": [
            {
              "rowId": "github:GithubRepo:1:7345",
              "table": "repos"
            },
            {
              "rowId": "github:GithubRepo:1:7345",
              "table": "cicd_scopes"
            },
            {
              "rowId": "github:GithubRepo:1:7345",
              "table": "boards"
            }
          ]
        }
      ]
    }
  },
  {
    "plugin": "github",
    "subtasks": [
      "Convert Releases",
      "Convert Repos",
      "Extract Comments",
      "Extract Events",
      "Extract Issues",
      "Extract Milestones",
      "Extract PR Review Comments",
      "Extract Pull Requests",
      "Extract Users",
      "Extract Workflow Runs",
      "Convert Issue Assignees",
      "Convert Issue Comments",
      "Convert Issue Labels",
      "Convert Issues",
      "Convert Milestones",
      "Convert PR Labels",
      "Convert Pull Requests",
      "Convert Workflow Runs",
      "Enrich PR Issues",
      "Convert PR Issues",
      "Extract Jobs",
      "Extract PR Commits",
      "Extract PR Reviews",
      "Convert Jobs",
      "Convert PR Comments",
      "Convert PR Commits",
      "Convert PR Reviews",
      "Extract User Org",
      "Convert Users"
    ],
    "options": {
      "connectionId": 1,
      "fullName": "INFServices/terraform-base",
      "githubId": 7345,
      "name": "INFServices/terraform-base"
    }
  },
  {
    "plugin": "dora",
    "subtasks": [
      "generateDeployments",
      "generateDeploymentCommits",
      "enrichPrevSuccessDeploymentCommits"
    ],
    "options": {
      "projectName": "GHEDev"
    }
  },
  {
    "plugin": "refdiff",
    "subtasks": [
      "calculateDeploymentCommitsDiff"
    ],
    "options": {
      "projectName": "GHEDev"
    }
  },
  {
    "plugin": "dora",
    "subtasks": [
      "calculateChangeLeadTime",
      "ConnectIncidentToDeployment"
    ],
    "options": {
      "projectName": "GHEDev"
    }
  },
  {
    "plugin": "gitextractor",
    "subtasks": null,
    "options": {
      "repoId": "github:GithubRepo:7345",
      "url": "https://github.com/apache/incubator-devlake.git",
      "skipCommitFiles": false
    }
  }
]
In this configuration, the `skipCommitFiles` option is set to `false` for the `gitextractor` plugin, ensuring that it records file names that were updated during git commits and pull requests [1][2].
@dosu subtask Clone Git Repo ended unexpectedly Wraps: (2) git exit error Error types: (1) hintdetail.withDetail (2) errors.errorString this is the error that I get when I try to add my own repo under url.
To resolve the error "subtask Clone Git Repo ended unexpectedly Wraps: (2) git exit error Error types: (1) hintdetail.withDetail (2) errors.errorString" and successfully add your repository, follow these steps:
- Check Proxy Settings: Ensure that the proxy settings are correctly configured if you are using a proxy. Verify that `taskData.Options.Proxy` is set correctly.
- Verify Private Key Handling: If you are using SSH with a private key, ensure that the private key and passphrase are correctly handled. The code creates a temporary file for the private key and sets the appropriate permissions. Ensure that this process is not failing.
- Check Git Command Execution: Ensure that the `execGitCommand` and `execGitCommandIn` functions are correctly handling the git commands and their environment. Verify that the commands are correctly formed and that the environment variables are set properly.
- Handle Shallow Clone Issues: Ensure that the `--depth=1` and `--shallow-since` options are used correctly if you are performing a shallow clone.

Additionally, ensure that the `CloneGitRepo` function in the `CloneGitRepoMeta` subtask is correctly implemented. Here is the relevant part of the code:
func CloneGitRepo(subTaskCtx plugin.SubTaskContext) errors.Error {
	taskData, ok := subTaskCtx.GetData().(*parser.GitExtractorTaskData)
	if !ok {
		panic("git repo reference not found on context")
	}
	op := taskData.Options
	storage := store.NewDatabase(subTaskCtx, op.RepoId)
	var err errors.Error
	logger := subTaskCtx.GetLogger()

	// temporary dir for cloning
	localDir, e := os.MkdirTemp("", "gitextractor")
	if e != nil {
		return errors.Convert(e)
	}

	// clone repo
	repoCloner := parser.NewGitcliCloner(subTaskCtx)
	err = repoCloner.CloneRepo(subTaskCtx, localDir)
	if err != nil {
		if errors.Is(err, parser.ErrNoData) {
			taskData.SkipAllSubtasks = true
			return nil
		}
		return err
	}
	if repoCloner.IsIncremental() {
		storage.SetIncrementalMode(repoCloner.IsIncremental())
	}

	// We have done comparison experiments for git2go and go-git, and the results show that git2go has better performance.
	var repoCollector parser.RepoCollector
	if *taskData.Options.UseGoGit {
		repoCollector, err = parser.NewGogitRepoCollector(localDir, op.RepoId, storage, logger)
	} else {
		repoCollector, err = parser.NewLibgit2RepoCollector(localDir, op.RepoId, storage, logger)
	}
	if err != nil {
		return err
	}

	// inject clean up callback to remove the cloned dir
	cleanup := func() {
		_ = os.RemoveAll(localDir)
	}
	if e := repoCollector.SetCleanUp(cleanup); e != nil {
		return errors.Convert(e)
	}

	// pass the collector down to next subtask
	taskData.GitRepo = repoCollector
	subTaskCtx.TaskContext().SetData(taskData)
	return nil
}
- Self-Signed Certificates: If you are using a private GitLab server with a self-signed certificate, add your root CA to the `devlake` container. Mount your `rootCA.crt` into the container and update the CA certificates. Example for `docker-compose`:
devlake:
  image: apache/devlake:v...
  ...
  volumes:
    ...
    - /path/to/your/rootCA.crt:/usr/local/share/ca-certificates/rootCA.crt
  command: [ "sh", "-c", "update-ca-certificates; lake" ]
  ...
- Upgrade to Latest Version: Consider upgrading to the latest version of DevLake (v0.15.x or later), as some bugs have been fixed in newer versions.
- Check Token Permissions: Ensure that your access token has the necessary permissions. If your account is protected by organization SAML enforcement, make sure you have authorized the token using SSO.
- Error Code 403: A 403 status code is usually caused by insufficient token permissions. Ensure you are using a supported authentication method and that your token has the necessary permissions.
- Error Code 429: A 429 status code is usually caused by rate limiting. Enable `fixed rate limit` in the GitLab connection and lower the API rate to 2,000. If that works, you can try raising the rate again to speed up collection [1][2][3][4].
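Because "git exit error" hides git's own message, it can help to run the same kind of clone outside DevLake and read the output directly. The Go sketch below is an assumption-laden diagnostic, not DevLake code: `cloneArgs` is a hypothetical helper, and the program simply shells out to the `git` binary on your PATH with a shallow `--depth=1` clone into a temporary directory.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
)

// cloneArgs is a hypothetical helper that builds the arguments for
// `git clone`. A positive depth adds --depth=N; zero or less means a
// full clone.
func cloneArgs(url, dir string, depth int) []string {
	args := []string{"clone"}
	if depth > 0 {
		args = append(args, "--depth="+strconv.Itoa(depth))
	}
	return append(args, url, dir)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: go run main.go <repo-url>")
		return
	}
	// Clone into a throwaway directory so nothing is left behind.
	dir, err := os.MkdirTemp("", "clonecheck")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot create temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	cmd := exec.Command("git", cloneArgs(os.Args[1], dir, 1)...)
	out, err := cmd.CombinedOutput() // stdout and stderr together
	fmt.Print(string(out))
	if err != nil {
		fmt.Fprintln(os.Stderr, "git clone failed:", err)
	}
}
```

Running it with the same URL you put in the `url` option, ideally from inside the DevLake container so the network, proxy, and certificate setup match, should print the authentication, certificate, or proxy error that git actually reported.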
This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in the next 7 days if no further activity occurs.
This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter a similar problem in the future.