apache / incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
https://devlake.apache.org/
Apache License 2.0

[Bug][Azure devops plugin] Azure Devop plugin doesn't pull in current data #7048

Open michaelawarren opened 4 months ago

michaelawarren commented 4 months ago

Search before asking

What happened

The Azure Devops Plugin is pulling in very old builds and not new / recent builds.

The configuration is set to pull in the latest data: [image]

The database shows that the dates are for old builds: [image] [image]

The _tool data tells the same story: [image] [image]

What do you expect to happen

I expect DevLake to pull the most recent builds when I use the default config.

How to reproduce

Set up an Azure DevOps project to pull in data from an Azure DevOps pipeline, following all the indicated steps in the app, and run it.

Anything else

No response

Version

0.21.0-beta4

Are you willing to submit PR?

Code of Conduct

Startrekzky commented 4 months ago

Hi @d4x1, could you take a look at this issue? I looked at it and I think it is a bug.

d4x1 commented 4 months ago

@michaelawarren The Azure DevOps plugin is implemented in Python, so its behavior differs somewhat from the other Go-based plugins. I think you're right. First, I haven't found any code that processes the time range option. It's definitely a bug. Second, although the time range doesn't work, you can theoretically still collect all builds for your repos. I have tested it locally and it works.

So you can visit this API: https://dev.azure.com/{org}/{project}/_apis/build/builds?repositoryId={repo_id}&repositoryType=tfsgit&deletedFilter=excludeDeleted (api detail) and see what it returns.

The repo ID can be found with this API: https://dev.azure.com/{org}/{project}/_apis/git/repositories/, and repositoryType should be set according to your repo's config.
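For anyone who wants to script these two calls, here is a minimal sketch that only assembles the URLs and an auth header. It assumes authentication with a personal access token (PAT) sent as HTTP Basic auth with an empty username; the function names are hypothetical and are not part of the DevLake plugin.

```python
import base64
from urllib.parse import quote, urlencode

BASE = "https://dev.azure.com"


def repos_url(org: str, project: str) -> str:
    """URL that lists a project's Git repositories (use it to look up repo IDs)."""
    return f"{BASE}/{quote(org)}/{quote(project)}/_apis/git/repositories/"


def builds_url(org: str, project: str, repo_id: str, repo_type: str = "tfsgit") -> str:
    """URL that lists the builds of one repository, excluding deleted builds."""
    query = urlencode({
        "repositoryId": repo_id,
        "repositoryType": repo_type,  # must match how the repo is hosted
        "deletedFilter": "excludeDeleted",
    })
    return f"{BASE}/{quote(org)}/{quote(project)}/_apis/build/builds?{query}"


def pat_auth_header(pat: str) -> dict:
    """Basic auth header for a PAT: empty username, PAT as the password (assumption)."""
    token = base64.b64encode(f":{pat}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```

The resulting URL and header can be passed to any HTTP client, for example `urllib.request.Request(url, headers=pat_auth_header(pat))`, or pasted into curl/Postman.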

cc @Startrekzky @keon94 Do you have any ideas about this issue?

michaelawarren commented 4 months ago

I tested the endpoints and they give the same information that exists in DevOps. That leads me to a new question. The pipelines do have more recent builds; however, those builds are linked to GitHub repos, not Azure DevOps repos. How do I still get the Azure DevOps builds that are linked to GitHub repos?

d4x1 commented 4 months ago

I don't understand what your current config is and what you expect. Please describe them in as much detail as possible.

michaelawarren commented 4 months ago

Config:

Expectation:

d4x1 commented 3 months ago

@michaelawarren I understand your use case now. Thank you for your feedback.

Can you see your GitHub repos on Azure DevOps' repo page?

[image]

When creating the Azure DevOps connection in DevLake, which data scope did you choose?

[image]

michaelawarren commented 3 months ago

No, we do not see the GitHub repos in the Azure DevOps repo list. We see the old ADO repos.

On the "Add data scopes" page I've been selecting the ADO repo. That explains why I only see the old builds: it only shows builds associated with the ADO repo and no builds associated with the new GitHub repos.

In the data scopes for some projects I do see some GitHub repos. When I do see GitHub repos, none of them are the new repos I want to collect DORA metrics for.
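One way to check which repository each build is tied to is to tally the builds returned by the builds API by repository type and name. This is a rough sketch, assuming each build object in the response's `value` array carries a `repository` object with `type` and `name` fields; the helper name is hypothetical.

```python
from collections import Counter


def count_builds_by_repo(builds: list) -> Counter:
    """Count builds per (repository type, repository name) pair."""
    counts = Counter()
    for build in builds:
        repo = build.get("repository") or {}
        key = (repo.get("type", "unknown"), repo.get("name", "unknown"))
        counts[key] += 1
    return counts


# Made-up sample shaped like the "value" array of a builds response.
sample = [
    {"id": 1, "repository": {"type": "TfsGit", "name": "old-ado-repo"}},
    {"id": 2, "repository": {"type": "GitHub", "name": "org/new-repo"}},
    {"id": 3, "repository": {"type": "GitHub", "name": "org/new-repo"}},
]
for (repo_type, name), n in count_builds_by_repo(sample).items():
    print(f"{repo_type:8} {name:20} {n}")
```

If the recent builds all show up under a GitHub-type repository, that would be consistent with the old-builds-only result seen when only the ADO repo is selected as a scope.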

d4x1 commented 3 months ago

@michaelawarren Thank you for your reply.

If your GitHub repos can be selected in the Azure DevOps connection, then you can couple repos and pipelines correctly. I think it's a bug.

@keon94 Can you help to solve this problem?

I think there are two bugs we need to fix:

  1. Azure DevOps should only collect data according to the time range in the project's config.
  2. Support/fix repos that come from GitHub, Bitbucket, and so on when configuring scopes in an Azure DevOps connection.
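
As an illustration of the first point, a time range could be applied on the client side after the builds are fetched. The sketch below is not the plugin's code: it assumes build timestamps are UTC strings ending in `Z` with up to seven fractional digits, and both function names are hypothetical. The Builds List API also documents a `minTime` query parameter that may allow the same filtering on the server side; that is not verified here.

```python
from datetime import datetime, timezone


def parse_azure_time(ts: str) -> datetime:
    """Parse a UTC timestamp such as 2024-03-01T12:30:45.1234567Z (assumed format)."""
    head, _, frac = ts.rstrip("Z").partition(".")
    frac = (frac + "000000")[:6]  # pad or truncate to microseconds
    parsed = datetime.strptime(f"{head}.{frac}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def builds_after(builds: list, time_after: datetime) -> list:
    """Keep only builds that finished at or after time_after."""
    kept = []
    for build in builds:
        finished = build.get("finishTime")  # absent while a build is still running
        if finished and parse_azure_time(finished) >= time_after:
            kept.append(build)
    return kept
```

Builds without a `finishTime` are dropped in this sketch; a real collector would need to decide how to treat in-progress builds.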

klesh commented 3 months ago

@keon94 Would you like to work on it? Thanks.

keon94 commented 3 months ago

I probably won't do the dev work for this but I can try to help. @michaelawarren are the missing repos private? Is there anything that makes them different from the ones that do show up on the list? The plugin does support GitHub repos with Azure DevOps pipelines.

klesh commented 3 months ago

@keon94 Can you support the timeAfter?

michaelawarren commented 3 months ago

All repos are private repos. I'm not sure what sets aside the ones that can be seen vs the ones that cannot be seen. They should be configured the same.

d4x1 commented 3 months ago

@keon94 We need your help~

d4x1 commented 3 months ago

> All repos are private repos. I'm not sure what sets aside the ones that can be seen vs the ones that cannot be seen. They should be configured the same.

I am not sure either. Some debugging and testing are needed.

It would be appreciated if you could provide some clues.

keon94 commented 3 months ago

The APIs this plugin deals with are here. Those of interest are git_repos (Azure DevOps-managed repos) and external_repositories (unmanaged repos that are referenced by Azure builds). It sounds like git_repos works as expected. You need to test what external_repositories returns via curl/Postman.

keon94 commented 3 months ago

@d4x1 This PR will add debug logging around API calls in PyDevLake. Please review it: https://github.com/apache/incubator-devlake/pull/7206

klesh commented 3 months ago

I just took the liberty of merging the PR. @keon94 @d4x1

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] commented 1 month ago

This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter the similar problem in the future.