apache / incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
https://devlake.apache.org/
Apache License 2.0

[Bug][JiraPlugin] Failing on "collectIssues" subtask #7540

Open ajayesf opened 4 months ago

ajayesf commented 4 months ago

What happened

The Jira plugin is failing with the error below for one of our boards.

time="2024-05-28 00:04:32" level=warning msg=" [pipeline service] [pipeline #9] [task #146] [api async client] retry #0 calling agile/1.0/board/151/issue\n\tcaused by: Http DoAsync error calling [method:GET path:agile/1.0/board/151/issue query:map[expand:[changelog] jql:[updated >= '2024/05/10 00:00' ORDER BY created ASC] maxResults:[100] startAt:[0]]]. Response: [HTML error page; text content follows]

Server Error

502 - Web server received an invalid response while acting as a gateway or proxy server.

There is a problem with the page you are looking for, and it cannot be displayed. When the Web server (while acting as a gateway or proxy) contacted the upstream content server, it received an invalid response from the content server.

(502)"

What do you expect to happen

The Jira plugin should be able to collect all data without failing.

How to reproduce

After configuring a connection to Jira, I created a new project with multiple boards as its scope. The pipeline fails on one of the boards.

The complete log is attached below.

task-146-7-4-jira.log

Anything else

No response

Version

v1.0.0-beta9

dosubot[bot] commented 4 months ago

The Jira plugin is likely failing with a 502 error on the "collectIssues" subtask due to an issue with the API request to Jira. A 502 error indicates a bad gateway, meaning a gateway or proxy received an invalid response from an upstream server.

To resolve this issue, you can try the following steps:

  1. Check the API Client Configuration: Ensure that the ApiClient is correctly configured and that the Jira server is reachable.

  2. Verify the JQL Query: The JQL query used in the request might be causing issues. Ensure that the JQL query is correctly formed and does not result in a large dataset that could overwhelm the server.

  3. Reduce Concurrency: The current concurrency level is set to 10. Reducing the concurrency might help if the server is being overwhelmed by too many simultaneous requests.

  4. Check the Response Parser: Ensure that the response parser is correctly handling the responses from Jira.

Here is the relevant code snippet for setting up the API collector and making the request:

err = apiCollector.InitCollector(api.ApiCollectorArgs{
    ApiClient: data.ApiClient,
    PageSize:  data.Options.PageSize,
    UrlTemplate: "agile/1.0/board/{{ .Params.BoardId }}/issue",
    Query: func(reqData *api.RequestData) (url.Values, errors.Error) {
        query := url.Values{}
        query.Set("jql", jql)
        query.Set("startAt", fmt.Sprintf("%v", reqData.Pager.Skip))
        query.Set("maxResults", fmt.Sprintf("%v", reqData.Pager.Size))
        query.Set("expand", "changelog")
        return query, nil
    },
    GetTotalPages: GetTotalPagesFromResponse,
    Concurrency:   10,
    ResponseParser: func(res *http.Response) ([]json.RawMessage, errors.Error) {
        var data struct {
            Issues []json.RawMessage `json:"issues"`
        }
        blob, err := io.ReadAll(res.Body)
        if err != nil {
            return nil, errors.Convert(err)
        }
        err = json.Unmarshal(blob, &data)
        if err != nil {
            return nil, errors.Convert(err)
        }
        return data.Issues, nil
    },
})
if err != nil {
    return err
}
return apiCollector.Execute()

Consider reducing the Concurrency value and verifying the jql query to ensure it is not causing issues. If the problem persists, further investigation into the network and server configurations might be necessary [1].
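The failing response in this issue is an HTML error page, not JSON, so a parser that goes straight to `json.Unmarshal` would report a confusing decode error. As a minimal sketch, assuming only the Go standard library, a parser can check the status code and content type first. `parseIssues`, `snippet`, and `fakeResponse` are hypothetical names for illustration and are not part of DevLake; how DevLake's own API client treats non-2xx responses is not covered here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"mime"
	"net/http"
	"net/http/httptest"
	"strings"
)

// parseIssues is a hypothetical stand-in for the ResponseParser shown above.
// It refuses to unmarshal anything that is not a 2xx JSON response, so an
// HTML error page from a proxy surfaces as a readable error.
func parseIssues(res *http.Response) ([]json.RawMessage, error) {
	defer res.Body.Close()
	blob, err := io.ReadAll(res.Body)
	if err != nil {
		return nil, err
	}
	if res.StatusCode < 200 || res.StatusCode > 299 {
		return nil, fmt.Errorf("unexpected status %d: %s", res.StatusCode, snippet(blob))
	}
	mediaType, _, _ := mime.ParseMediaType(res.Header.Get("Content-Type"))
	if mediaType != "application/json" {
		return nil, fmt.Errorf("unexpected content type %q: %s", mediaType, snippet(blob))
	}
	var data struct {
		Issues []json.RawMessage `json:"issues"`
	}
	if err := json.Unmarshal(blob, &data); err != nil {
		return nil, err
	}
	return data.Issues, nil
}

// snippet collapses whitespace and truncates a body so it fits in one log line.
func snippet(b []byte) string {
	s := strings.Join(strings.Fields(string(b)), " ")
	if len(s) > 120 {
		s = s[:120] + "..."
	}
	return s
}

// fakeResponse builds an in-memory response for the demo below.
func fakeResponse(status int, contentType, body string) *http.Response {
	rec := httptest.NewRecorder()
	rec.Header().Set("Content-Type", contentType)
	rec.WriteHeader(status)
	rec.WriteString(body)
	return rec.Result()
}

func main() {
	_, err := parseIssues(fakeResponse(502, "text/html; charset=iso-8859-1",
		"<html>502 - Web server received an invalid response</html>"))
	fmt.Println("proxy error page:", err)

	issues, err := parseIssues(fakeResponse(200, "application/json",
		`{"issues":[{"id":"1"},{"id":"2"}]}`))
	fmt.Println("json page:", len(issues), err)
}
```

With a check like this, a 502 from a gateway is reported as a status error with a short excerpt of the page instead of a JSON syntax error.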

klesh commented 4 months ago

Hi, seems like your JIRA server is down or broken somehow. I think you need to fix your JIRA server and make sure the API can be requested successfully before Apache DevLake can collect data from it.

ajayesf commented 4 months ago

@klesh The server is working fine. We have other boards hosted on the same server, and I am able to pull data from the rest of them. I am only facing the issue with one specific board. Every time, it fails on the same subtask.

ajayesf commented 4 months ago

Hi,

Could anyone help us on this?

ajayesf commented 4 months ago

We have also verified the permissions of the Jira connection using an admin account. The issue still persists. @klesh

klesh commented 4 months ago

OK, normally 502 means Bad Gateway: a gateway or proxy in front of the server received an invalid response from the upstream server. So, did you try requesting the JIRA API with curl or Postman to make sure the JIRA API is up and running and working properly?

ajayesf commented 4 months ago

APIs are working fine. Please find the attached image. @klesh

ajayesf commented 4 months ago

To be specific, the task is failing at "Http DoAsync error calling [method:GET path:agile/1.0/board/151/issue query:map[expand:[changelog] jql:[updated >= '2024/05/01 00:00' ORDER BY created ASC] maxResults:[100] startAt:[0]]]." Following this, we get the 502 error.

All subtasks before the "collectIssues" subtask pass.

time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] start executing task: 919"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] start plugin"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] [api async client] creating scheduler for api \"https://jira.somatus.com/rest/\", number of workers: 13, 10000 reqs / 1h0m0s (interval: 360ms)"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] total step: 32"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask collectBoardFilterBegin"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectBoardFilterBegin] collect board in collectBoardFilterBegin: 151"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectBoardFilterBegin] collect board filter:14167"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectBoardFilterBegin] collect board filter jql:project = RQC ORDER BY Rank ASC"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] finished step: 1 / 32"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask collectStatus"
time="2024-05-30 14:12:43" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectStatus] start api collection"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectStatus] finished records: 1"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectStatus] end api collection without error"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] finished step: 2 / 32"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask extractStatus"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [extractStatus] extract Status, connection_id=2, board_id=151"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [extractStatus] get data from _raw_jira_api_status where params={\"ConnectionId\":2,\"BoardId\":151} and got 42"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [extractStatus] finished records: 1"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] finished step: 3 / 32"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask collectProjects"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectProjects] collect projects"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectProjects] start api collection"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectProjects] finished records: 1"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectProjects] end api collection without error"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] finished step: 4 / 32"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask extractProjects"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [extractProjects] get data from _raw_jira_api_projects where params={\"ConnectionId\":2,\"BoardId\":151} and got 13"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [extractProjects] finished records: 1"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] finished step: 5 / 32"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask collectIssueTypes"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectIssueTypes] collect issue_types"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectIssueTypes] start api collection"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectIssueTypes] finished records: 1"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectIssueTypes] end api collection without error"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] finished step: 6 / 32"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask extractIssueType"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [extractIssueType] extract IssueType, connection_id=2, board_id=151"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [extractIssueType] get data from _raw_jira_api_issue_types where params={\"ConnectionId\":2,\"BoardId\":151} and got 6"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [extractIssueType] finished records: 1"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] finished step: 7 / 32"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask collectIssues"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectIssues] got user's timezone: Asia/Kolkata"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectIssues] start api collection"

@klesh

d4x1 commented 4 months ago

@ajayesf From the logs:

time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] finished step: 7 / 32"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] executing subtask collectIssues"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectIssues] got user's timezone: Asia/Kolkata"
time="2024-05-30 14:12:44" level=info msg=" [pipeline service] [pipeline #47] [task #919] [collectIssues] start api collection"

It seems collectIssues starts successfully.

And https://github.com/apache/incubator-devlake/issues/7540#issuecomment-2139288671 only shows that the organization API is working. You should take the API request that collectIssues is using and try it with curl or Postman.
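To test the same request by hand, it helps to rebuild the exact URL from the failing log line. The sketch below does that with the Go standard library; `buildIssuesURL` is a hypothetical helper, and the base URL, board ID, and date are copied from the logs in this thread. Authentication is left out because the connection's credentials are not shown here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildIssuesURL rebuilds the request seen in the failing log line.
// baseURL is the REST root of the Jira server; since is the lower bound
// used in the JQL "updated >=" clause.
func buildIssuesURL(baseURL string, boardID int, since string, startAt, pageSize int) (string, error) {
	raw := strings.TrimRight(baseURL, "/") + fmt.Sprintf("/agile/1.0/board/%d/issue", boardID)
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	q := url.Values{}
	q.Set("jql", fmt.Sprintf("updated >= '%s' ORDER BY created ASC", since))
	q.Set("startAt", fmt.Sprint(startAt))
	q.Set("maxResults", fmt.Sprint(pageSize))
	q.Set("expand", "changelog")
	u.RawQuery = q.Encode() // percent-encodes the JQL so it is safe to paste into a shell
	return u.String(), nil
}

func main() {
	// Values taken from the log lines quoted in this issue.
	u, err := buildIssuesURL("https://jira.somatus.com/rest/", 151, "2024/05/01 00:00", 0, 100)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```

The printed URL can be passed to curl or Postman with the same credentials as the DevLake connection. If the 502 reproduces there, trying a smaller `maxResults` or dropping `expand=changelog` may show whether the response size for this board is what the proxy rejects; that is a guess to test, not a confirmed cause.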

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.

klesh commented 3 weeks ago

Maybe the request frequency was too high for the JIRA server to handle. Try decreasing the rate limit on the connection page and see how it goes.
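The scheduler line in the log above reads "10000 reqs / 1h0m0s (interval: 360ms)", which matches dividing the window by the number of requests. The sketch below shows that arithmetic for a few candidate limits; `requestInterval` is a hypothetical helper for illustration, and the assumption that DevLake derives the interval exactly this way rests only on that log line.

```go
package main

import (
	"fmt"
	"time"
)

// requestInterval returns the minimum gap between two requests for a limit
// of `requests` per `window`. A non-positive limit yields 0 (no spacing).
func requestInterval(requests int, window time.Duration) time.Duration {
	if requests <= 0 {
		return 0
	}
	return window / time.Duration(requests)
}

func main() {
	for _, perHour := range []int{10000, 5000, 1000} {
		fmt.Printf("%5d reqs/h -> one request every %v\n",
			perHour, requestInterval(perHour, time.Hour))
	}
}
```

Halving the limit from 10000 to 5000 requests per hour doubles the spacing from 360ms to 720ms, and 1000 requests per hour spaces requests 3.6s apart.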

dosubot[bot] commented 3 weeks ago

The JiraPlugin is failing on the "collectIssues" subtask with a 502 error when calling the agile/1.0/board/151/issue endpoint. This error typically indicates a bad gateway, meaning the server received an invalid response from an upstream server. Here are some potential causes and solutions:

  1. Server Overload: The Jira server might be overloaded or experiencing issues. Check the server status and ensure it is running smoothly.

  2. Network Issues: There might be network connectivity issues between the plugin and the Jira server. Verify the network connection and ensure there are no interruptions.

  3. API Rate Limiting: The Jira API might be rate-limited, causing the request to fail. Implement rate limiting in your API client to avoid hitting the Jira API rate limits.

  4. Retry Mechanism: Implement a retry mechanism to handle transient issues. Here is an example of how you might implement a retry mechanism in the getBoardFilterId function:

    func getBoardFilterId(data *JiraTaskData) (string, error) {
        url := fmt.Sprintf("agile/1.0/board/%d/configuration", data.Options.BoardId)
        var boardConfiguration *http.Response
        var err error
        for i := 0; i < 3; i++ { // Retry up to 3 times
            boardConfiguration, err = data.ApiClient.Get(url, nil, nil)
            if err == nil {
                break
            }
            time.Sleep(2 * time.Second) // Wait before retrying
        }
        if err != nil {
            return "", err
        }
        bc := &BoardConfiguration{}
        err = helper.UnmarshalResponse(boardConfiguration, bc)
        if err != nil {
            return "", err
        }
        filterId := bc.Filter.ID
        return filterId, nil
    }

This code retries the API call up to three times with a 2-second delay between attempts [1].

Additionally, ensure that your Jira API token has sufficient permissions and that the connection settings are correctly configured. If the issue persists, you may need to investigate further into the specific board configuration or data that might be causing the problem.
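The retry loop above only retries when the client returns an error and always waits a fixed 2 seconds. As a hedged alternative, the sketch below uses plain `net/http` (not DevLake's `ApiClient`) to retry on 5xx responses as well, doubling the wait after each failed attempt. `getWithRetry` and `runDemo` are hypothetical names, and the test server only simulates a gateway that fails twice.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// getWithRetry issues a GET and retries on transport errors and 5xx
// responses, doubling the wait after each failed attempt.
func getWithRetry(client *http.Client, url string, attempts int, wait time.Duration) (*http.Response, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(wait)
			wait *= 2
		}
		res, err := client.Get(url)
		if err != nil {
			lastErr = err
			continue
		}
		if res.StatusCode >= 500 {
			res.Body.Close() // release the connection before retrying
			lastErr = fmt.Errorf("server returned %s", res.Status)
			continue
		}
		return res, nil
	}
	return nil, fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}

// runDemo starts a local test server that answers 502 twice and then 200,
// and reports the final status code and how many calls were needed.
func runDemo() (status int, calls int, err error) {
	var n int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&n, 1) < 3 {
			w.WriteHeader(http.StatusBadGateway)
			return
		}
		fmt.Fprint(w, `{"issues":[]}`)
	}))
	defer srv.Close()

	res, err := getWithRetry(srv.Client(), srv.URL, 5, 10*time.Millisecond)
	if err != nil {
		return 0, int(atomic.LoadInt32(&n)), err
	}
	defer res.Body.Close()
	return res.StatusCode, int(atomic.LoadInt32(&n)), nil
}

func main() {
	status, calls, err := runDemo()
	fmt.Println("status:", status, "calls:", calls, "err:", err)
}
```

Retries only help with transient failures. In this issue the same request for board 151 fails every time, so a retry is unlikely to be the whole fix.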
