status-im / insights-bi

Status-website: Epics queries #15

Open marcelines opened 2 months ago

marcelines commented 2 months ago

What to do

  1. Could we update our current query to fetch burnup data using the epic's (label) ID? This would align with our shift towards label IDs instead of the previous E: prefix convention.
  2. Additionally, could we develop a query that aggregates burnup data across multiple epics within a workstream? It would accept an array of epic (label) IDs, so that the data reflects the collective progress of the epics in any given workstream.
apentori commented 1 month ago

From my understanding:

  1. Having a function gh_burnup_per_epic_by_id would imply adding the id field to gh_epics and gh_epic_issues. However, gh_epics and gh_epic_issues currently filter the labels with name LIKE '%E:%'. If you want to abstract away completely from the previous convention, that part would have to be removed (and I don't know exactly what the impact would be).
  2. The workstream is not a concept in GitHub, right? I don't know how to write a SQL function that takes a list parameter; if that is possible, this might be doable (a sketch of one possible approach follows below).

@JoseAnaya28 What do you think about that ?
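
For what it's worth, a "list" parameter in Postgres SQL is usually expressed as an array argument combined with = ANY (...). Below is a minimal sketch under the assumption that gh_burnup ends up carrying an epic_id column (bigint) alongside the date_field, total_opened_issues and total_closed_issues columns used later in this thread; the function name and body are illustrative, not a deployed definition.

-- Sketch only: aggregate burnup across a set of epic label ids.
-- Assumes gh_burnup has epic_id (bigint), date_field,
-- total_opened_issues and total_closed_issues.
CREATE OR REPLACE FUNCTION gh_burnup_for_epics(epic_label_ids bigint[])
RETURNS TABLE (
    date_field          date,
    total_opened_issues bigint,
    total_closed_issues bigint
)
LANGUAGE sql STABLE AS $$
    -- "= ANY (array)" is how a single array argument replaces a
    -- variable-length list of scalar arguments in SQL.
    SELECT b.date_field,
           sum(b.total_opened_issues)::bigint,
           sum(b.total_closed_issues)::bigint
    FROM public.gh_burnup AS b
    WHERE b.epic_id = ANY (epic_label_ids)
    GROUP BY b.date_field
    ORDER BY b.date_field;
$$;

If something like this were exposed through Hasura, note that Hasura's custom-function support generally expects the function to return SETOF an already-tracked table or view, so the return shape would need to be adapted accordingly.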

marcelines commented 1 month ago

  1. I think your consideration is correct. The idea is to make that convention obsolete and use only the label id to fetch the necessary data. Ideally, to avoid any impact from removing that part, we could keep both ways for now, if possible (a sketch of what such a dual filter could look like follows below).
  2. You're correct. A workstream is a concept that lives only in our platform; it is simply an entity that represents a group of epics (labels).
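
To illustrate what "keeping both ways for now" could mean at the query level: the label filter can accept either an explicit label id or the legacy E: name prefix during the transition. The snippet below is purely hypothetical; label_sample stands in for wherever the views actually read label data from, and the sample rows (including the id/name pairings) are invented apart from values quoted elsewhere in this thread.

-- Hypothetical transition filter: match an epic either by label id
-- (new convention) or by the legacy "E:" name prefix (old convention).
WITH label_sample(id, name) AS (
    VALUES (6014771087::bigint, 'Some epic label'),
           (4055362827::bigint, 'E:Bugfixes')
)
SELECT l.id   AS epic_id,
       l.name AS epic_name
FROM label_sample AS l
WHERE l.id = ANY (ARRAY[6014771087]::bigint[])  -- new: explicit label ids
   OR l.name LIKE 'E:%';                        -- old: "E:" prefix
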
JoseAnaya28 commented 1 month ago
  1. I've added the epic id to the gh_burnup table. @apentori

  2. Yes, as long as we have a way to identify the workstreams for the epics, we should be able to do it. Currently we don't have any workstream definitions, though. @marcelines is this something that is planned to be implemented somehow?

marcelines commented 1 month ago

Yes, Workstream is a concept that lives outside GitHub. It will only exist in our platform and represents a group of epics. So we'll need a query that aggregates burnup data across multiple epics; it should accept an array of ids as an argument.

apentori commented 1 month ago

Deployed in prod. There is, however, one odd thing: for epic id 4055362827 we have 5261 rows, 4493 of which have both total_closed_issues and total_opened_issues equal to 0 (a small diagnostic sketch follows below). @JoseAnaya28

It will be possible to query an aggregate of gh_burnup with a request like:

gh_burnup(where: {epic_id: {_in: [4055362827, 4055362827]}}) {
    date_field
    epic_id
    epic_name
    total_closed_issues
    total_opened_issues
  }

@marcelines is that good for you?
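
As a side note on the odd rows mentioned above, a grouped count makes it easy to see how many burnup rows per epic carry no counts at all. This is only a diagnostic sketch; it assumes gh_burnup is a Postgres relation with the column names used in the GraphQL query above.

-- Diagnostic sketch: per epic, how many gh_burnup rows exist and how
-- many of them have both counters at zero.
SELECT epic_id,
       count(*) AS total_rows,
       count(*) FILTER (WHERE total_opened_issues = 0
                          AND total_closed_issues = 0) AS zero_rows
FROM public.gh_burnup
GROUP BY epic_id
ORDER BY zero_rows DESC;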

marcelines commented 1 month ago

It seems that way. I tested it in https://hasura.infra.status.im/console/, though, and it's giving me this error:

{
  "errors": [
    {
      "message": "field 'epic_id' not found in type: 'gh_burnup_bool_exp'",
      "extensions": {
        "path": "$.selectionSet.gh_burnup.args.where.epic_id",
        "code": "validation-failed"
      }
    }
  ]
}
apentori commented 1 month ago

https://hasura.infra.status.im/console/ is the other URL, for the test environment. I'm going to take a look at why the field is not there.

marcelines commented 1 month ago

One more thing:

gh_epic_issues(where: {epic_name: {_eq: "E:Bugfixes"}}) {
    assignee
    author
    epic_name
    labels
    repository
  }

Given our change to stop using the old convention, we need to make sure that queries that currently expect epic_name can also accept a label id.

Does it make sense?

apentori commented 1 month ago

The field epic_id is now in gh_epic_issues:

gh_epic_issues(limit: 2) {
    id
    issue_number
    issue_url
    labels
    repository
    stage
    title
    created_at
    closed_at
    author
    assignee
    epic_color  
}

(already available in prod)

marcelines commented 1 month ago

Did you mean in the query's response?

I meant it as an argument, instead of the epic name.

apentori commented 1 month ago

It's GraphQL: you can set a condition in the query on any field that you have in the response:

query MyQuery {
  gh_epic_issues(limit: 2, where: {epic_id: {_eq: "6014771087"}}) {
    id
    issue_number
    issue_url
    labels
    repository
    stage
    title
    created_at
    closed_at
    author
    assignee
    epic_color
  }
}
marcelines commented 1 month ago

I know that. But since in your example you added the id to the response, I was just confirming. Thank you. 🙏

marcelines commented 1 month ago

I was testing this query and I got two different errors:

1 - Seems to be related to a field that does not exist in this query's return type:

{
  "errors": [
    {
      "message": "field 'id' not found in type: 'gh_epic_issues'",
      "extensions": {
        "path": "$.selectionSet.gh_epic_issues.selectionSet.id",
        "code": "validation-failed"
      }
    }
  ]
}

2 - Database query error:

{
  "errors": [
    {
      "message": "database query error",
      "extensions": {
        "path": "$",
        "code": "unexpected",
        "internal": {
          "arguments": [
            "(Oid 114,Just (\"{\\\"x-hasura-role\\\":\\\"admin\\\"}\",Binary))",
            "(Oid 0,Just (\"6014771087\",Text))"
          ],
          "error": {
            "description": null,
            "exec_status": "FatalError",
            "hint": "No operator matches the given name and argument types. You might need to add explicit type casts.",
            "message": "operator does not exist: text = bigint",
            "status_code": "42883"
          },
          "prepared": true,
          "statement": "SELECT  coalesce(json_agg(\"root\" ), '[]' ) AS \"root\" FROM  (SELECT  row_to_json((SELECT  \"_e\"  FROM  (SELECT  \"_root.base\".\"issue_number\" AS \"issue_number\", \"_root.base\".\"issue_url\" AS \"issue_url\", \"_root.base\".\"labels\" AS \"labels\", \"_root.base\".\"repository\" AS \"repository\", \"_root.base\".\"stage\" AS \"stage\", \"_root.base\".\"title\" AS \"title\", \"_root.base\".\"created_at\" AS \"created_at\", \"_root.base\".\"closed_at\" AS \"closed_at\", \"_root.base\".\"author\" AS \"author\", \"_root.base\".\"assignee\" AS \"assignee\", \"_root.base\".\"epic_color\" AS \"epic_color\"       ) AS \"_e\"      ) ) AS \"root\" FROM  (SELECT  *  FROM \"public\".\"gh_epic_issues\"  WHERE ((\"public\".\"gh_epic_issues\".\"epic_id\") = (($2)::bigint))    LIMIT 2 ) AS \"_root.base\"      ) AS \"_root\"      "
        }
      }
    }
  ]
}
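
For context, the second error means Postgres is comparing a text column against a bigint: the generated statement casts the parameter with ($2)::bigint while epic_id appears to be stored as text in that environment. One plausible way to align the types is sketched below; this is an assumption, the thread does not show the actual fix that was applied.

-- Plausible fix (assumption): make epic_id a bigint so it matches the
-- comparison Hasura generates. Only valid if gh_epic_issues is a table;
-- if it is a view, the cast belongs in the view definition instead.
ALTER TABLE public.gh_epic_issues
    ALTER COLUMN epic_id TYPE bigint USING epic_id::bigint;
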
apentori commented 1 month ago

Problem solved in the test environment (it was already working in prod).