Open SoumayaMauthoorMOJ opened 5 months ago
@jacobwoffenden let me know if you need more info :-)
According to GitHub:
Personal access tokens are intended to access GitHub resources on behalf of yourself. To access resources on behalf of an organization, or for long-lived integrations, you should use a GitHub App. For more information, see "About creating GitHub Apps."
@tomholt1 as discussed, please liaise on this ticket with the AP :-)
@tomholt1 - Can we just clarify what the ask is for the AP team here? We can definitely generate a PAT for the runners, but it's worth us understanding what other facilitation your team will require (if any).
Hey @Ed-Bajo, I think the context has changed a little since the ticket was opened. We won't need a PAT, as we plan to create a GitHub App and authenticate that way using a JWT. What we require is just the relevant permissions to run the above curl command in a DAG, which will trigger a GitHub Action when the Airflow job succeeds.
@tomholt1 if you're going down the route of a GitHub App, I'm not sure you need input from our team anymore. It was originally suggested because we could provision a fine-grained access token using @moj-data-platform-robot.
Is this the GitHub App https://github.com/organizations/moj-analytical-services/settings/apps/airflow-dags-github-actions ?
Nope, I haven't set one up yet, although I'd be intrigued to know who set that up, as I wonder if they're trying to achieve the same thing we are. So just to confirm: we don't need any updated Airflow permissions to run the above curl command in an Airflow DAG?
I can't easily see when a GitHub App was created or by whom.
As for using a GitHub App to trigger a GitHub Actions workflow, I'm not sure this is something the Analytical Platform would want to facilitate, so it would be the responsibility of Data Engineering to ensure the application is correctly configured, and the tokens are stored securely for consumption within an Airflow DAG.
It might also be worth consulting with @ministryofjustice/operations-engineering about GitHub App vs. fine-grained token for this use case.
We currently do not have a process for combining Airflow -> create-a-derived-table pipelines, apart from simply guessing the Airflow pipeline completion time and scheduling the time appropriately. There are multiple solutions, but a simple one is to use an Airflow bash operator and a curl command to trigger the GitHub Action via workflow_dispatch, e.g.:
curl -L \
  -X POST \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer <auth_token>" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  https://api.github.com/repos/moj-analytical-services/create-a-derived-table/actions/workflows/<action_name>/dispatches \
  -d '{"ref":"main"}'
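For reference, the same request could also be made from Python inside the DAG rather than through bash. This is only a minimal sketch using the standard library: the function names and the `workflow_file` parameter are made up for illustration, and the token is assumed to be obtained elsewhere (for example from whatever secret store ends up being used).

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_dispatch_request(token, workflow_file, ref="main",
                           owner="moj-analytical-services",
                           repo="create-a-derived-table"):
    """Build (but do not send) the workflow_dispatch POST request.

    `workflow_file` stands in for <action_name> in the curl command,
    i.e. the workflow file name or ID (hypothetical parameter name).
    """
    url = (f"{GITHUB_API}/repos/{owner}/{repo}"
           f"/actions/workflows/{workflow_file}/dispatches")
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


def trigger_workflow(token, workflow_file, ref="main"):
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError on 4xx/5xx responses, which
    would fail the Airflow task if left uncaught.
    """
    request = build_dispatch_request(token, workflow_file, ref)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

Keeping the request construction separate from the network call means the URL, headers and body can be checked in a unit test without touching GitHub.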
The plan is now to create a GitHub App in moj-analytical-services to control the permissions for the above curl request. After some digging it looks like we will need to create a JWT to authenticate the request.
More on JWT for authentication: we will need a PEM file and a Client ID to generate the JWT.
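To make the JWT step concrete, here is a rough sketch of how the token could be assembled in Python. It is an illustration rather than a tested integration: all function names are hypothetical, and the RS256 signature is delegated to a caller-supplied `sign` callable because the standard library has no RSA signing (PyJWT or `cryptography` would be the usual choice, which is an assumption about what is installed in the Airflow image). As far as I understand the GitHub docs, the JWT only authenticates as the app itself; it is then exchanged for a short-lived installation access token, and that token is what goes in the `Authorization: Bearer` header of the dispatch request.

```python
import base64
import json
import time
import urllib.request


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(client_id, now=None):
    """Return the '<header>.<claims>' string that gets signed."""
    now = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iat": now - 60,   # backdated one minute to tolerate clock drift
        "exp": now + 540,  # GitHub caps expiry at 10 minutes ahead
        "iss": client_id,  # the GitHub App's Client ID
    }
    return ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )


def make_jwt(client_id, sign, now=None):
    """Assemble the JWT; `sign` maps bytes -> RS256 signature bytes.

    With the `cryptography` package (an assumption), `sign` could be:
        key = serialization.load_pem_private_key(pem_bytes, password=None)
        sign = lambda msg: key.sign(msg, padding.PKCS1v15(), hashes.SHA256())
    """
    signing_input = build_signing_input(client_id, now)
    signature = sign(signing_input.encode("ascii"))
    return f"{signing_input}.{b64url(signature)}"


def build_installation_token_request(jwt_token, installation_id):
    """Build the request that swaps the JWT for an installation token."""
    return urllib.request.Request(
        f"https://api.github.com/app/installations/{installation_id}/access_tokens",
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )
```

The PEM file should be read from a secret store at run time rather than committed alongside the DAG code.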
The GitHub App should hold the following permissions:
I don't have the access needed to create a GitHub App, so it would be great if one could be created for us, along with some guidance on authenticating. Thanks, team!
To be discussed at next refinement session
@darren1988 can you invite @tomholt1 and me to the refinement session? I think @tomholt1 has made some progress on this with another team.
No progress has been made, as I need the GitHub App created. I just chased @darren1988 on this.
oops :-) thanks for clarifying
Any update on this? I'm conscious @tomholt1 and I will be leaving soon :-)
@SoumayaMauthoorMOJ this ticket has been refined and is scheduled to go into our next sprint, which runs from 10/10/24 to 30/10/24.
Describe the feature request.
Grant the Airflow DAG permission to trigger the create-a-derived-table GitHub workflow using the most appropriate method, which could be saving a GitHub personal access token to AWS Secrets/Parameters, or using a GitHub App.
Describe the context.
We currently do not have a process for combining Airflow -> create-a-derived-table pipelines, apart from simply guessing the Airflow pipeline completion time and scheduling the time appropriately. There are multiple solutions, but a simple work-around is to use an Airflow bash operator and a curl command to trigger the GitHub Action via workflow_dispatch, e.g.:
This would require the Airflow DAG to have the relevant GitHub permission.
Value / Purpose
No response
User Types
No response