shankari opened this issue 1 month ago
I have segregated the tasks into two individual groups (CI/CD, Core code) + one combo group(CI/CD + Core)
A. CI/CD
B. Core code
C. Core Code + CI/CD
The order I plan to work on these:
A. CI/CD: 7, 6, 2, 8, 4, 1, 3, 5
B. Core: 2, 1
C. Core + CI/CD: 1, 2
Fixed as a part of the redesign changes itself.
Added certificates externally
only in server repo (commit)
Only need them in images that need to connect to the AWS Document DB (comment)
Since the same base server image is used by all the server-image-dependent containers (webapp, analysis, admin-dash, public-dash-notebook), we have added them right at the source image, which is the external server image. This way we ensure that the certificates are present in all the cascading images.
Based on comment below, added them to internal and removed from external.
@MukuFlash03 right, we have implemented this. But I am suggesting that we revisit that decision.
Discuss where to put the cert configuration. We originally had it in the internal repos, then we moved it to the external repos. But now that we have one internal dockerfile per external dockerfile, maybe we can have it be internal after all. It doesn't actually hurt anything to be external, but it is unnecessary in the external repo.
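For illustration, a hypothetical sketch of what the cert step could look like in the internal Dockerfile (the destination path is an assumption; the bundle is the AWS RDS/DocumentDB CA bundle):

```dockerfile
# Hypothetical: fetch the AWS CA bundle so containers can connect to the
# AWS Document DB over TLS. Keeping this line in the internal Dockerfile
# keeps it out of the external repo, where it is unnecessary.
ADD https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem /etc/ssl/certs/global-bundle.pem
```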
Shankari's comments
Consider switching from $GITHUB_ENV to step outputs. I think step outputs are the recommended approach to pass values from one step to another, but we should verify.
I couldn’t find any official statement on which is the recommended approach. But I did see this warning that says “set-output” is deprecated; switch to using environment files. Note that this doesn’t say that “steps.output” is deprecated; it says “set-output” is deprecated.
So, step outputs are still valid and we essentially have to choose between: GITHUB_ENV and GITHUB_OUTPUT
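For reference, a hypothetical workflow sketch showing both mechanisms side by side (the job and step names here are invented, not from our actual workflows):

```yaml
# Hypothetical sketch comparing GITHUB_ENV and GITHUB_OUTPUT.
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      # Job output, visible to all dependent downstream jobs.
      image_tag: ${{ steps.tag.outputs.image_tag }}
    steps:
      # GITHUB_OUTPUT: the value becomes a step output.
      - id: tag
        run: echo "image_tag=2024-09-20--13-36" >> "$GITHUB_OUTPUT"
      # GITHUB_ENV: the value becomes an env var for later steps in the same job.
      - run: echo "IMAGE_TAG=2024-09-20--13-36" >> "$GITHUB_ENV"
      - run: 'echo "env: $IMAGE_TAG, step output: ${{ steps.tag.outputs.image_tag }}"'
  use-tag:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - run: 'echo "job output: ${{ needs.build.outputs.image_tag }}"'
```

The key difference visible here: the `GITHUB_ENV` value stays inside the `build` job, while the step output can be promoted to a job output and read by the downstream `use-tag` job.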
One argument in favor of GITHUB_OUTPUT is this: the GITHUB_OUTPUT documentation mentions that job outputs are available to all dependent downstream jobs, so GITHUB_OUTPUT could be more suitable.

Shankari's comments
Explore whether we can stop uploading the date of the run as an artifact since we are using an .env file in the dashboards. Can't this just be an output of the run?
REST API endpoints
I looked at some REST API endpoints to see if we can access outputs of jobs/steps in a run from outside the workflow run. We could then use them directly in the internal script to pull all tags.
Not Suitable
Get a workflow run: I did not find an API endpoint to directly reference outputs inside a job. This endpoint gets details about a workflow run but does not have info on outputs.
List jobs for a workflow run: This lists all jobs and steps, but it only includes their completion status and time of execution, not outputs or any other details.
Download workflow run logs
I did however find an API endpoint that allows downloading workflow run logs.
This does give the outputs but it also gives everything from the logs as seen in the UI.
This is a lot of redundant information that we will need to parse to only fetch the outputs.
It downloads the logs as a zip file, which would again be another hassle: download -> extract -> read / parse -> clean up files.
Optimal Approach
The repository contents API endpoint returns the file contents base64 encoded, in this format:
...
"download_url": "https://raw.githubusercontent.com/MukuFlash03/em-public-dashboard/cleanup-cicd/.env",
"type": "file",
"content": "Tk9URUJPT0tfSU1BR0VfVEFHPTIwMjQtMDktMjAtLTEzLTM2CkZST05URU5E\nX0lNQUdFX1RBRz0yMDI0LTA5LTIwLS0wNC0zOApTRVJWRVJfSU1BR0VfVEFH\nPTIwMjQtMDktMjAtLTA5LTEwCg==\n",
...
Decoding this base64 data (for decoding, the newline characters '\n' need to be removed so that the chunks combine into one encoded string) gives:
NOTEBOOK_IMAGE_TAG=2024-09-20--13-36
FRONTEND_IMAGE_TAG=2024-09-20--04-38
SERVER_IMAGE_TAG=2024-09-20--09-10
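The decoding can be scripted; a minimal shell sketch (assumes GNU `base64`), using the "content" value from the response above:

```shell
# The "content" field from the API response: base64 with embedded
# newline separators, exactly as returned.
CONTENT="Tk9URUJPT0tfSU1BR0VfVEFHPTIwMjQtMDktMjAtLTEzLTM2CkZST05URU5E
X0lNQUdFX1RBRz0yMDI0LTA5LTIwLS0wNC0zOApTRVJWRVJfSU1BR0VfVEFH
PTIwMjQtMDktMjAtLTA5LTEwCg=="

# Remove the newline separators so the chunks combine into one encoded
# string, then decode to the plain-text .env contents.
DECODED=$(printf '%s' "$CONTENT" | tr -d '\n' | base64 -d)
printf '%s\n' "$DECODED"
```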
Task A-2: Avoid uploading date as Artifacts -> Use .env file instead? [Contd.]
With the API endpoint mentioned above, we can directly read the contents of files.
Proposed Approach
Have a separate .env file in each of the repositories: server, join-page, admin-dash, public-dash. This file would contain the image tags of the latest uploaded images.
The server image tag is needed in the dashboard repos' .env files since the Dockerfiles use the server tag as an ARG.
- In server repo: SERVER_IMAGE_TAG
- In join-page repo: JOIN_IMAGE_TAG
- In admin-dash repo: ADMIN_DASH_IMAGE_TAG, SERVER_IMAGE_TAG
- In public-dash repo: PUBLIC_DASH_NOTEBOOK_IMAGE_TAG, PUBLIC_DASH_FRONTEND_IMAGE_TAG, SERVER_IMAGE_TAG
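As a sketch, the public-dash repo's .env could then look like this (tags borrowed from the decoded example above):

```shell
PUBLIC_DASH_NOTEBOOK_IMAGE_TAG=2024-09-20--13-36
PUBLIC_DASH_FRONTEND_IMAGE_TAG=2024-09-20--04-38
SERVER_IMAGE_TAG=2024-09-20--09-10
```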
Pros of this approach:
- A single .env in each repo can store all image tags relevant to that repository (no separate files like .env for the server tag and .env.repoTags for the frontend tag).
- Can fetch SERVER_IMAGE_TAG from the .env file in the server repo directly.

One caveat: for public-dash, we still need the checks since the frontend tag depends on the event type.

Shankari's comment:

> why use something complicated which retrieves the file as a base64 encoded string instead of just reading the files directly using the raw option?
I just thought of sticking to using the GitHub REST API and was looking for options within that. But reading the raw contents as text is a much simpler option indeed.
It also does not require any headers or authorization token for the request, which is fine since it is publicly available data anyway.
Will switch out the URLs and remove the base64 code.
Thank you for pointing that out!
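A minimal sketch of the raw approach (the URL is the download_url from the API response above; the grep/cut parsing here is an illustration, not the actual internal script):

```shell
# Read the .env file directly as raw text; no auth headers are needed
# since the repo is public.
RAW_ENV=$(curl -sL "https://raw.githubusercontent.com/MukuFlash03/em-public-dashboard/cleanup-cicd/.env")

# Pull out a single tag, e.g. the server image tag.
SERVER_IMAGE_TAG=$(printf '%s\n' "$RAW_ENV" | grep '^SERVER_IMAGE_TAG=' | cut -d'=' -f2)
echo "$SERVER_IMAGE_TAG"
```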
Currently the docker images are tagged as <main_branch>_<date>. However, this causes a problem if the docker image upload is done from a different branch in the external GitHub workflow, since the complete tag returned in the internal script would still have the default hardcoded branch_name but the latest timestamp.
Hence we are now storing the same tag that the docker image is tagged with, including both the branch_name and the timestamp.
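A sketch of how the full tag could be built and stored (the variable names and the branch fallback are hypothetical; the timestamp format follows the tags shown above):

```shell
# In a workflow run, GITHUB_REF_NAME holds the branch that triggered the
# run; fall back to a hypothetical default when running locally.
BRANCH_NAME="${GITHUB_REF_NAME:-master}"
TIMESTAMP=$(date +'%Y-%m-%d--%H-%M')

# Store the complete tag (branch + timestamp), not just the timestamp,
# so the internal script does not have to guess the branch.
FULL_TAG="${BRANCH_NAME}_${TIMESTAMP}"
echo "FULL_TAG=${FULL_TAG}"
```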
Copied over pending tasks from: https://github.com/e-mission/e-mission-server/pull/961#issuecomment-2272509467
- Move the images to ghcr.io so that we don't run into issues with usage limits
- Stop uploading the date of the run as an artifact since we are using an .env file in the dashboards. Can't this just be an output of the run?
- Remove the secret auth method since it is unused
- Add e-mission-common to the various repos
- Create e-mission-core as a separate library that is similar to e-mission-common
- Pull e-mission-core into the dashboard repos then; we don't need the full analysis pipeline
- The image tag is <main_branch>_<date> but the uploaded/stored tag is only the date, so we need to prepend the branch externally. And the branch is not always the same.