sustainable-computing-io / kepler

Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, use ML models to estimate workload energy consumption based on these stats, and exports them as Prometheus metrics
https://sustainable-computing.io
Apache License 2.0
1.18k stars 184 forks source link

local-dev-cluster repo is not correctly cloned when running "make cluster-up" #721

Closed jiere closed 1 year ago

jiere commented 1 year ago

Describe the bug When running "make cluster-up", the local-dev-cluster repo is not correctly cloned. This may be due to change here

To Reproduce Steps to reproduce the behavior:

  1. Run "make cluster-up" or just run "git clone -b v0.0.0 https://github.com/sustainable-computing-io/local-dev-cluster.git --depth=1"

  2. See error msg below:

Note: switching to 'cc8cc366a0f86c891ff867aca914a95af535e2a6'.

You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may do so (now or later) by using -c with the switch command. Example:

git switch -c

Or undo this operation with:

git switch -

Turn off this advice by setting config variable advice.detachedHead to false

$ cd local-dev-cluster/ [jie@jie-dev local-dev-cluster]$ git branch

  1. Check code in local-dev-cluster directory, both main.sh and kind/common.sh are not up-to-date with the repo.

Expected behavior Correctly git clone the local-dev-cluster and do not break the new feature for #718

Additional context when remove the "-b v0.0.0" words from the git clone, no issue found yet.

jiere commented 1 year ago

@SamYuan1990

SamYuan1990 commented 1 year ago

en? https://github.com/sustainable-computing-io/kepler/actions/runs/5134947333/jobs/9239675792 is not breaking? as check out with specific version is git cli feature?

SamYuan1990 commented 1 year ago

or I am confusing as https://github.com/sustainable-computing-io/kepler/actions/runs/5134947333/jobs/9239675792#step:4:6 as the same git cli command has different result at your local or...? then I will request @jiere provides more details on your local environment.

SamYuan1990 commented 1 year ago
git clone -h
usage: git clone [<options>] [--] <repo> [<dir>]

    -v, --verbose         be more verbose
    -q, --quiet           be more quiet
    --progress            force progress reporting
    -n, --no-checkout     don't create a checkout
    --bare                create a bare repository
    --mirror              create a mirror repository (implies bare)
    -l, --local           to clone from a local repository
    --no-hardlinks        don't use local hardlinks, always copy
    -s, --shared          setup as shared repository
    --recursive[=<pathspec>]
                          initialize submodules in the clone
    --recurse-submodules[=<pathspec>]
                          initialize submodules in the clone
    -j, --jobs <n>        number of submodules cloned in parallel
    --template <template-directory>
                          directory from which templates will be used
    --reference <repo>    reference repository
    --reference-if-able <repo>
                          reference repository
    --dissociate          use --reference only while cloning
    -o, --origin <name>   use <name> instead of 'origin' to track upstream
    -b, --branch <branch>
                          checkout <branch> instead of the remote's HEAD
    -u, --upload-pack <path>
                          path to git-upload-pack on the remote
    --depth <depth>       create a shallow clone of that depth
    --shallow-since <time>
                          create a shallow clone since a specific time
    --shallow-exclude <revision>
                          deepen history of shallow clone, excluding rev
    --single-branch       clone only one branch, HEAD or --branch
    --no-tags             don't clone any tags, and make later fetches not to follow them
    --shallow-submodules  any cloned submodules will be shallow
    --separate-git-dir <gitdir>
                          separate git dir from working tree
    -c, --config <key=value>
                          set config inside the new repository
    --server-option <server-specific>
                          option to transmit
    -4, --ipv4            use IPv4 addresses only
    -6, --ipv6            use IPv6 addresses only
    --filter <args>       object filtering
    --remote-submodules   any cloned submodules will use their remote-tracking branch
SamYuan1990 commented 1 year ago
yuanyi@yuanyideMacBook-Pro tmp % git clone -b v0.0.0 https://github.com/sustainable-computing-io/local-dev-cluster.git --depth=1
Cloning into 'local-dev-cluster'...
remote: Enumerating objects: 14, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 14 (delta 1), reused 8 (delta 0), pack-reused 0
Unpacking objects: 100% (14/14), done.
Note: switching to 'cc8cc366a0f86c891ff867aca914a95af535e2a6'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

yuanyi@yuanyideMacBook-Pro tmp % ls
demo            java-gm         local-dev-cluster   react-express-fullstack tape
yuanyi@yuanyideMacBook-Pro tmp % cd local-dev-cluster 
yuanyi@yuanyideMacBook-Pro local-dev-cluster % ls
LICENSE     README.md   kind        main.sh
yuanyi@yuanyideMacBook-Pro local-dev-cluster % git status
Not currently on any branch.
nothing to commit, working tree clean
yuanyi@yuanyideMacBook-Pro local-dev-cluster % git log -1
commit cc8cc366a0f86c891ff867aca914a95af535e2a6 (grafted, HEAD, tag: v0.0.0)
Author: Huamin Chen <rootfs@users.noreply.github.com>
Date:   Tue May 23 08:04:53 2023 -0400

    Merge pull request #9 from vprashar2929/fix

    Fix flaky common.sh

@jiere , I can't reproduce at my local.... let me know what's your specific case.

jiere commented 1 year ago

@SamYuan1990 , actually you have reproduced:) Please check the code content of your cloned repo, both main.sh and kind/common.sh do not include the change in local-dev-cluster repo's PR#10.

jiere commented 1 year ago

en? https://github.com/sustainable-computing-io/kepler/actions/runs/5134947333/jobs/9239675792 is not breaking? as check out with specific version is git cli feature?

The current job is not broken because it only checks if the make cluster-up result is OK, the current content of cloned local-dev-cluster won't break it. However, if other cases need to leverage the further actions there. i.e. run./main.sh down to tear down the kind cluster, will definitely fail, because the current cloned main.sh still the old one and does not support any options such as up and down.

jiere commented 1 year ago

I am just curious about the current usage here, why hardcode to be -b v0.0.0?

I also checked the branch/tag of local-dev-cluster repo $ git branch -a main remotes/origin/HEAD -> origin/main remotes/origin/main $ git tag v0.0.0 $ git log ...... commit cc8cc366a0f86c891ff867aca914a95af535e2a6 (tag: v0.0.0) Merge: 74fd492 37091d3 Author: Huamin Chen rootfs@users.noreply.github.com Date: Tue May 23 08:04:53 2023 -0400

Merge pull request #9 from vprashar2929/fix

Fix flaky common.sh

......

As above git log shown, the v0.0.0 tag was on 05/23 commit, so all the commits later than that time was missing when using "git clone -b v0.0.0"...

SamYuan1990 commented 1 year ago

@SamYuan1990 , actually you have reproduced:) Please check the code content of your cloned repo, both main.sh and kind/common.sh do not include the change in local-dev-cluster repo's PR#10.

yes I know that. well it's different with git clone -b is an issue.

SamYuan1990 commented 1 year ago

@jiere , TL;DR the check out with the tag version v0.0.0 with the commit id is correct one by design.

Here is some back ground with different options through our community and I try to make a trade off for now as https://github.com/sustainable-computing-io/kepler/blob/main/enhancements/CICDv1.md +@rootfs here. or as you mentioned in https://github.com/sustainable-computing-io/local-dev-cluster/pull/10, Once I blocked it by make kepler consume with tag 1st.

In recently, there are lots of voices about versioning tagging etc. https://github.com/sustainable-computing-io/kepler-helm-chart/pull/15#discussion_r1181237313 https://github.com/sustainable-computing-io/kepler-helm-chart/pull/28#issuecomment-1556043428 from +@bradmccoydev, even if not using tag but specific commit hash, or some requirements from OpenSSF point of view.

So far I would like to use tag.

@jiere , maybe a solution as a trade off is that if we have local-cluster folder, then we skip the git clone. But anyway which is different with git clone -b exists with 1 in bash command, as the make scripts implements in error.

@jiere ,@rootfs , I am not considering with OpenSSF as I mentioned at https://github.com/sustainable-computing-io/kepler/pull/717#issuecomment-1570152691, but open for any discussion. For me, I will review from:

  1. Don't break CI.
  2. Benefits for our life.
bradmccoydev commented 1 year ago

My opinion on this is to do a tag for releases for each repo, and folks can choose to either use the main branch, the tag, or sha. Doing releases provides the options eg 0.0.1, 0.0.2, 0.0.3, etc to stay on that code or rollback if they are not ready for the new code. It doesn't hurt anyone, and then if people don't want to reference the tag, they can simply point to the main branch. Pointing to the main branch is fine in scenarios but giving other people the option to reference versions is an easy thing to do. Happy to talk about it in a meeting if you like, I hope that makes sense

SamYuan1990 commented 1 year ago

My opinion on this is to do a tag for releases for each repo, and folks can choose to either use the main branch, the tag, or sha. Doing releases provides the options eg 0.0.1, 0.0.2, 0.0.3, etc to stay on that code or rollback if they are not ready for the new code. It doesn't hurt anyone, and then if people don't want to reference the tag, they can simply point to the main branch. Pointing to the main branch is fine in scenarios but giving other people the option to reference versions is an easy thing to do. Happy to talk about it in a meeting if you like, I hope that makes sense

I do agree.

jiere commented 1 year ago

@SamYuan1990 and @bradmccoydev , I do like the idea of tag, what I am confusing is that why we hardcode the tag value to v0.0.0 in kepler's Makefile? Since most Kepler developers use local-dev-cluster as their dev setup, so kepler repo is the most important consumer of local-dev-cluster, right? Per my understanding, the fix here won't break any CI also. I do agree that local-dev-cluster itself will not be changed frequently, but if we really need a tag in kepler Makefile's git clone operations, why not use a newer tag? Otherwise, for consumers like me and @vprashar2929 needs to manually(or hardcode in scripts) switch the cloned repo to main(or other tags) to catch up with our expected code each time...

BTW, could you also share the method to switch to main branch or other tags? I just tried failed like this:

$ git branch
** (no branch)
$ git log -1
commit cc8cc366a0f86c891ff867aca914a95af535e2a6 (grafted, HEAD, tag: v0.0.0)
Author: Huamin Chen <rootfs@users.noreply.github.com>
Date:   Tue May 23 08:04:53 2023 -0400

    Merge pull request #9 from vprashar2929/fix

    Fix flaky common.sh
$ git switch -c main
Switched to a new branch 'main'
$ git branch -a
** main
$ git log
commit cc8cc366a0f86c891ff867aca914a95af535e2a6 (grafted, HEAD -> main, tag: v0.0.0)
Author: Huamin Chen <rootfs@users.noreply.github.com>
Date:   Tue May 23 08:04:53 2023 -0400

    Merge pull request #9 from vprashar2929/fix

    Fix flaky common.sh
$ git pull
Already up to date.
rootfs commented 1 year ago

@jiere good point. @SamYuan1990 @bradmccoydev how about creating release for local-dev-cluster the same way as helm chart? Each time the helm chart gets an update, the reversion bumps up as well. In the same manner, the kepler repo will use the newer release from local-dev-cluster to keep up with the changes.

jiere commented 1 year ago

Thanks @rootfs for your suggestion, lgtm:) folks, I also just found that current git clone usage has a weak point: https://github.com/sustainable-computing-io/kepler/blob/main/hack/cluster-up.sh#LL24C88-L24C97, if we use "--depth=1", we miss the original/main branch info, cannot switch back anymore; even though we do not use "--depth=1", the switch back to main should be like this (need altogether three git commands...):

$ git switch -c main
$ git branch --set-upstream-to=origin/main main
$ git pull
SamYuan1990 commented 1 year ago

@jiere good point. @SamYuan1990 @bradmccoydev how about creating release for local-dev-cluster the same way as helm chart? Each time the helm chart gets an update, the reversion bumps up as well. In the same manner, the kepler repo will use the newer release from local-dev-cluster to keep up with the changes.

1st of all, I don't want to force binding, unless CI usage. Here is my point, in kepler's repo, we allow user to use their own local-dev-cluster. We just use a stable version as CI! @jiere , you can make your own fork locally for local development. Or considering some cases below:

Hence, if you want to use cluster down, you can enable at your local or commit a PR to default branch. But before local-dev-cluster version bump up to v0.0.1, or some how, before release 0.6 in kepler, this cluster down feature is something as experiment. That's for release management point of view:

For the tag and commit out of the tag: I don't like to trace the time line as audit... but as the time flow, the pr adding a new parameter as up/down. which I blocked it and tag without it. The PR version as I reviewed. https://github.com/sustainable-computing-io/local-dev-cluster/pull/10/commits/260479453ae557a80fb91bdbc61b24281c27bc62

The comments out as I blocking for tag, as the new parameter been treated as a breaking change. https://github.com/sustainable-computing-io/local-dev-cluster/pull/10#pullrequestreview-1451140383 And I created tags to make kepler consume local-dev-cluster with stable version in CI. That's the reason why this PR out of v0.0.0 scope.

Honestly, I know there is an other option as the second commit as https://github.com/sustainable-computing-io/local-dev-cluster/pull/10/commits/232fce7b1afcf4edb5c36ab5939e223a93280b49 But... we always need a switch for tag.

As a summary, We can't avoid local version of local-cluster-dev for developer usage. As either path issue or experiment features. We just keep CI works with a stable version.

SamYuan1990 commented 1 year ago

I feel like @jiere really want local-dev-cluster with 1st parameter control cluster up or down. like

./main.sh up/down

but in my point of view, the 1st parameter maybe KIND or mircoshift(#182) even if for local dev cluster usage. hence there is another chance for us as below

./main.sh KIND/microshift up/down

which is not finalized... @rootfs , could you please help push #182 ?

vprashar2929 commented 1 year ago

Absolutely @SamYuan1990 I agree with the notion of keeping CI at a stable version as there will always be scope for improvement/changes in local-dev-cluster and it will be hard to keep track of updating its usage dependencies everywhere in future

jiere commented 1 year ago

haha, actually I am preparing scripts for this PR #711 , since they are automation, we must guarantee that the cluster is setup/teardown properly among different cases, this is why I like this make cluster-up/cluster-down command :D

Anyway, I think I totally got your point, Sam. I will try to modify my local scripts to mitigate this. But last thing I want to repeat, could you kindly remove the "--depth=1" then?

jiere commented 1 year ago

but in my point of view, the 1st parameter maybe KIND or mircoshift(#182) even if for local dev cluster usage

May I ask an easy question: is microshift free for use or not? It sounds like an "Edge Computing" version of OpenShift, as we all know that, OpenShift needs license...

SamYuan1990 commented 1 year ago

haha, actually I am preparing scripts for this PR #711 , since they are automation, we must guarantee that the cluster is setup/teardown properly among different cases, this is why I like this make cluster-up/cluster-down command :D

Anyway, I think I totally got your point, Sam. I will try to modify my local scripts to mitigate this. But last thing I want to repeat, could you kindly remove the "--depth=1" then?

For #711 , please take a look at: https://github.com/sustainable-computing-io/kepler/pull/711#pullrequestreview-1457750045

For --depth=1 as a result no for now, just checkout a specific branch with --depth=1 will save a lot of time in CI as it just checkout a single commit.(for performance consideration)

an suggestion is made in #711

SamYuan1990 commented 1 year ago

but in my point of view, the 1st parameter maybe KIND or mircoshift(#182) even if for local dev cluster usage

May I ask an easy question: is microshift free for use or not? It sounds like an "Edge Computing" version of OpenShift, as we all know that, OpenShift needs license...

+@rootfs

vprashar2929 commented 1 year ago

May I ask an easy question: is microshift free for use or not? It sounds like an "Edge Computing" version of OpenShift, as we all know that, OpenShift needs license...

I use Microshift for CI purposes and its free to use.

rootfs commented 1 year ago

@vprashar2929 can you create a microshift env for CI? Thanks

vprashar2929 commented 1 year ago

Sure I am happy to work on this🙂. I suspect we use these steps to deploy Kepler on OCP or we wish something along similar lines with our current integration tests?

rootfs commented 1 year ago

@vprashar2929 manifests will be a good start. Thank you! @husky-parul this could be reused in operator

SamYuan1990 commented 1 year ago

@vprashar2929 manifests will be a good start. Thank you! @husky-parul this could be reused in operator

@rootfs , @vprashar2929 , may suggestion is to add shell script into local-dev-cluster as kepler-action invokes local-dev-cluster to start up k8s cluster for test. then... kepler, operator, helm chart can cover the test case by invoke kepler-action.

vprashar2929 commented 1 year ago

Sure adding microshift cluster start-up/down in local-dev-cluster sounds good.