Where to host cs.k8s.io

ameukam commented 3 years ago

https://cs.k8s.io is running on a bare-metal server provided by Equinix Metal (formerly Packet) under the CNCF budget and has been operated until now by @dims.

The question was raised of whether or not we should host CodeSearch on the aaa cluster.

Ref: https://kubernetes.slack.com/archives/CCK68P2Q2/p1615204807111900?thread_ts=1615189697.108500&cid=CCK68P2Q2

Opening this issue to track the discussion and reach consensus on this.

nikhita commented 3 years ago

@dims where is the original source code for cs.k8s.io? :eyes:

nikhita commented 3 years ago

/wg k8s-infra

ameukam commented 3 years ago

/sig contributor-experience
/priority backlog

/assign @spiffxp
cc @mrbobbytables @alisondy @cblecker @munnerz

ameukam commented 3 years ago

> @dims where is the original source code for cs.k8s.io? :eyes:

@nikhita You can find the config here https://github.com/dims/k8s-code.appspot.com/

BenTheElder commented 3 years ago

What's the argument against hosting it on AAA?

dims commented 3 years ago

@BenTheElder nothing other than someone has to do it :) oh, i don't know how to wire the ingress/dns stuff

i tried a long time ago :) https://github.com/kubernetes/k8s.io/pull/96

ameukam commented 3 years ago

> What's the argument against hosting it on AAA?

I would say lack of artifact destined for aaa (aka no up-to-date container image for hound). We could host the image on k8s-staging-infra-tools.

nikhita commented 3 years ago

@ameukam should this issue be migrated to the k/k8s.io repo?

ameukam commented 3 years ago

@nikhita I'm not sure about the right place for this issue. I just wanted to put this on the radar of the SIG ContribEx TLs and Chairs.

BenTheElder commented 3 years ago

it should be under k/k8s.io imho. I think we should host it on AAA fwiw.

nikhita commented 3 years ago

Moving to k8s.io repo. slack discussion - https://kubernetes.slack.com/archives/CCK68P2Q2/p1623300972130500

spiffxp commented 3 years ago

/sig contributor-experience
/wg k8s-infra

jimdaga commented 3 years ago

I took a stab at onboarding codesearch; @spiffxp could I get your input? I want to make sure I didn't miss anything. I want to stage all the infra, and get it deployed via prow first. Then we can follow up with another PR to cut-over DNS when we are ready.

https://github.com/kubernetes/k8s.io/pull/2513
https://github.com/kubernetes/test-infra/pull/23201

I could also work on adding the docker build logic after, but I haven't worked in that repo yet so I'll have to do some digging.

cc @dims

spiffxp commented 3 years ago

/priority important-soon
/milestone v1.23

justaugustus commented 3 years ago

What about using https://sourcegraph.com/kubernetes to minimize the maintenance burden here? This is something I suggested to @dims in the past, but didn't have the bandwidth to do at the time.

dims commented 3 years ago

choices are:

1. leave things where they are
2. move to k8s wg infra
3. redirect cs.k8s.io to sourcegraph

- i have been taking care of 1 already for a while with minimal downtime, so i am ok with continuing to do so
- if someone wants to do 2, i am happy to help, show how things are set up, and we can shut down the equinix vm
- i personally don't like option 3; i love the hound UX. if the consensus is we should go with 3, that is fine with me. I am happy to run a personal instance on a custom domain for myself (community is welcome to use)

if i missed any other options, please feel free to chime in.

spiffxp commented 3 years ago

/unassign

jimdaga commented 3 years ago

FYI: If choice 2 is picked, my two PRs are pretty much ready to stage codesearch in the aaa cluster. There are a few small things that need to happen after the merge, but that's documented in my PRs.

dims commented 3 years ago

thanks @jimdaga

+1 to give option 2 a shot. will let Aaron and Arnaud review and merge all 3 PRs

ameukam commented 2 years ago

/milestone v1.24

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

ameukam commented 2 years ago

/remove-lifecycle stale

nikhita commented 2 years ago

@ameukam what is remaining here?

ameukam commented 2 years ago

> @ameukam what is remaining here?

Deploy a canary instance from https://github.com/kubernetes/k8s.io/pull/2513. Once we have confidence in that instance, we can roll out a prod instance.

Priyankasaggu11929 commented 2 years ago

/assign

@nikhita, I'm interested in helping with setting up a canary instance.

Priyankasaggu11929 commented 2 years ago

Post-merge checklist items from PR https://github.com/kubernetes/k8s.io/pull/2513 that still need work:

- [ ] Publish cs-fetch-repo docker image (Open PR: https://github.com/kubernetes/test-infra/pull/25576)
- [ ] Update deployment to use deployed docker image (using a temp image for now)
- [X] After testing is completed, cutover DNS to new K8s hosted IP. (Done by https://github.com/kubernetes/k8s.io/pull/3416)

@pmgk07, once https://github.com/kubernetes/test-infra/pull/25576 is merged and the cs-fetch-repos image is published under k8s infra, the next step is to update codesearch/deployment.yaml#L27 to use that hosted image.

ameukam commented 2 years ago

> Update deployment to use deployed docker image (using a temp image for now)

@Priyankasaggu11929 Let's give @jimdaga the final call about this. There are possible changes that need to be added to the Docker image.

jimdaga commented 2 years ago

Now that https://github.com/kubernetes/k8s.io/pull/3492 is merged, I see codesearch is deployed in the cluster!

However, it looks like the init containers are crashing:

kubectl get pods -n codesearch
NAME                         READY   STATUS                  RESTARTS   AGE
codesearch-5b975d449-lgm9b   0/1     Init:CrashLoopBackOff   8          19m
codesearch-5b975d449-zzqkl   0/1     Init:CrashLoopBackOff   8          19m

I'm out of the office right now, so I can't do a full debug. But it does seem like something needs fixing :( (I also don't have access to view pod logs, so not sure how to get that)

Priyankasaggu11929 commented 2 years ago

> Let's give @jimdaga the final call about this. There are possible changes that need to be added to the Docker image.

+1. Yes 🙂

There's also an error decoding the ingress in the build logs of the post-k8sio-deploy-app-codesearch job.

I've raised a minor patch fix: https://github.com/kubernetes/k8s.io/pull/3502

ameukam commented 2 years ago

> Now that #3492 is merged, I see codesearch is deployed in the cluster!
>
> However, it looks like the init containers are crashing:
>
> kubectl get pods -n codesearch
> NAME                         READY   STATUS                  RESTARTS   AGE
> codesearch-5b975d449-lgm9b   0/1     Init:CrashLoopBackOff   8          19m
> codesearch-5b975d449-zzqkl   0/1     Init:CrashLoopBackOff   8          19m
>
> I'm out of the office right now, so I can't do a full debug. But it does seem like something needs fixing :( (I also don't have access to view pod logs, so not sure how to get that)

You can use the GCP Logging console for the logs: https://console.cloud.google.com/logs/query;query=resource.type%3D%22k8s_container%22%0Aresource.labels.namespace_name%3D%22codesearch%22;cursorTimestamp=2022-03-11T06:20:53.646489047Z?project=kubernetes-public

I did some quick research based on the logs, which suggest the issue may be related to the architecture of the Docker image.

 skopeo inspect docker://jdagostino2/codesearch-fetch:0.1.7 | jq .Architecture
"arm64"

The image seems to have been built on an arm64 processor, but the GKE nodes are amd64. We should try switching to an image from gcr.io/k8s-staging-infra-tools and see what happens.
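A minimal sketch of how this kind of mismatch could be caught before deploying, assuming `skopeo`, `jq`, and `kubectl` access to the cluster are available. The helper name `arch_supported` is hypothetical and not part of any existing tooling; it only compares the image architecture against the architectures reported by the nodes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of k8s.io tooling): succeeds when the image
# architecture (first argument) matches at least one node architecture
# (remaining arguments), and fails otherwise.
arch_supported() {
  local image_arch="$1"
  shift
  local node_arch
  for node_arch in "$@"; do
    if [ "$image_arch" = "$node_arch" ]; then
      return 0
    fi
  done
  return 1
}

# Example against a real cluster (left commented out because it needs
# registry and cluster access):
#
#   image_arch="$(skopeo inspect docker://jdagostino2/codesearch-fetch:0.1.7 | jq -r .Architecture)"
#   node_archs="$(kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}')"
#   # node_archs is intentionally unquoted so each architecture becomes one argument
#   arch_supported "$image_arch" $node_archs \
#     || echo "image architecture ${image_arch} is not available on any node"
```

With the values reported above (image `arm64`, nodes `amd64`), the helper would fail, which is consistent with the init containers crashing.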

pmgk07 commented 2 years ago

@ameukam I unknowingly added fixes #xyz in my PR which led the k8s-ci-robot to close this issue. Feel free to reopen this issue if there's anything pending.

ameukam commented 2 years ago

/reopen

k8s-ci-robot commented 2 years ago

@ameukam: Reopened this issue.

In response to [this](https://github.com/kubernetes/k8s.io/issues/2182#issuecomment-1070826831):

> /reopen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

ameukam commented 2 years ago

https://cs-canary.k8s.io is now up and running. We should spread the word about its existence and think about when we will flip cs.k8s.io to the aaa GKE cluster.

dims commented 2 years ago

sounds like a great plan @ameukam ! we can flip ASAP. i will leave the other one running for a week or two just in case we have a problem

jimdaga commented 2 years ago

The one last change we need before we flip is a job that restarts the deployment nightly to pick up any changes. The way I set up the deployment there should be no downtime while the new pods are coming up.
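A rough sketch of what such a nightly job could run, assuming the Deployment is named `codesearch` in the `codesearch` namespace (inferred from the pod names earlier in this thread, not confirmed). The function name `restart_codesearch` and the `KUBECTL` override are hypothetical and exist only so the commands can be dry-run; the actual job may look different.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: triggers a rolling restart and waits for it to finish.
# KUBECTL can be overridden (for example KUBECTL=echo) to print the commands
# instead of running them.
restart_codesearch() {
  local kubectl="${KUBECTL:-kubectl}"
  local namespace="${1:-codesearch}"   # assumed namespace
  local deployment="${2:-codesearch}"  # assumed Deployment name
  # "rollout restart" replaces pods gradually according to the Deployment's
  # rolling update strategy instead of deleting them all at once.
  "$kubectl" rollout restart "deployment/${deployment}" -n "$namespace" &&
    "$kubectl" rollout status "deployment/${deployment}" -n "$namespace" --timeout=10m
}
```

Running `KUBECTL=echo restart_codesearch` prints the two commands without touching the cluster. Whether the restart is truly zero-downtime depends on the Deployment's replica count and update strategy.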

jimdaga commented 2 years ago

I hopefully have the final two PRs needed to consider this "go-live" ready.

Looking for an LGTM for these two:

ameukam commented 2 years ago

/milestone v1.25

k8s-triage-robot commented 2 years ago

/lifecycle stale

nikhita commented 2 years ago

/remove-lifecycle stale

@jimdaga do you have cycles to address the review on https://github.com/kubernetes/k8s.io/pull/3679?

ameukam commented 2 years ago

/milestone v.126

k8s-ci-robot commented 2 years ago

@ameukam: The provided milestone is not valid for this repository. Milestones in this repository: [v1.24, v1.25, v1.26]

Use /milestone clear to clear the milestone.

In response to [this](https://github.com/kubernetes/k8s.io/issues/2182#issuecomment-1221128296):

> /milestone v.126

ameukam commented 2 years ago

/milestone v1.26

upodroid commented 1 year ago

Can we explore deprecating this in favour of GitHub Code Search?

https://cs.github.com/ https://github.com/features/code-search

ameukam commented 1 year ago

Not really. One issue is that GitHub Code Search requires authentication, while cs.k8s.io allows anonymous queries (e.g. https://go.k8s.io/owners/dims).

k8s-triage-robot commented 1 year ago

/lifecycle stale

BenTheElder commented 1 year ago

/lifecycle frozen

Closing this is not helpful unless we've also shut down the existing infra.

ameukam commented 7 months ago

/assign @SohamChakraborty

k8s-ci-robot commented 7 months ago

@ameukam: GitHub didn't allow me to assign the following users: SohamChakraborty.

Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to [this](https://github.com/kubernetes/k8s.io/issues/2182#issuecomment-2063191523):

> /assign @SohamChakraborty

SohamChakraborty commented 7 months ago

I think this is now ready for migration from the bare-metal server to the aaa cluster. I spoke with Arnaud and he will decide on a path for the migration.

kubernetes / k8s.io

Where to host cs.k8s.io #2182