sourcegraph / sourcegraph

Code AI platform with Code Search & Cody
https://sourcegraph.com

🎯 Cloud Adoption - Dev tooling for teammates #60458

Open jhchabran opened 5 months ago

jhchabran commented 5 months ago
# Cloud ephemeral or Deploy my Branch

With one command, people should be able to deploy their branch of the monorepo and have an environment available that they can test against.

## Problem

To test Sourcegraph in a production-like environment today, you don't have many options:

* Test on S2 (sourcegraph.sourcegraph.com)
* Ask the Cloud team for an environment

There are various reasons why one would want to test in a production-like setting:

* Testing older releases
* Testing customer issues
* Rapid feature development
* Getting a better understanding of how the application performs in a constrained production setting

Testing on S2 is also not ideal, since it is our internal Sourcegraph instance and is used by various employees. Performing tests on the instance therefore requires coordinating with other engineers, since they might also be testing. Furthermore, to be able to test on S2, your changes must have already landed on main.

## Success criteria

- N ephemeral deployments w/ explicit confirmation that it was useful
- 95% deployment success rate over 1 week
- 5 requests per week for dev-infra/cloud team to support a cloud deployment

## Proposal

Enable users of the `sg` CLI tool to deploy their current branch into a cloud environment by issuing the following command: `sg cloud deploy --branch | --tag `.

Besides the above `deploy` command, we also intend to add the following commands and flags to support the deploy functionality:

* `sg cloud deploy --status --branch ` - get the status of the deployment where `` will be/is deployed
* `sg cloud deploy --list -u ` - get a list of deployments that are tied to a GCP username
* `sg cloud deploy --destroy --branch ` - destroy the environment where `` is deployed
* `sg cloud lease --extend --branch | --tag ` - extend the lease of the deployed environment, where the duration can only be given in hours
* `sg cloud lease --reduce --branch | --tag ` - reduce the lease of the deployed environment, where the duration can only be given in hours
* `sg cloud lease --status --branch | --tag ` - return how much time is left on the deployment before it is removed

### Diagram

![Image](https://github.com/sourcegraph/devx-support/assets/1001709/f3febb0d-778a-4547-a232-109a268254d9)

For the full, up-to-date version, see: https://excalidraw.com/#json=xsjTO-Sm-OW83h6gxCESk,V-DkGqs8zv_aD478qgEEgg

### Milestones

* Lease mechanism for cloud environments - estimated 2 days
  * API is available to extend/reduce the lease of a deployed environment
  * API to query lease time remaining
  * Cloud controller is able to tear down an env whose lease has expired.
  * RISK: No API available
* Able to deploy a tag with `sg cloud deploy --tag ` and access the deployed environment. - estimated 1 day
  * This is the earliest milestone we can have, since the tag images already exist and therefore nothing special needs to be done in the special pipeline
  * This will also validate our approach in our integration with the Cloud API endpoint
  * Introduce this feature to the release team
  * RISK: Troubleshooting deployments
    * Migration failures
    * Service startups
* Push images as part of the build, triggered via an env var set on the build. - estimated 1 day
* Able to deploy a branch with `sg cloud deploy --branch ` and access the deployed environment. - estimated 3 days.
  * The key component here is that we need to instrument the build to push images and let the Cloud API environment know about the version those images will be tagged as.
  * Able to list deployed environments by branch / username or all
  * Introduce this feature to an early-adopter team (see which team will benefit the most)
  * RISK: Troubleshooting deployments
    * Migration failures
    * Service startups
  * RISK: Not all images might be pushed during a build
  * RISK: Permission issues in either the deployed environment or cloud
* Create a dashboard to track deployments - estimated 2 days
* RISK: The infrastructure cost of running and sustaining different cloud deployments for branches over a month could be high.

Tracked issues

@unassigned

Completed

@burmudar

Completed

@jhchabran

sourcegraph-bot commented 5 months ago

Status Update

Date: 2024-02-15

Overall Status

🟢 On Track

Notes

N/A

Blockers/Risks/Concerns

N/A

More Information

Created by jean-hadrien.chabran@sourcegraph.com

sourcegraph-bot commented 5 months ago

Status Update

Date: 2024-02-07

Overall Status

🟢 On Track

Notes

PFP

Blockers/Risks/Concerns

N/A

More Information

Created by eric.shamow@sourcegraph.com

sourcegraph-bot commented 4 months ago

Status Update

Date: 2024-02-19

Overall Status

🟢 On Track

Current: 0

Notes

  1. Started work on CloudAPI for create instance - https://github.com/sourcegraph/cloud-api/pull/6
  2. Agreed on virtual artifact registry for sourcegraph-ci to be used for pulling private images.

Blockers/Risks/Concerns

none

More Information

Created by filip.haftek@sourcegraph.com

sourcegraph-bot commented 4 months ago

Status Update

Date: 2024-02-28

Overall Status

🟢 On Track

Notes

N/A

Blockers/Risks/Concerns

N/A

More Information

Created by jean-hadrien.chabran@sourcegraph.com

sourcegraph-bot commented 4 months ago

Status Update

Date: 2024-03-04

Overall Status

🟢 On Track

Current: 0

Notes

  1. Agreed with the DevX Team (@william) on the contract between sg and CloudAPI: initially kick off builds for a branch only on demand, and store images in sourcegraph-ci
  2. Created issue - https://github.com/sourcegraph/customer/issues/2865
  3. Started working on CloudAPI

Blockers/Risks/Concerns

n/a

More Information

Created by filip.haftek@sourcegraph.com

sourcegraph-bot commented 4 months ago

Status Update

Date: 2024-03-12

Overall Status

🟢 On Track

Current: Meeting with Filip to set up a bi-weekly check-in so that our plans align

Notes

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 4 months ago

Status Update

Date: 2024-03-14

Overall Status

🟢 On Track

Current: 0

Notes

  1. RFC finalising: https://drive.google.com/a/sourcegraph.com/open?id=1b8DgjpMn0J1u9JqyNnRunmDJ5kQjvkqlVKahsTzeTVI
  2. Bi-weekly meeting set up to coordinate work
  3. CloudAPI initial design under review - https://sourcegraph.slack.com/archives/C06JENN2QBF/p1710341302493319
  4. Initial implementation is on-going, PRs:

Blockers/Risks/Concerns

N/A

More Information

Created by filip.haftek@sourcegraph.com

sourcegraph-bot commented 3 months ago

Status Update

Date: 2024-03-22

Overall Status

🟢 On Track

Current: cloud-ephemeral runtype added which pushes branches to private registry for cloud

Notes

Started implementation of `sg cloud deploy`

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 3 months ago

Status Update

Date: 2024-03-25

Overall Status

🟢 On Track

Current: 0

Notes

Agreed with @Michael on implementation details, including: always put the CR into the control plane with the first create request, to use it as a locking mechanism. Merged PRs:

Blockers/Risks/Concerns

n/a

More Information

Created by filip.haftek@sourcegraph.com

sourcegraph-bot commented 3 months ago

Status Update

Date: 2024-04-02

Overall Status

🟢 On Track

Notes

N/A

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 3 months ago

Status Update

Date: 2024-04-04

Overall Status

🟢 On Track

Current: 0

Notes

An initial version that creates an ephemeral instance via CloudAPI, with a client in mi2, works - https://sourcegraph.slack.com/archives/C06JENN2QBF/p1711444342824119. It is waiting for review. The CloudAPI design is also waiting for review. Next steps:

Blockers/Risks/Concerns

n/a

More Information

Created by filip.haftek@sourcegraph.com

sourcegraph-bot commented 3 months ago

Status Update

Date: 2024-04-08

Overall Status

🟢 On Track

Notes

N/A

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 3 months ago

Status Update

Date: 2024-04-08

Overall Status

🟢 On Track

Current: Started with the sg cloud deploy command impl

Notes

Got most of what is required to make a connection to cloud. Next steps are:

  1. Connect to cloud API with correct credentials
  2. Start a deployment

I have a sync with Filip tomorrow and will then find out more about what is required for the token. So far it seems that I would need to impersonate a Service Account.

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 3 months ago

Status Update

Date: 2024-04-15

Overall Status

🟢 On Track

Current: 0

Notes

The initial version is exposed as an msp service: https://sourcegraph.slack.com/archives/C06JENN2QBF/p171295552278669. It contains the commands deploy, extend-lease, and delete. Waiting for sg integration from the DevX Team and for feedback. This week:

Blockers/Risks/Concerns

n/a

More Information

Created by filip.haftek@sourcegraph.com

sourcegraph-bot commented 2 months ago

Status Update

Date: 2024-04-19

Overall Status

🟢 On Track

Current: Commands implemented to deploy and list environments - still needs to be merged

Notes

Listing of instances still needs work on the backend since you can't filter and thus it lists all instances
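
Until the backend supports filtering, one possible stopgap is to filter the full listing on the client. The following is a minimal sketch under that assumption; the `Instance` type, its fields, and `filterByOwner` are hypothetical names for illustration and do not reflect the actual `sg` or CloudAPI types.

```go
package main

import "fmt"

// Instance is a hypothetical, simplified view of a deployed environment.
type Instance struct {
	Name   string
	Owner  string
	Branch string
}

// filterByOwner returns only the instances belonging to owner,
// preserving the order of the unfiltered listing.
func filterByOwner(all []Instance, owner string) []Instance {
	var out []Instance
	for _, inst := range all {
		if inst.Owner == owner {
			out = append(out, inst)
		}
	}
	return out
}

func main() {
	// Stand-in for the unfiltered listing returned by the backend.
	all := []Instance{
		{Name: "env-1", Owner: "alice", Branch: "feature-a"},
		{Name: "env-2", Owner: "bob", Branch: "feature-b"},
		{Name: "env-3", Owner: "alice", Branch: "bugfix-c"},
	}
	for _, inst := range filterByOwner(all, "alice") {
		fmt.Println(inst.Name, inst.Branch)
	}
}
```

Client-side filtering still transfers every instance over the wire, so it only hides the problem from the user; server-side filtering remains the proper fix.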

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 2 months ago

Status Update

Date: 2024-04-27

Overall Status

🟢 On Track

Current: 0

Notes

Huge progress since last update. Sg cloud deploy works e2e thanks to @William Bezuidenhout! CloudAPI interface and backend code in review. Cloud upgrade/extend-lease/delete in progress. More in wg: https://sourcegraph.slack.com/archives/C06JENN2QBF/p1714126143153829

Blockers/Risks/Concerns

n/a

More Information

Created by filip.haftek@sourcegraph.com

sourcegraph-bot commented 2 months ago

Status Update

Date: 2024-04-29

Overall Status

🟢 On Track

Current: End-to-end deployment working. Adding auxiliary commands like delete, lease, etc.

Notes

N/A

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 2 months ago

Status Update

Date: 2024-05-08

Overall Status

🟢 On Track

Current: Implementing last command to trigger an upgrade of an instance

Notes

Need to also ensure as part of the release that images are pushed to the release registry

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 2 months ago

Status Update

Date: 2024-05-15

Overall Status

🟢 On Track

Current: 0.75

Notes

  1. Flow works e2e (create/upgrade/extend-lease/delete via expiring lease): https://sourcegraph.slack.com/archives/C06JENN2QBF/p1715160044264479
  2. Delete expired instances works - https://github.com/sourcegraph/managed-services/pull/1351

TODO:

Blockers/Risks/Concerns

n/a

More Information

Created by filip.haftek@sourcegraph.com

sourcegraph-bot commented 1 month ago

Status Update

Date: 2024-05-20

Overall Status

STATUS IS UNKNOWN Strong Progress

Current: All commands implemented. Need to add documentation on how to use it

Notes

N/A

Blockers/Risks/Concerns

N/A

More Information

Created by william.bezuidenhout@sourcegraph.com

sourcegraph-bot commented 1 month ago

Status Update

Date: 2024-06-03

Overall Status

🟢 On Track

Current: 0.9

Notes

Working on the last fixes: status during creation, and using a global tag instead of a SHA for semver images. Final PR planned this week.

Blockers/Risks/Concerns

N/A

More Information

Created by filip.haftek@sourcegraph.com

jhchabran commented 1 week ago

@burmudar can we kill this?