sourcegraph / sourcegraph-public-snapshot

Packages: decide on ownership structure #39635

Closed olafurpg closed 8 months ago

olafurpg commented 2 years ago

From the handbook on making decisions (https://handbook.sourcegraph.com/company-info-and-process/communication/decisions/):

Find the right owner. Every decision should be made by the individual who is most directly responsible for the execution and results of the decision.

Currently, the Code Intel team (specifically, Language Tools) is filling the role of package owner, with the goal of shipping packages under the umbrella of cross-repo code navigation. Packages are a critical component of cross-repo navigation, so we're comfortable committing to the work necessary to make packages work reliably for customer deployments (intuitive admin experience, monitoring, support, etc.).

If there is demand for packages outside the umbrella of cross-repo code navigation, such as dependency search, then we should discuss what a reasonable ownership story would be. In that situation, we have alternative options.

In particular, it's not an option for Language Tools to own packages in order to support use cases outside of cross-repo navigation.

varungandhi-src commented 8 months ago

Copying relevant discussion from Slack:

@eseliger

Here’s what I think Source should own (independently of where things live today):

  • The code host syncers for packages, which should ensure that a package name and its versions are valid and, if so, insert an entry into the repo table
  • The code in gitserver that, given the repo from the above service, knows how to fetch the dependency and its versions and turn them into a git repo

So, given some dataset from SCIP / other data sources in the future, the code host syncer will validate and insert into repo, then gitserver will provide the packages as git repos to Sourcegraph components.

Speaking in interfaces:

type CodeHostSyncer interface {
  SyncExternalService(extsvc int32) error
}

type VCSSyncer interface {
  Clone(repo api.RepoName) error
  Fetch(repo api.RepoName) error
}

type GitserverClient interface {
  // All the known methods we have today will work for packages, as they're "just another git repo"
}

What we will consume:

type PackageRepoSource interface {
  PackagesToConsider(ecosystem string, url string) ([]Package, error)
}
type Package struct {
  Name string
  Versions []string
}

So basically, the CodeHostSyncer needs a data source of which packages+versions it should consider. We can validate they’re ok, and make sure they end up in gitserver for consumption through the GitserverClient interface. Does that match what you’re thinking?
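
To make the proposed flow concrete, here is a minimal Go sketch of the consuming side. It is not the actual Sourcegraph implementation. `Package` and `PackageRepoSource` are copied from the interfaces above; `staticSource`, `repoTable`, `validate`, and `syncPackages` are hypothetical names made up for illustration, and the in-memory map only stands in for the real repo table.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Package and PackageRepoSource mirror the interfaces sketched in the thread.
type Package struct {
	Name     string
	Versions []string
}

type PackageRepoSource interface {
	PackagesToConsider(ecosystem string, url string) ([]Package, error)
}

// staticSource is a hypothetical in-memory source standing in for
// SCIP-derived or lockfile-derived data.
type staticSource struct{ pkgs []Package }

func (s staticSource) PackagesToConsider(ecosystem, url string) ([]Package, error) {
	return s.pkgs, nil
}

// repoTable is a hypothetical stand-in for the repo table,
// mapping a repo name to the package versions to fetch.
type repoTable map[string][]string

// validate is a placeholder check (assumption): a real syncer would
// ask the package host whether the name and versions exist.
func validate(p Package) error {
	if strings.TrimSpace(p.Name) == "" {
		return fmt.Errorf("empty package name")
	}
	if len(p.Versions) == 0 {
		return fmt.Errorf("package %q has no versions", p.Name)
	}
	return nil
}

// syncPackages asks the source which packages to consider, validates
// each one, and inserts the valid ones into the table. It returns the
// sorted names of the packages that failed validation.
func syncPackages(src PackageRepoSource, ecosystem, url string, table repoTable) ([]string, error) {
	pkgs, err := src.PackagesToConsider(ecosystem, url)
	if err != nil {
		return nil, fmt.Errorf("listing packages: %w", err)
	}
	var rejected []string
	for _, p := range pkgs {
		if err := validate(p); err != nil {
			rejected = append(rejected, p.Name)
			continue
		}
		table[ecosystem+"/"+p.Name] = p.Versions
	}
	sort.Strings(rejected)
	return rejected, nil
}
```

Calling `syncPackages(src, "npm", "https://registry.npmjs.org", table)` with a source that returns one valid and one version-less package would leave a single `npm/<name>` entry in `table` and report the other package as rejected. The point of the split is that the syncer never needs to know where the candidate list came from.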

@varungandhi-src

Yep, that makes sense. :+1: Graph team can own extraction of package data from SCIP & other sources like lockfiles
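
As a rough illustration of the producing side, the sketch below implements the `PackagesToConsider` method from the thread on top of file contents. It uses Go's `go.sum` line format (`module version hash`, with a second `version/go.mod` line per module version) purely as a stand-in for a lockfile. `goSumSource` is a hypothetical name, and nothing here reflects how the Graph team actually extracts package data.

```go
package main

import (
	"bufio"
	"sort"
	"strings"
)

// Package mirrors the struct sketched in the thread.
type Package struct {
	Name     string
	Versions []string
}

// goSumSource is a hypothetical source that derives packages from the
// contents of a go.sum file. The ecosystem and url arguments are
// ignored in this sketch.
type goSumSource struct{ contents string }

func (s goSumSource) PackagesToConsider(ecosystem, url string) ([]Package, error) {
	// module path -> set of versions seen for that module
	seen := map[string]map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(s.contents))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 3 {
			continue // skip blank or malformed lines
		}
		module := fields[0]
		// "v1.2.3/go.mod" lines refer to the same version as "v1.2.3".
		version := strings.TrimSuffix(fields[1], "/go.mod")
		if seen[module] == nil {
			seen[module] = map[string]bool{}
		}
		seen[module][version] = true
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}

	// Emit packages and versions in sorted order so output is stable.
	pkgs := make([]Package, 0, len(seen))
	for module, versions := range seen {
		p := Package{Name: module}
		for v := range versions {
			p.Versions = append(p.Versions, v)
		}
		sort.Strings(p.Versions)
		pkgs = append(pkgs, p)
	}
	sort.Slice(pkgs, func(i, j int) bool { return pkgs[i].Name < pkgs[j].Name })
	return pkgs, nil
}
```

Versions are sorted as plain strings here, which is enough for deduplication but is not semantic-version ordering.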