sourcegraph / sourcegraph-public-snapshot

Code AI platform with Code Search & Cody
https://sourcegraph.com
Other
10.1k stars 1.28k forks source link

Extension API should expose information about the code host, raw repo name, revision and file path without URI parsing #4278

Closed lguychard closed 9 months ago

lguychard commented 5 years ago

See https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/blob/web/src/repo/blob/Blob.tsx#L324:30

As an example, when visiting https://sourcegraph.sgdev.org/sourcegraph/sourcegraph/-/blob/cmd/gitserver/server/cleanup.go, the resource URI passed by the extension API includes the repository name generated according to repositoryPathPattern, not the repository URI:

"git://sourcegraph/sourcegraph?33b4d1e0d26e9a9c03ea6c5a534f3e11e7dfc3ee#cmd/gitserver/server/cleanup.go"

By contrast, the browser extension generates the following URI:

"git://github.com/sourcegraph/sourcegraph?33b4d1e0d26e9a9c03ea6c5a534f3e11e7dfc3ee#cmd/gitserver/server/cleanup.go"

This is an issue when extensions expect to be able to parse a repo URI from a resource URI, like Codecov does:

image

lguychard commented 5 years ago

Note: I don't think extensions parsing resource URIs is actually a good thing. Instead, extensions should be able to obtain metadata about a given resource (repo name, uri, rev, ...) through the extension API.

felixfbecker commented 5 years ago

But shouldn't we still have some predictable URI format? That currently is supposed to be git://repoName?rev#filePath (however I never really liked that because the file path is not part of the path preventing relative resolution, a RAW API URL would be better long-term).

The browser extension just currently makes assumptions about address URL -> repoPath mapping in code-host specific code that are not generally true. Shouldn't we fix those instead?

lguychard commented 5 years ago

Yes, it would be okay if we did the following:

felixfbecker commented 5 years ago

What do you mean with "repo URI"?

lguychard commented 5 years ago

I meant the Repository.uri GraphQL field, but just saw that that field was deprecated:

image

The need here is to get info about the external service type, which will anyway be better accomplished with externalRepository:

image

More context if this wasn't clear in the original bug: on sourcegraph.sgdev.org, the repoName for sourcegraph/sourcegraph as affected by repositoryPathPattern is sourcegraph/sourcegraph, which gives no information about the code host. The Codecov extension needs to know what the code host for a given repository is, to check if it is supported, and to reference it in API urls. It currently assumes that the code host hostname is present in resource URIs, which is not a correct assumption.

felixfbecker commented 5 years ago

Just to be clear, the old repository.uri field was renamed to repository.name (because it isn't a URI and was therefor confusing). So the correct term for this string is "repo name". The repo name is an arbitrary opaque string that generally exposes no information about the code host. The URIs in the extension APIs include the repo name, however the general structure of the URI would best also not be assumed by extensions.

One problem I see is that in the browser extension, we may not even have a Sourcegraph instance with the repo on it, so we:

slimsag commented 5 years ago

If you have a clone URL (which I assume would be easy to get given you are on the code host) the backend can lookup the Sourcegraph repository name for you assuming an external service configuration for that code host exists: https://sourcegraph.com/api/console#%7B%22query%22%3A%22%7B%5Cn%20%20repository(cloneURL%3A%20%5C%22https%3A%2F%2Fgithub.com%2Fgorilla%2Fmux%5C%22)%20%7B%5Cn%20%20%20%20id%5Cn%20%20%20%20name%5Cn%20%20%7D%5Cn%7D%5Cn%22%7D

felixfbecker commented 5 years ago

@lguychard is this issue still relevant? If yes, can you rename the issue title so it makes sense?

felixfbecker commented 4 years ago

Should also expose the revision (not just the commit ID as it is in the URI at the moment), so extensions can build pretty links (relevant for code insights)

felixfbecker commented 3 years ago

The way I'm thinking about this is similar to VS Code, which has an scm namespace. A repository is somewhat tied to the workspace, but really any folder can be a git repository. Any file inside Sourcegraph can basically part of a repository, but doesn't have to be (it's not unlikely that we'll have to support arbitrary, out-of-repository files for extensions and code intel, e.g. campaign patches). The most logical API would probably be one that returns a RepositoryInfo object given a file URI for the closest Git repository it is contained in. That RepositoryInfo object could then contain another object with information about the "upstream" of the repository too, e.g. the service type (github, gitlab etc). One question to answer is whether repositories and revisions are tied together (RepositoryInfo just has a revision property), like in a local Git checkout, or whether we want to decouple them (we return a RevisionInfo that has a property that is a RepositoryInfo).

Quick example idea of an API:

namespace workspace {
  // or should this be in its own namespace?
  function getVersionControlInfo(resource: URL): RevisionInfo | undefined

  interface RevisionInfo {
    commitID: string
    revision: string
    repository: RepositoryInfo
  }

  interface RepositoryInfo {
    root: URL
    name: string
    mirrorInfo?: RepositoryMirrorInfo // similar to the GraphQL type
  }
}
felixfbecker commented 3 years ago

@sourcegraph/web looking for thoughts on the API here (proposal above), wdyt?

tjkandala commented 3 years ago

Do we have any reasons to decouple RevisionInfo and RepositoryInfo? Having to go through revision.repository.mirrorInfo to get info like the service type feels like unnecessary nesting

felixfbecker commented 3 years ago

I was thinking about the case where we view a diff, e.g. on GitHub. There we have two text documents for the same file at different revisions (head and base), but they point to the same repository. If we separate the types, we can represent the different revisions, but keep a.repository === b.repository.

Something to also think about: How would we represent the case of extensions running on a repo that is not available on Sourcegraph? E.g. with the browser extension. We could have information (revision and "mirror info") from the code host, but not the Sourcegraph repository name. So maybe "mirror info" is actually a bad name and the nesting of MirrorInfo and RepositoryInfo should be inverted (with a "sourcegraph info" that can be undefined containing the resolved repoName). @marekweb would love to hear your ideas too!

github-actions[bot] commented 3 years ago

Heads up @joelkw @felixfbecker - the "team/extensibility" label was applied to this issue.