sourcegraph / sourcegraph-public-snapshot

Code AI platform with Code Search & Cody
https://sourcegraph.com
Other
10.11k stars 1.29k forks source link

Repo name mappings are not universally supported #462

Closed sqs closed 5 years ago

sqs commented 6 years ago

Current state:

Properties of solution:

Useful initial decisions:

Related:

Customers (not exhaustive):

nicksnyder commented 5 years ago

I don't think this should be prioritized for 3.1.

  1. It feels like less of a priority than https://github.com/sourcegraph/sourcegraph/issues/2025
  2. It isn't clear to me what the solution is
  3. It isn't clear to me that this can be solved by 3.1 (i.e. this week).
tsenart commented 5 years ago
  1. It isn't clear to me that this can be solved by 3.1 (i.e. this week).

Thanks for the estimation. Let's postpone it then.

keegancsmith commented 5 years ago

Do we have any customers relying on this feature? Is there potential for removing this feature and maybe adding it back in another way. For example we have been thinking about implications of #2025, and one of our mentioned solutions is aliases instead of user controlled names in the repo table. IE multiple names mapping to a repo, instead of a unique name.

nicksnyder commented 5 years ago

I believe that we do have customers who use this.

Personally, I like the idea of always having a full "canonical" name, and then allowing sites to configure a (single) alias.

When an alias is configured, both should work, but I would probably expect the full canonical name to redirect to the alias.

dobrou commented 5 years ago

Hi, I'd like to vote for fixing this issue.

Better said, to fix browser extension integration when "repositoryPathPattern" is configured to non-default pattern on sourcegraph.

In bitbucket configuration, we define: "repositoryPathPattern": "{projectKey}/{repositorySlug}" Reason is that we have only one source code server, so presense of it's url in repo id is useless. While we have many repositories (100+) and many searches returns results from 10+ repositories.

So BIG advantages of custom repo id pattern are:

However we really miss the browser extension integration with bitbucket server, caused by "bug" described here.

As I read the 'alias' proposal by @nicksnyder , this would work perfectly in our case.

Thank you

sqs commented 5 years ago

Thank you, @Dobrou. This is the 2nd highest priority issue after we ship #2025 and follow up on any other directly related tasks from that. We will have more details and (likely) a timeline next week after we discuss it.

keegancsmith commented 5 years ago

I have in mind a PoC I'd like to experiment with. It is inspired by the updated description from @sqs. It will both help me understand the scope of a solution and is quick to implement (probably an hour to implement, full day of testing our different clients / use cases).

PoC Proposal

  1. Put the name without repositoryPathPattern into the unused uri column. This will be done via the syncer.
  2. Only update db.Repos.Get() to fallback looking up by uri iff looking up by name fails. In the case lookup via uri works, return a redirect error to name.

I'm unsure what would break, but on the face of it this would work quite well. I believe some of our language servers make assumptions on the name. In that case they would still fail, but possibly we can include uri as part of the repo API.

Potentially uri would have a uniqueness constraint and the syncer handles renames/deletions on it like we do for name. However, I will avoid this in the PoC and just deterministically pick a winner in the case of multiple rows with the same uri. (In theory this shouldn't happen thanks to constraints on the external_* columns).

@sqs, @tsenart: Can you poke holes into this idea in two ways? Does this solution match up with the constraints laid out? What technically would not work?

sqs commented 5 years ago

I believe some of our language servers make assumptions on the name. In that case they would still fail, but possibly we can include uri as part of the repo API.

What kinds of assumptions do you have in mind, and what failure case are you thinking of? Can you give a specific example to help me understand? Like Go import paths being (by default) the repo name?

I don't have a good understanding of the technical details anymore, but the high-level idea of falling back to the "fully qualified" name and returning a redirect in that case makes sense and sounds like it would work.

keegancsmith commented 5 years ago

@sqs

If the site admin changes how repositories are named, it must not leave a bunch of stray repositories around with the old naming scheme.

FYI this is already supported in 3.3 release due to the changes in #2025

Every instance in code where we check for a github.com/ string prefix to a repo name should probably be fixed.

Most instances of this will be solved by the RepoLookup cleanup work we plan to do in 3.4.

Next Steps: Write down a test plan to validate PoC. I'll implement it today and leave it in a state where @tsenart could possibly pick it up when I go for vacation.

keegancsmith commented 5 years ago

This is in master and will go out in the 3.4 release. I've been testing extensions with it and it seems to be working well. There are likely some corner cases were extensions do not work (noticeably the fallback root package path detection may fail in go-langserver, but this can be worked around). However, all those corner cases failed in previous version of sourcegraph as well. So on the whole this should be much improved.