sourcegraph / sourcegraph-public-snapshot

Code AI platform with Code Search & Cody
https://sourcegraph.com

Store `indexed` alongside `cloned` in `repo` #14764

Closed flying-robot closed 8 months ago

flying-robot commented 4 years ago

As I understand it, Zoekt is the source of truth for indexing status, and gitserver is the source of truth for clone status. We're replicating the clone status into the `repo` table as shown below, but not the indexed status:

localhost sourcegraph@sourcegraph=# \d repo
                                             Table "public.repo"
┌───────────────────────┬──────────────────────────┬───────────┬──────────┬──────────────────────────────────┐
│        Column         │           Type           │ Collation │ Nullable │             Default              │
├───────────────────────┼──────────────────────────┼───────────┼──────────┼──────────────────────────────────┤
│ id                    │ integer                  │           │ not null │ nextval('repo_id_seq'::regclass) │
│ name                  │ citext                   │           │ not null │                                  │
│ description           │ text                     │           │          │                                  │
│ language              │ text                     │           │          │                                  │
│ fork                  │ boolean                  │           │          │                                  │
│ created_at            │ timestamp with time zone │           │ not null │ now()                            │
│ updated_at            │ timestamp with time zone │           │          │                                  │
│ external_id           │ text                     │           │          │                                  │
│ external_service_type │ text                     │           │          │                                  │
│ external_service_id   │ text                     │           │          │                                  │
│ archived              │ boolean                  │           │ not null │ false                            │
│ uri                   │ citext                   │           │          │                                  │
│ deleted_at            │ timestamp with time zone │           │          │                                  │
│ metadata              │ jsonb                    │           │ not null │ '{}'::jsonb                      │
│ private               │ boolean                  │           │ not null │ false                            │
│ cloned                │ boolean                  │           │ not null │ false                            │
└───────────────────────┴──────────────────────────┴───────────┴──────────┴──────────────────────────────────┘

I think having both attributes in the table would let us simplify the repositories resolver and handle everything in one simple SQL statement.
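To make the idea concrete, here is a minimal sketch of what "everything in one SQL statement" could look like. It is an illustration under stated assumptions, not the actual resolver: the `indexed` column is the hypothetical addition this issue proposes, `list_repos` and `demo_conn` are made-up helper names, the repo names are placeholders, and an in-memory SQLite table with a reduced set of columns stands in for the Postgres `repo` table.

```python
import sqlite3


def demo_conn():
    """Build an in-memory stand-in for a reduced `repo` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE repo (
            id         INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            deleted_at TEXT,
            cloned     INTEGER NOT NULL DEFAULT 0,
            indexed    INTEGER NOT NULL DEFAULT 0  -- hypothetical column
        )
        """
    )
    conn.executemany(
        "INSERT INTO repo (name, cloned, indexed) VALUES (?, ?, ?)",
        [
            ("github.com/example/a", 1, 1),  # cloned and indexed
            ("github.com/example/b", 1, 0),  # cloned, not yet indexed
            ("github.com/example/c", 0, 0),  # neither
        ],
    )
    return conn


def list_repos(conn, cloned=None, indexed=None):
    """Return repo names, optionally filtered by cloned/indexed flags.

    Passing None for a flag means "don't filter on it". Both filters
    are applied in a single SELECT, with no calls to other services.
    """
    clauses = ["deleted_at IS NULL"]
    params = []
    if cloned is not None:
        clauses.append("cloned = ?")
        params.append(int(cloned))
    if indexed is not None:
        clauses.append("indexed = ?")
        params.append(int(indexed))
    sql = (
        "SELECT name FROM repo WHERE "
        + " AND ".join(clauses)
        + " ORDER BY name"
    )
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = demo_conn()
    # Repos that are cloned but still waiting to be indexed.
    print(list_repos(conn, cloned=True, indexed=False))
```

With both flags stored side by side, a filter such as "cloned but not indexed" becomes a plain `WHERE` clause. How the `indexed` value would be kept in sync with Zoekt is left open here.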

cc @tsenart

github-actions[bot] commented 4 years ago

Heads up @tsenart - the "team/cloud" label was applied to this issue.

tsenart commented 4 years ago

@flying-robot: Tentatively adding this to the next milestone. We can take it out if it turns out to be too much or doesn't fit.

ryanslade commented 2 years ago

@tsenart I'm sorting through some old issues and saw this.

We now store the clone status in gitserver_repos. Do you think we still need this issue?

tsenart commented 2 years ago

Yeah, this would still help with efficiently looking up which revisions are indexed and which aren't. That's important for the search code, so we don't have to keep a big set of indexed repos in memory and poll Zoekt for it.