sourcegraph / sourcegraph

Code AI platform with Code Search & Cody
https://sourcegraph.com
Other
10.03k stars 1.21k forks source link

Codeinsights Schema drift - v3.33 -> v5.0.1 #51224

Open DaedalusG opened 1 year ago

DaedalusG commented 1 year ago

Steps to reproduce:

  1. Start an instance of Sourcegraph on an early version (v3.33 docker-compose in this case)
  2. Using multi-version upgrade, upgrade to v5.0.x (v5.0.2 in this case)

Result

On a totally fresh instance once the upgrade is completed you shouldn't see any drift -- in reality you may encounter the following drift

Screenshot 2023-04-27 at 11 12 39 AM
``` ❌ Schema drift detected for codeinsights ℹ️ Locating schema description ℹ️ Reading schema definition in Local file (/schema-descriptions/v5.0.2-internal_database_schema.codeinsights.json)... Schema not found (open /schema-descriptions/v5.0.2-internal_database_schema.codeinsights.json: no such file or directory). Will attempt a fallback source. ✅ Schema found in GitHub (https://raw.githubusercontent.com/sourcegraph/sourcegraph/v5.0.2/internal/database/schema.codeinsights.json). ❌ Drift detected! ❌ Missing constraint "series_points"."series_points_metadata_id_fkey" 💡 Suggested action: define the constraint. ```sql ALTER TABLE series_points ADD CONSTRAINT series_points_metadata_id_fkey FOREIGN KEY (metadata_id) REFERENCES metadata(id) ON DELETE CASCADE DEFERRABLE; ``` ❌ Missing constraint "series_points"."series_points_original_repo_name_id_fkey" 💡 Suggested action: define the constraint. ```sql ALTER TABLE series_points ADD CONSTRAINT series_points_original_repo_name_id_fkey FOREIGN KEY (original_repo_name_id) REFERENCES repo_names(id) ON DELETE CASCADE DEFERRABLE; ``` ❌ Missing constraint "series_points"."series_points_repo_name_id_fkey" 💡 Suggested action: define the constraint. ```sql ALTER TABLE series_points ADD CONSTRAINT series_points_repo_name_id_fkey FOREIGN KEY (repo_name_id) REFERENCES repo_names(id) ON DELETE CASCADE DEFERRABLE; ``` ❌ Missing index "series_points"."series_points_original_repo_name_id_btree" 💡 Suggested action: define the index. ```sql CREATE INDEX series_points_original_repo_name_id_btree ON series_points USING btree (original_repo_name_id); ``` ❌ Missing index "series_points"."series_points_repo_id_btree" 💡 Suggested action: define the index. ```sql CREATE INDEX series_points_repo_id_btree ON series_points USING btree (repo_id); ``` ❌ Missing index "series_points"."series_points_repo_name_id_btree" 💡 Suggested action: define the index. ```sql CREATE INDEX series_points_repo_name_id_btree ON series_points USING btree (repo_name_id); ``` ❌ Missing index "series_points"."series_points_series_id_btree" 💡 Suggested action: define the index. ```sql CREATE INDEX series_points_series_id_btree ON series_points USING btree (series_id); ``` ❌ Missing index "series_points"."series_points_series_id_repo_id_time_idx" 💡 Suggested action: define the index. ```sql CREATE INDEX series_points_series_id_repo_id_time_idx ON series_points USING btree (series_id, repo_id, "time"); ``` ❌ Unexpected index "series_points"."series_points_vanilla_original_repo_name_id_idx" 💡 Suggested action: drop the index. ```sql DROP INDEX series_points_vanilla_original_repo_name_id_idx; ``` ❌ Unexpected index "series_points"."series_points_vanilla_repo_id_idx" 💡 Suggested action: drop the index. ```sql DROP INDEX series_points_vanilla_repo_id_idx; ``` ❌ Unexpected index "series_points"."series_points_vanilla_repo_name_id_idx" 💡 Suggested action: drop the index. ```sql DROP INDEX series_points_vanilla_repo_name_id_idx; ``` ❌ Unexpected index "series_points"."series_points_vanilla_series_id_idx" 💡 Suggested action: drop the index. ```sql DROP INDEX series_points_vanilla_series_id_idx; ``` ❌ Unexpected index "series_points"."series_points_vanilla_series_id_repo_id_time_idx" 💡 Suggested action: drop the index. ```sql DROP INDEX series_points_vanilla_series_id_repo_id_time_idx; ``` ❌ Unexpected index "series_points"."series_points_vanilla_time_idx" 💡 Suggested action: drop the index. ```sql DROP INDEX series_points_vanilla_time_idx; ``` ❌ Unexpected comment of table "series_points" 💡 Suggested action: change the table comment. ```sql COMMENT ON TABLE series_points IS 'Records events over time associated with a repository (or none, i.e. globally) where a single numerical value is going arbitrarily up and down. Repository association is based on both repository ID and name. The ID can be used to refer toa specific repository, or lookup the current name of a repository after it has been e.g. renamed. The name can be used to refer to the name of the repository at the time of the events creation, for example to trace the change in a gauge back to a repository being renamed.'; ``` ❌ Unexpected index "series_points_snapshots"."series_points_snapshots_time_idx" 💡 Suggested action: drop the index. ```sql DROP INDEX series_points_snapshots_time_idx; ``` 💡 Before continuing with this operation, run the migrator's drift command and follow instructions to repair the schema to the expected current state. See https://docs.sourcegraph.com/admin/how-to/manual_database_migrations#drift for additional instructions. ```

If you would like immediate help on this, please email support@sourcegraph.com (you can still create the issue, but there are no SLAs on issues like there are for support requests).

DaedalusG commented 1 year ago

Without the suggested sql here is the drift observed:

❌ Schema drift detected for codeinsights
❌ Drift detected!
❌ Missing constraint "series_points"."series_points_metadata_id_fkey"
❌ Missing constraint "series_points"."series_points_original_repo_name_id_fkey"
❌ Missing constraint "series_points"."series_points_repo_name_id_fkey"
❌ Missing index "series_points"."series_points_original_repo_name_id_btree"
❌ Missing index "series_points"."series_points_repo_id_btree"
❌ Missing index "series_points"."series_points_repo_name_id_btree"
❌ Missing index "series_points"."series_points_series_id_btree"
❌ Missing index "series_points"."series_points_series_id_repo_id_time_idx"
❌ Unexpected index "series_points"."series_points_vanilla_original_repo_name_id_idx"
❌ Unexpected index "series_points"."series_points_vanilla_repo_id_idx"
❌ Unexpected index "series_points"."series_points_vanilla_repo_name_id_idx"
❌ Unexpected index "series_points"."series_points_vanilla_series_id_idx"
❌ Unexpected index "series_points"."series_points_vanilla_series_id_repo_id_time_idx"
❌ Unexpected index "series_points"."series_points_vanilla_time_idx"
❌ Unexpected comment of table "series_points"
❌ Unexpected index "series_points_snapshots"."series_points_snapshots_time_idx"

The schema changes missed appear to be related to the deprecation of the timescale-db postgres instance and move to a regular postgres instance (See https://github.com/sourcegraph/sourcegraph/pull/30781/files). This suggests that there may be some squashing issue when a multversion upgrade is made across related version ranges.

This shouldn't be an issue for users and the suggested sql toy fix this can be run in the database to resolve this issue and clear up the drift.

DaedalusG commented 1 year ago

Noting for debugging that the linked migrations in the PR above appear to be in the migration_logs table on the instance in which I've reproduced this drift

postgres=# select * from migration_logs;
 id | migration_logs_schema_version |             schema             |   version   | up |          started_at           |          finished_at          | success | error_message | backfilled
----+-------------------------------+--------------------------------+-------------+----+-------------------------------+-------------------------------+---------+---------------+------------
  1 |                             2 | codeinsights_schema_migrations | -1000000000 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
  2 |                             2 | codeinsights_schema_migrations |  1000000000 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
  3 |                             2 | codeinsights_schema_migrations |  1000000001 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
  4 |                             2 | codeinsights_schema_migrations |  1000000002 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
  5 |                             2 | codeinsights_schema_migrations |  1000000003 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
  6 |                             2 | codeinsights_schema_migrations |  1000000004 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
  7 |                             2 | codeinsights_schema_migrations |  1000000005 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
  8 |                             2 | codeinsights_schema_migrations |  1000000006 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
  9 |                             2 | codeinsights_schema_migrations |  1000000007 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 10 |                             2 | codeinsights_schema_migrations |  1000000008 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 11 |                             2 | codeinsights_schema_migrations |  1000000009 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 12 |                             2 | codeinsights_schema_migrations |  1000000010 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 13 |                             2 | codeinsights_schema_migrations |  1000000011 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 14 |                             2 | codeinsights_schema_migrations |  1000000012 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 15 |                             2 | codeinsights_schema_migrations |  1000000013 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 16 |                             2 | codeinsights_schema_migrations |  1000000014 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 17 |                             2 | codeinsights_schema_migrations |  1000000015 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 18 |                             2 | codeinsights_schema_migrations |  1000000016 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
 19 |                             2 | codeinsights_schema_migrations |  1000000017 | t  | 2023-04-25 01:02:41.865389+00 | 2023-04-25 01:02:41.871779+00 | t       |               | f
 20 |                             2 | codeinsights_schema_migrations |  1000000018 | t  | 2023-04-25 01:02:41.872396+00 | 2023-04-25 01:02:41.874962+00 | t       |               | f
 21 |                             2 | codeinsights_schema_migrations |  1000000019 | t  | 2023-04-25 01:02:41.875425+00 | 2023-04-25 01:02:41.877368+00 | t       |               | f
 22 |                             2 | codeinsights_schema_migrations |  1000000020 | t  | 2023-04-25 01:02:41.87784+00  | 2023-04-25 01:02:41.884906+00 | t       |               | f
 23 |                             2 | codeinsights_schema_migrations |  1000000021 | t  | 2023-04-25 01:02:41.885414+00 | 2023-04-25 01:02:41.888549+00 | t       |               | f
 24 |                             2 | codeinsights_schema_migrations |  1000000022 | t  | 2023-04-25 01:02:41.888999+00 | 2023-04-25 01:02:41.928926+00 | t       |               | f
 25 |                             2 | codeinsights_schema_migrations |  1000000023 | t  | 2023-04-25 01:02:41.929605+00 | 2023-04-25 01:02:41.931622+00 | t       |               | f
 26 |                             2 | codeinsights_schema_migrations |  1000000024 | t  | 2023-04-25 01:02:41.932061+00 | 2023-04-25 01:02:41.935592+00 | t       |               | f
 27 |                             2 | codeinsights_schema_migrations |  1000000025 | t  | 2023-04-25 01:02:41.936153+00 | 2023-04-25 01:02:41.93843+00  | t       |               | f
 28 |                             2 | codeinsights_schema_migrations |  1000000026 | t  | 2023-04-25 01:02:41.938901+00 | 2023-04-25 01:02:41.940873+00 | t       |               | f
 29 |                             2 | codeinsights_schema_migrations |  1000000027 | t  | 2023-04-25 01:02:41.941345+00 | 2023-04-25 01:02:41.987038+00 | t       |               | f
 30 |                             2 | codeinsights_schema_migrations |  1646761143 | t  | 2023-04-25 01:02:41.987751+00 | 2023-04-25 01:02:41.989821+00 | t       |               | f
 31 |                             2 | codeinsights_schema_migrations |  1647894746 | t  | 2023-04-25 01:02:41.990299+00 | 2023-04-25 01:02:41.99209+00  | t       |               | f
 32 |                             2 | codeinsights_schema_migrations |  1649801281 | t  | 2023-04-25 01:02:41.992518+00 | 2023-04-25 01:02:41.994545+00 | t       |               | f
 33 |                             2 | codeinsights_schema_migrations |  1651021000 | t  | 2023-04-25 01:02:41.994978+00 | 2023-04-25 01:02:41.997234+00 | t       |               | f
 34 |                             2 | codeinsights_schema_migrations |  1652289966 | t  | 2023-04-25 01:02:41.997669+00 | 2023-04-25 01:02:41.999264+00 | t       |               | f
 35 |                             2 | codeinsights_schema_migrations |  1656517037 | t  | 2023-04-25 01:02:41.999735+00 | 2023-04-25 01:02:42.001835+00 | t       |               | f
 36 |                             2 | codeinsights_schema_migrations |  1656608833 | t  | 2023-04-25 01:02:42.002347+00 | 2023-04-25 01:02:42.004296+00 | t       |               | f
 37 |                             2 | codeinsights_schema_migrations |  1659572248 | t  | 2023-04-25 01:02:42.004756+00 | 2023-04-25 01:02:42.006553+00 | t       |               | f
 38 |                             2 | codeinsights_schema_migrations |  1663626068 | t  | 2023-04-25 01:02:44.062459+00 | 2023-04-25 01:02:44.064584+00 | t       |               | f
 39 |                             2 | codeinsights_schema_migrations |  1664984848 | t  | 2023-04-25 01:02:44.06497+00  | 2023-04-25 01:02:44.071863+00 | t       |               | f
 40 |                             2 | codeinsights_schema_migrations |  1665003565 | t  | 2023-04-25 01:02:44.072484+00 | 2023-04-25 01:02:44.143285+00 | t       |               | f
 41 |                             2 | codeinsights_schema_migrations |  1665616961 | t  | 2023-04-25 01:02:44.143963+00 | 2023-04-25 01:02:44.153138+00 | t       |               | f
 42 |                             2 | codeinsights_schema_migrations |  1665053848 | t  | 2023-04-25 01:02:44.153645+00 | 2023-04-25 01:02:44.156735+00 | t       |               | f
 43 |                             2 | codeinsights_schema_migrations |  1666632478 | t  | 2023-04-25 01:02:44.157141+00 | 2023-04-25 01:02:44.159002+00 | t       |               | f
 44 |                             2 | codeinsights_schema_migrations |  1666729025 | t  | 2023-04-25 01:02:44.159342+00 | 2023-04-25 01:02:44.165224+00 | t       |               | f
 45 |                             2 | codeinsights_schema_migrations |  1667309737 | t  | 2023-04-25 01:02:44.165651+00 | 2023-04-25 01:02:44.16846+00  | t       |               | f
 46 |                             2 | codeinsights_schema_migrations |  1670253074 | t  | 2023-04-25 01:02:46.351936+00 | 2023-04-25 01:02:46.358269+00 | t       |               | f
 47 |                             2 | codeinsights_schema_migrations |  1672740238 | t  | 2023-04-25 01:02:46.358884+00 | 2023-04-25 01:02:46.364576+00 | t       |               | f
 48 |                             2 | codeinsights_schema_migrations |  1672917501 | t  | 2023-04-25 01:02:46.365045+00 | 2023-04-25 01:02:46.372462+00 | t       |               | f
 49 |                             2 | codeinsights_schema_migrations |  1672921606 | t  | 2023-04-25 01:02:46.373545+00 | 2023-04-25 01:02:46.376849+00 | t       |               | f
 50 |                             2 | codeinsights_schema_migrations |  1674474174 | t  | 2023-04-25 01:02:46.377447+00 | 2023-04-25 01:02:46.381209+00 | t       |               | f
 51 |                             2 | codeinsights_schema_migrations |  1675113463 | t  | 2023-04-25 01:02:46.381774+00 | 2023-04-25 01:02:46.444462+00 | t       |               | f
 52 |                             2 | codeinsights_schema_migrations |  1675347548 | t  | 2023-04-25 01:02:46.445052+00 | 2023-04-25 01:02:46.447165+00 | t       |               | f
(52 rows)

postgres=# select * from migration_logs where version=1000000027;
 id | migration_logs_schema_version |             schema             |  version   | up |          started_at           |          finished_at          | success | error_message | backfilled
----+-------------------------------+--------------------------------+------------+----+-------------------------------+-------------------------------+---------+---------------+------------
 29 |                             2 | codeinsights_schema_migrations | 1000000027 | t  | 2023-04-25 01:02:41.941345+00 | 2023-04-25 01:02:41.987038+00 | t       |               | f
(1 row)

postgres=# select * from migration_logs where version=1000000010;
 id | migration_logs_schema_version |             schema             |  version   | up |          started_at           |          finished_at          | success | error_message | backfilled
----+-------------------------------+--------------------------------+------------+----+-------------------------------+-------------------------------+---------+---------------+------------
 12 |                             2 | codeinsights_schema_migrations | 1000000010 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
(1 row)

postgres=# select * from migration_logs where version=1000000004;
 id | migration_logs_schema_version |             schema             |  version   | up |          started_at           |          finished_at          | success | error_message | backfilled
----+-------------------------------+--------------------------------+------------+----+-------------------------------+-------------------------------+---------+---------------+------------
  6 |                             2 | codeinsights_schema_migrations | 1000000004 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
(1 row)

postgres=# select * from migration_logs where version=1000000001;
 id | migration_logs_schema_version |             schema             |  version   | up |          started_at           |          finished_at          | success | error_message | backfilled
----+-------------------------------+--------------------------------+------------+----+-------------------------------+-------------------------------+---------+---------------+------------
  3 |                             2 | codeinsights_schema_migrations | 1000000001 | t  | 2023-04-25 00:55:41.679537+00 | 2023-04-25 00:55:41.679537+00 | t       |               | t
(1 row)
DaedalusG commented 1 year ago

As a convenience for readers the following sql is runnable in a psql shell to the codeinsights-db to amend the drift:

ALTER TABLE series_points ADD CONSTRAINT series_points_metadata_id_fkey FOREIGN KEY (metadata_id) REFERENCES metadata(id) ON DELETE CASCADE DEFERRABLE;
ALTER TABLE series_points ADD CONSTRAINT series_points_original_repo_name_id_fkey FOREIGN KEY (original_repo_name_id) REFERENCES repo_names(id) ON DELETE CASCADE DEFERRABLE;
ALTER TABLE series_points ADD CONSTRAINT series_points_repo_name_id_fkey FOREIGN KEY (repo_name_id) REFERENCES repo_names(id) ON DELETE CASCADE DEFERRABLE;
CREATE INDEX series_points_original_repo_name_id_btree ON series_points USING btree (original_repo_name_id);
CREATE INDEX series_points_repo_id_btree ON series_points USING btree (repo_id);
CREATE INDEX series_points_repo_name_id_btree ON series_points USING btree (repo_name_id);
CREATE INDEX series_points_series_id_btree ON series_points USING btree (series_id);
CREATE INDEX series_points_series_id_repo_id_time_idx ON series_points USING btree (series_id, repo_id, "time");
DROP INDEX series_points_vanilla_original_repo_name_id_idx;
DROP INDEX series_points_vanilla_repo_id_idx;
DROP INDEX series_points_vanilla_repo_name_id_idx;
DROP INDEX series_points_vanilla_series_id_idx;
DROP INDEX series_points_vanilla_series_id_repo_id_time_idx;
DROP INDEX series_points_vanilla_time_idx;
COMMENT ON TABLE series_points IS 'Records events over time associated with a repository (or none, i.e. globally) where a single numerical value is going arbitrarily up and down. Repository association is based on both repository ID and name. The ID can be used to refer toa specific repository, or lookup the current name of a repository after it has been e.g. renamed. The name can be used to refer to the name of the repository at the time of the events creation, for example to trace the change in a gauge back to a repository being renamed.';
DROP INDEX series_points_snapshots_time_idx;
efritz commented 1 year ago

@DaedalusG When was timescale-db deprecated? Let's call that version v3.X. My assumptions would be:

Can we validate if any of these are false? If they're all true then it might not be a squashing issue but the deprecation of the timescale-db objects itself. Could help narrow down what's actually affected here.