dogsheep / github-to-sqlite

Save data from GitHub to a SQLite database
https://github-to-sqlite.dogsheep.net/
Apache License 2.0
405 stars 43 forks source link

Repos have a big blob of JSON in the organization column #27

Closed simonw closed 4 years ago

simonw commented 4 years ago

e.g. https://github-to-sqlite.dogsheep.net/github/repos

github__repos__11_rows_where_sorted_by_updated_at_descending

This appears to be obsolete because the owner column already links to that record, albeit in the users table with type set to Organization: https://github-to-sqlite.dogsheep.net/github/users/53015001

simonw commented 4 years ago

Is it possible for a repo to have an owner that differs from its organization?

simonw commented 4 years ago

I'm going to make organization another foreign key to the users table just in case it IS possible (maybe with GitHub Enterprise or similar?)

simonw commented 4 years ago

This didn't quite work: the column type is incorrect, so the foreign key relationship isn't sticking:

https://github-to-sqlite.dogsheep.net/github/repos?organization=53015001

[organization] TEXT REFERENCES [users]([id]) - should be INTEGER.

The problem is that if the first repo inserted has no organization it's set to null, which sqlite-utils derives as a TEXT column.

One solution would be to create the column explicitly with a type, but this could get messy.

I think I want a new sqlite-utils feature for this instead.

simonw commented 4 years ago

https://github.com/simonw/sqlite-utils/issues/100 is done and released in sqlite-utils 2.7.

simonw commented 4 years ago
$ github-to-sqlite repos b.db dogsheep
$ sqlite3 b.db '.schema repos'
CREATE TABLE [repos] (
   [id] INTEGER PRIMARY KEY,
...
   [permissions] TEXT,
   [organization] INTEGER REFERENCES [users]([id]),
   FOREIGN KEY(license) REFERENCES licenses(key)
);