src-d / borges

borges collects and stores Git repositories.
https://docs.sourced.tech/borges/
GNU General Public License v3.0

Tool to fix problems with siva files #367

Closed jfontan closed 5 years ago

jfontan commented 6 years ago

Problems

Possible solutions

List of proposed commands and which ones are needed now

Lost gluster brick

Accessing the gluster filesystem to check for the presence of a file can be very expensive. Instead of going through gluster, we can take the list of files directly from the bricks (native FS) and load it in memory to find missing siva files.
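A minimal sketch of that idea, assuming each brick is a plain directory tree that can be walked locally and that the expected siva file names come from some other source such as the database. The function names `list_siva_files` and `missing_siva_files` are hypothetical and not part of borges.

```python
import os


def list_siva_files(brick_root):
    """Walk one brick on its native filesystem and collect siva file names."""
    names = set()
    for _dirpath, _dirnames, filenames in os.walk(brick_root):
        for name in filenames:
            if name.endswith(".siva"):
                names.add(name)
    return names


def missing_siva_files(expected, brick_roots):
    """Return the expected siva names that are not present on any brick."""
    on_disk = set()
    for root in brick_roots:
        # Union of all bricks, kept in memory so each lookup is cheap.
        on_disk |= list_siva_files(root)
    return set(expected) - on_disk
```

With the whole list held in a set, checking each expected file is a memory lookup rather than a gluster call.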

Notes

siva files not in buckets

Two lists can be built the same way as for the lost gluster brick problem: one with the siva files that are correctly bucketed and another with the ones that are incorrectly placed. The siva files that appear in both lists must be deleted and their repositories requeued. The siva files that appear only in the incorrect path must be moved to their proper bucket directory.
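The classification above amounts to two set operations. The sketch below assumes both lists hold bare siva file names; `plan_bucket_fixes` is a hypothetical helper and only computes the plan, it does not delete, requeue, or move anything.

```python
def plan_bucket_fixes(bucketed, misplaced):
    """Split misplaced siva files into duplicates and files to move.

    bucketed:  names found in their proper bucket directory
    misplaced: names found in an incorrect path
    """
    bucketed = set(bucketed)
    misplaced = set(misplaced)
    # Present in both places: delete and requeue the repositories.
    duplicated = sorted(bucketed & misplaced)
    # Present only in the incorrect path: move to the proper bucket dir.
    to_move = sorted(misplaced - bucketed)
    return duplicated, to_move
```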

Notes

General notes

jfontan commented 6 years ago

DISTINCT should not be used to get the init commits, as it is way slower than getting all of them:

testing=# select count(init) from repository_references;
   count   
-----------
 233888927
(1 row)

Time: 53338.143 ms
testing=# select count(distinct(init)) from repository_references;
  count   
----------
 11849984
(1 row)

Time: 1209028.016 ms
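
In the output above the DISTINCT count took more than 20 times longer than the plain count. One possible alternative, sketched here as an assumption rather than a measured result, is to stream every `init` value and deduplicate on the client. `unique_inits` is a hypothetical helper; `rows` stands for any iterable of values from `select init from repository_references`, and memory use for roughly 11.8 million distinct values would need to be checked.

```python
def unique_inits(rows):
    """Deduplicate init commit hashes streamed from the database."""
    seen = set()
    for init in rows:
        # count(init) ignores NULLs, so they are skipped here as well.
        if init is not None:
            seen.add(init)
    return seen
```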
jfontan commented 6 years ago

To make the tool more flexible and easier to extend for unforeseen cases, there can be commands that create lists and others that perform actions on them, in the style of Unix pipes. Most of the list and filter commands can be done with standard Unix commands, so they may not need to be implemented.

Some examples of commands could be:

tool siva database [<db connection string>]
tool siva disk <directory>
tool siva filter by-bucket <root dir> <list>
tool siva filter repeated <list1> <list2>
tool siva filter only-first <list1> <list2>
tool siva rebucket [--path <rooted-repos>] 0 2 <list>
tool siva delete [--path <rooted-repos>] [--bucket 2] <list>
tool siva repos <list>
tool repos siva <list>
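
A rough sketch of how some of these commands could behave, under stated assumptions: `filter repeated` is read as "entries present in both lists", `filter only-first` as "entries present only in the first list", and a bucket as a subdirectory named after the first N characters of the siva file name, with 0 meaning no bucketing. All function names are hypothetical.

```python
import os


def filter_repeated(list1, list2):
    """Entries of list1 that also appear in list2, in list1 order."""
    second = set(list2)
    return [name for name in list1 if name in second]


def filter_only_first(list1, list2):
    """Entries of list1 that do not appear in list2, in list1 order."""
    second = set(list2)
    return [name for name in list1 if name not in second]


def bucket_path(root, name, bucket_size):
    """Path of a siva file under root for a given bucket size (assumed layout)."""
    if bucket_size <= 0:
        return os.path.join(root, name)
    return os.path.join(root, name[:bucket_size], name)


def rebucket_plan(root, names, old_size, new_size):
    """List (source, destination) pairs; nothing is moved on disk."""
    return [
        (bucket_path(root, n, old_size), bucket_path(root, n, new_size))
        for n in names
    ]
```

On sorted lists the two filters correspond to `comm -12` and `comm -23`, which is one reason they may not need a dedicated implementation.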
jfontan commented 6 years ago

Issues related to implementing the tool: