Databases - Githubissues

nemequ / squash-corpus

Designing a new corpus for lossless general-purpose compression


Spoke to some people on IRC in the PostgreSQL channel. PostgreSQL has several options for backups, but if we need to choose one, apparently the best choice would be pg_dump -F p (the plain-text SQL script format).

1 GiB is apparently pretty much the smallest database considered to be of respectable size, and that is obviously way too large to include in this corpus. There is a Sample Databases page on the PostgreSQL wiki, but they're all pretty large. Perhaps load one into PostgreSQL and delete a random set of rows (like 90% of them), then use that…
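A rough sketch of what that shrink-then-dump step might look like, in Python. The database name, table names, and output file are hypothetical placeholders, and it assumes psql and pg_dump are on the PATH and can connect without prompting. Deleting rows where random() < 0.9 removes roughly (not exactly) 90% of each table, and foreign-key constraints in a real sample database could make some deletes fail or cascade, so treat this as a starting point rather than a tested recipe.

```python
import re
import subprocess


def delete_sql(table: str, drop_fraction: float = 0.9) -> str:
    """Build a DELETE that drops roughly `drop_fraction` of a table's rows."""
    # Only accept simple identifiers, since the name is pasted into SQL text.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not 0.0 <= drop_fraction <= 1.0:
        raise ValueError("drop_fraction must be between 0 and 1")
    # PostgreSQL's random() is evaluated per row and returns a value in [0, 1).
    return f"DELETE FROM {table} WHERE random() < {drop_fraction};"


def pg_dump_cmd(dbname: str, outfile: str) -> list[str]:
    """Build the pg_dump command line for a plain-format (-F p) dump."""
    return ["pg_dump", "-F", "p", "-f", outfile, dbname]


def shrink_and_dump(dbname: str, tables: list[str], outfile: str,
                    drop_fraction: float = 0.9) -> None:
    """Thin out each table, then write a plain-text dump of the result."""
    for table in tables:
        # check=True raises CalledProcessError if psql exits non-zero.
        subprocess.run(
            ["psql", "-d", dbname, "-c", delete_sql(table, drop_fraction)],
            check=True,
        )
    subprocess.run(pg_dump_cmd(dbname, outfile), check=True)


# Hypothetical usage (names are placeholders, not from any real sample database):
# shrink_and_dump("sample_db", ["orders", "customers"], "sample_db.sql")
```

Because the rows are picked at random, the dump will differ from run to run, so a corpus would need to ship the resulting file itself rather than the procedure.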


Databases #8