Open RichardTaylor opened 2 years ago
This download would be absolutely gigantic.
The only way I could see this being implemented is by creating a continually updated and published torrent. I don't at all know if that's possible. The other option is something like https://ipfs.io.
Perhaps a text only option, including text extracted from certain attachments, might help make the size of a version of the archive more manageable.
In terms of external storage/archiving see:
https://archive.org/details/datasets
and
https://www.reddit.com/r/datasets https://www.reddit.com/r/bigdata
Out of interest, how big would a text only version be without the attachments?
Some raw statistics from server migration earlier in the year (stats taken on 2021-12-10):
| Install | Database Size | Files | Raw Emails | Xapian | Migrated |
|---|---|---|---|---|---|
| WDTK | 34G | 8.9G | 449G | 26G | |
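
To put those figures in proportion, here is a rough sketch that sums the WDTK row and shows each store's share of the total. It assumes "G" means gigabytes and that the four stores don't overlap; neither assumption is confirmed in this thread, and the variable names are only illustrative.

```python
# Sizes in GB, copied from the WDTK row of the table above.
# Assumption: "G" means gigabytes.
sizes_gb = {
    "Database Size": 34.0,
    "Files": 8.9,
    "Raw Emails": 449.0,
    "Xapian": 26.0,
}

# Assumption: the four stores do not overlap, so a plain sum is meaningful.
total_gb = sum(sizes_gb.values())

# The store that contributes most to the total.
largest = max(sizes_gb, key=sizes_gb.get)

print(f"Total: {total_gb:.1f}G (largest: {largest})")
for name, size in sizes_gb.items():
    print(f"{name}: {size}G ({size / total_gb:.1%})")
```

On these numbers the raw emails make up most of the total, so whether they are included would largely determine the download size.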
If we were doing this we'd want to include https://github.com/mysociety/alaveteli/issues/7534.
Make everything publicly accessible on an Alaveteli website available in one structured download.
This could
This has been discussed many times, but I couldn't find a ticket for it, so I'm starting this one.