Closed jar398 closed 7 years ago
(The files are currently on a 2TB spinning disk, which, like all such disks, could fail at any time. MTBF is about 5 years and it's two or three years old.)
Assigning to @mtholder hoping he can take care of this; if not, then de-assign yourself and we'll figure something else out.
I'm backing it up now. I can work on setting up the rsync. I can just skip the synthesis output
directories if those are just unpacked versions of the tar'ed form. Is that correct?
It looks like I should have started this by cloning https://github.com/OpenTreeOfLife/files.opentreeoflife.org
Oops. I'll do that after the copy is complete.
Correct. I believe that all the unpacked output and taxonomy directories are redundant with the tarballs; if they're not, then I have mismanaged the unpacked directories and any differences in the unpacked directories should be ignored.
I'm not sure all changes to the small files in files.opentreeoflife.org have been written back to the git repo. Nothing really depends on these being kept in sync. In fact most of the small files should probably be deleted, in favor of apache-generated directory listings.
The total is about 10G (as you've probably found out).
Thanks for taking care of this.
Maybe in a few months there will be a more permanent solution for storing the files that are currently on files.opentreeoflife.org, such as on S3, but until then, backups of this content would be very wise. I am backing it up to my laptop from time to time, but not systematically.
I'm talking about the entire contents of the ~opentree/files.opentreeoflife.org tree on files.opentreeoflife.org (which is a CNAME for varela.csail.mit.edu).
Perhaps @mtholder is already doing this; I don't know. In any case, we need to find a volunteer, and they should set up a script to rsync the content once a day or so to a server of their choice.
It seems to be about 11G of stuff at present. It includes all archived versions of OTT and the synthetic tree, as well as archived versions of the inputs to OTT.
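The daily rsync could be as simple as a cron entry on the backup server. A sketch, assuming the files are readable over ssh as the opentree user and that /backups and the log path are whatever the volunteer chooses (all names here are assumptions, not the actual setup):

```shell
# Crontab entry on the backup host: pull the tree once a day at 03:17.
# -a  archive mode (permissions, times, symlinks)
# -z  compress in transit
# --delete  mirror removals so the backup matches the source
17 3 * * * rsync -az --delete opentree@files.opentreeoflife.org:files.opentreeoflife.org/ /backups/files.opentreeoflife.org/ >> /var/log/ot-backup.log 2>&1
```

Whether to pass --delete is a judgment call: without it, files removed upstream (accidentally or not) survive in the backup, which may be what you want for an archive.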