Closed GoogleCodeExporter closed 9 years ago
Agreed. This is a desirable feature.
There are some things that complicate the implementation, though. By default,
each new snapshot uses a previous snapshot as a template and only specifies
which files have changed. This makes it almost free to make small
changes to a session. But then you cannot remove an earlier snapshot, since
later snapshots depend on it. And rewriting the snapshot definition files will
violate the "write-once" paradigm that restricts the chances of bugs causing
repository corruption. Alternatively, one could set the repository so that
every snapshot is independent. This causes a big overhead if you only modify a
few files in every commit, but removing snapshots is then trivial. One could of
course write a new snapshot definition, concurrent with the old one, and keep
both, assuming that it is only the bulk file data that we want to purge. Just
some thoughts. In any case, this needs to be implemented.
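
To illustrate the dependency problem described above, here is a minimal sketch of delta-based snapshots. The data layout and the `resolve` function are hypothetical and are not taken from Boar's actual code; they only show why an earlier snapshot cannot simply be deleted when later ones are defined relative to it.

```python
# Hypothetical model, not Boar's real format: each snapshot stores only
# its own changes plus the id of the snapshot it is based on.

def resolve(snapshots, snapshot_id):
    """Return the full {filename: blob_id} view of a snapshot."""
    chain = []
    current = snapshot_id
    while current is not None:
        snap = snapshots[current]  # raises KeyError if a base snapshot is gone
        chain.append(snap)
        current = snap["base"]
    files = {}
    for snap in reversed(chain):  # apply the oldest snapshot first
        files.update(snap["changed"])
        for name in snap["removed"]:
            files.pop(name, None)
    return files


snapshots = {
    1: {"base": None, "changed": {"a.txt": "blob1", "b.txt": "blob2"}, "removed": []},
    2: {"base": 1, "changed": {"b.txt": "blob3"}, "removed": []},
    3: {"base": 2, "changed": {"c.txt": "blob4"}, "removed": ["a.txt"]},
}

latest = resolve(snapshots, 3)
print(latest)  # {'b.txt': 'blob3', 'c.txt': 'blob4'}
```

In this model, deleting snapshot 1 makes `resolve(snapshots, 3)` fail, because snapshot 3 can only be reconstructed by walking back through its bases.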
Original comment by ekb...@gmail.com
on 15 Mar 2011 at 11:00
I'm inclined toward the last solution, that is, if we can somehow superimpose a
patch definition that works like a mask over a certain snapshot, so that all
following snapshots only apply to the area defined by the mask.
Another solution I thought of was to allow independent snapshots and dependent
snapshots together, so that we retain the storage optimization advantage of
linked snapshots, but also achieve the flexibility of independent snapshots.
Original comment by uts...@gmail.com
on 16 Mar 2011 at 2:24
I have started implementing this feature. The suggested solution is that only
the latest revision of a session is preserved. Essentially, it will be possible
to "truncate" a session so that it only contains the data present in the most
recent snapshot.
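
As a rough sketch of what truncating means for the stored data, the example below keeps only the newest snapshot and reports which blobs are no longer referenced. The `truncate` function and the list-of-dicts layout are assumptions made for illustration, not Boar's implementation, and the sketch considers a single session in isolation (it does not check whether other sessions still use a blob).

```python
# Hypothetical illustration only. A session is modelled as a list of
# complete {filename: blob_id} views, oldest first.

def truncate(session):
    """Return (truncated_session, orphaned_blobs) for one session."""
    if not session:
        return [], set()
    last = session[-1]
    kept_blobs = set(last.values())
    all_blobs = set()
    for view in session:
        all_blobs.update(view.values())
    # Only the most recent view survives; everything else becomes orphaned.
    return [dict(last)], all_blobs - kept_blobs


session = [
    {"a.txt": "blob1", "b.txt": "blob2"},
    {"a.txt": "blob1", "b.txt": "blob3"},
    {"b.txt": "blob3", "c.txt": "blob4"},
]
truncated, orphaned = truncate(session)
```

Here `truncated` holds a single snapshot equal to the last view, and `orphaned` contains `blob1` and `blob2`, the bulk data that becomes eligible for purging.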
Original comment by ekb...@gmail.com
on 24 Feb 2012 at 1:26
Feature added as of changeset 2616bf61610d.
There is now a "truncate" command that removes old snapshots from a session.
As a safety device, an empty file named "ENABLE_PERMANENT_ERASE" must be
present in the top directory in the repository to enable the truncate command.
This file must be created manually. Without this file, "truncate" will not
function (nor will any of the lower level functions supporting the operation).
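
The safety check can be pictured roughly as follows. This is a sketch under assumptions: only the file name "ENABLE_PERMANENT_ERASE" and its location in the top directory come from the description above, while the helper names and the error type are hypothetical and do not reflect Boar's internal API.

```python
import os

ERASE_FLAG = "ENABLE_PERMANENT_ERASE"  # file name as given in the comment above


def erase_enabled(repo_path):
    """True if the marker file exists in the repository's top directory."""
    return os.path.isfile(os.path.join(repo_path, ERASE_FLAG))


def enable_erase(repo_path):
    """Create the empty marker file (the manual step described above)."""
    with open(os.path.join(repo_path, ERASE_FLAG), "a"):
        pass


def require_erase_enabled(repo_path):
    """Hypothetical guard that a destructive operation could call first."""
    if not erase_enabled(repo_path):
        raise RuntimeError(
            "Refusing to erase: create an empty file named %s in %s first"
            % (ERASE_FLAG, repo_path)
        )
```

Putting the check in the lowest-level function, rather than only in the command-line front end, matches the note above that the supporting functions refuse to run as well.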
It is not yet possible to remove only selected snapshots; it's all or nothing.
Afterwards, the session will contain only whatever was in the last snapshot.
The "truncate" operation will replicate to clones if using the "clone" command.
The "ENABLE_PERMANENT_ERASE" file must be created manually in each clone before
cloning.
All the deleted session data, and all the deleted blobs, are moved to the tmp/
directory in the repository. They are placed in directories with the prefix
"TRASH_" and a random suffix. You need to delete these directories manually if
you want to free up space and delete the data permanently.
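
The manual cleanup could be scripted along these lines. This is a sketch, not a Boar tool: it assumes only what is stated above (directories named "TRASH_" plus a suffix, located in tmp/ under the repository), and the function names are made up for the example. It defaults to a dry run because the deletion is permanent.

```python
import os
import shutil


def find_trash_dirs(repo_path):
    """List directories under <repo>/tmp whose names start with 'TRASH_'."""
    tmp_dir = os.path.join(repo_path, "tmp")
    if not os.path.isdir(tmp_dir):
        return []
    return sorted(
        os.path.join(tmp_dir, name)
        for name in os.listdir(tmp_dir)
        if name.startswith("TRASH_")
        and os.path.isdir(os.path.join(tmp_dir, name))
    )


def purge_trash(repo_path, dry_run=True):
    """Return the trash directories; delete them only if dry_run is False."""
    dirs = find_trash_dirs(repo_path)
    if not dry_run:
        for path in dirs:
            shutil.rmtree(path)  # permanent: the data cannot be recovered
    return dirs
```

Calling `purge_trash(repo_path)` first and inspecting the returned list before repeating the call with `dry_run=False` gives one last chance to recover data from the trash directories.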
This feature is reasonably well tested. Like other Boar operations, it can be
safely aborted and resumed. Still, if you choose to activate and use this
feature in your Boar repository, you are making a compromise with data safety.
Even if Boar were blessed with divine perfection, "truncate" still makes it
possible to lose data if misused. Be careful.
Original comment by ekb...@gmail.com
on 22 Apr 2012 at 8:10
Original issue reported on code.google.com by
uts...@gmail.com
on 13 Mar 2011 at 9:22