fengshao0907 / weed-fs

Automatically exported from code.google.com/p/weed-fs

File Time to Live (TTL) (enhancement) #56

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Files added should be allowed to expire by supplying a Time To Live field.

The TTL should be supplied when requesting a file ID.

Example of a time to live of 999 seconds:

{ "_ttl": "999" }

There are many use cases for this, such as caching and garbage collection.

Original issue reported on code.google.com by kevinste...@gmail.com on 26 Nov 2013 at 11:58

GoogleCodeExporter commented 9 years ago
Thanks for creating this! This is exactly what I had in mind!

Original comment by chris...@gmail.com on 27 Nov 2013 at 12:52

GoogleCodeExporter commented 9 years ago
I hope there will be a parameter to set the timer, for example:
{"_ttl":"9s"}  9 seconds
{"_ttl":"9h"}  9 hours
{"_ttl":"9d"}  9 days
{"_ttl":"9m"}  9 months
...
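
A minimal Go sketch of how suffixed values like these might be parsed. `parseTTL` and `unitDurations` are hypothetical names, not part of weed-fs. Reading `m` as a 30-day month follows the list above and is an assumption, as is treating a bare number such as `999` as seconds.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"time"
)

// unitDurations maps the one-letter suffixes from the list above to durations.
// Assumption: "m" means a 30-day month here; a real implementation might
// reserve "m" for minutes instead.
var unitDurations = map[byte]time.Duration{
	's': time.Second,
	'h': time.Hour,
	'd': 24 * time.Hour,
	'm': 30 * 24 * time.Hour,
}

// parseTTL is a hypothetical helper that turns strings such as "9s" or "9d"
// into a time.Duration. A bare number such as "999" is read as seconds.
func parseTTL(s string) (time.Duration, error) {
	if s == "" {
		return 0, fmt.Errorf("empty TTL")
	}
	unit := time.Second
	digits := s
	if u, ok := unitDurations[s[len(s)-1]]; ok {
		unit = u
		digits = s[:len(s)-1]
	}
	n, err := strconv.ParseUint(digits, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid TTL %q: %v", s, err)
	}
	// Reject counts that would overflow time.Duration (an int64 of nanoseconds).
	if n > uint64(math.MaxInt64)/uint64(unit) {
		return 0, fmt.Errorf("TTL %q is too large", s)
	}
	return time.Duration(n) * unit, nil
}

func main() {
	for _, s := range []string{"999", "9s", "9h", "9d", "9m", "abc"} {
		d, err := parseTTL(s)
		fmt.Println(s, d, err)
	}
}
```

Returning an error for malformed input, rather than silently falling back to "no TTL", keeps a typo from turning a short-lived file into a permanent one.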

Original comment by stvb...@gmail.com on 24 Jul 2014 at 8:32

GoogleCodeExporter commented 9 years ago
Initial design for file TTL:
One key difference from a "normal" implementation is that the TTL is more 
associated with the volume than with each specific file.

TTL volumes are usually configured with a much smaller size, e.g. 100MB for a 
1 minute TTL (depending on your traffic). Once the latest expiration time has 
been reached, all the files in the volume have expired, and the volume can be 
safely deleted. No need for complicated and error-prone garbage collection.
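
The whole-volume expiry idea can be sketched in Go roughly as follows. `ttlVolume` and its methods are hypothetical names for illustration, not weed-fs types, and the sketch assumes an empty volume is never considered expired.

```go
package main

import (
	"fmt"
	"time"
)

// ttlVolume is a hypothetical in-memory stand-in for a TTL volume.
type ttlVolume struct {
	ttl          time.Duration
	latestExpiry time.Time // largest expiration time of any file written so far
}

// write records a file written at time now and returns its expiration time.
func (v *ttlVolume) write(now time.Time) time.Time {
	expiry := now.Add(v.ttl)
	if expiry.After(v.latestExpiry) {
		v.latestExpiry = expiry
	}
	return expiry
}

// expired reports whether every file in the volume has expired, which is the
// point at which the whole volume could be deleted.
// Assumption: a volume with no writes yet is not treated as expired.
func (v *ttlVolume) expired(now time.Time) bool {
	return !v.latestExpiry.IsZero() && !now.Before(v.latestExpiry)
}

func main() {
	v := &ttlVolume{ttl: time.Minute}
	start := time.Now()
	v.write(start)
	v.write(start.Add(30 * time.Second))
	fmt.Println(v.expired(start.Add(time.Minute)))     // the second file is still alive
	fmt.Println(v.expired(start.Add(2 * time.Minute))) // every file has expired
}
```

Because only one timestamp per volume is tracked, deciding whether a volume can be removed is a single comparison, with no per-file bookkeeping.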

1) When assigning a file key, the master would pick one TTL volume with the 
closest (equal or a little larger) TTL. If no such volumes exist, it creates a 
few.
2) Volume servers will write the file with its expiration time. When serving a 
file, if the file has expired, it will be reported as not found.
3) Volume servers will track each volume's largest expiration time, and report 
volumes that have reached it.
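
Steps 1 and 2 could look roughly like this in Go. `closestTTL`, `storedFile`, and `errNotFound` are hypothetical names, not weed-fs APIs, and stamping the file with its requested TTL (rather than the volume's TTL) is an assumption the design above does not spell out.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errNotFound = errors.New("file not found")

// closestTTL sketches step 1: it returns the smallest available volume TTL
// that is equal to or larger than the requested one. ok is false when no
// volume qualifies, which is where the design would create new volumes.
func closestTTL(want time.Duration, available ...time.Duration) (best time.Duration, ok bool) {
	for _, ttl := range available {
		if ttl >= want && (!ok || ttl < best) {
			best, ok = ttl, true
		}
	}
	return best, ok
}

// storedFile is a hypothetical record of a file and its expiration time.
type storedFile struct {
	data      []byte
	expiresAt time.Time
}

// read sketches step 2: an expired file is reported as not found.
func (f storedFile) read(now time.Time) ([]byte, error) {
	if !now.Before(f.expiresAt) {
		return nil, errNotFound
	}
	return f.data, nil
}

func main() {
	want := 90 * time.Second
	ttl, ok := closestTTL(want, time.Minute, time.Hour, 24*time.Hour)
	fmt.Println(ttl, ok)

	now := time.Now()
	// Assumption: the file expires after its own requested TTL.
	f := storedFile{data: []byte("hello"), expiresAt: now.Add(want)}
	_, err := f.read(now.Add(2 * time.Minute))
	fmt.Println(err)
}
```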

Original comment by chris...@gmail.com on 25 Aug 2014 at 1:55

GoogleCodeExporter commented 9 years ago
How will it handle updating the TTL of a file before the TTL expires?

Original comment by claudiu....@gmail.com on 25 Aug 2014 at 12:24

GoogleCodeExporter commented 9 years ago
This design would not allow TTL updates. The file would have to be sent again 
with the new TTL.

Original comment by chris...@gmail.com on 25 Aug 2014 at 7:15

GoogleCodeExporter commented 9 years ago
#3: nice and simple design.
I have implemented deletion, hole punching, and hole reuse for Camlistore's 
diskpacked storage, and it is not easy.

So this garbage-collect-the-whole-volume approach should work easily. The only 
deliberate logic needed is in the part that decides how big a new volume should 
be, what duration it should have, and where it should be created. But I trust 
Chris to find a simple, straightforward solution 😀

Original comment by tgulacs...@gmail.com on 26 Aug 2014 at 4:04

GoogleCodeExporter commented 9 years ago
Nice, I was just planning to open a ticket on this, as this is a strongly 
desired feature for us, and I am willing to sponsor it if needed.

Solution 3 looks good, but you may want to consider:

* group TTLs in volumes. Disk space wastage would be enormous if wildly 
disparate TTLs were in the same volume. Ideally that granularity is 
configurable, but can be made automatic as well: if the TTL specified is 
multiple days ahead, take a day granularity.

* not returning files once the TTL is over, even if the volume is still in the 
store. This will fake the TTL better, although it will not help you in a 
tightly secured environment, as you would still be required to wipe the data 
ASAP after TTL expiry. To me this is a nice-to-have, and is probably fairly 
easy to do. 
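
The granularity idea in the first bullet could be sketched in Go as below. `bucketTTL` is a hypothetical helper, and the thresholds (day granularity from two days up, hour granularity from two hours up, minute granularity otherwise) are assumptions chosen only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

const day = 24 * time.Hour

// bucketTTL rounds a requested TTL up to a coarser granularity so that
// similar TTLs end up sharing the same group of volumes.
// Assumed thresholds: >= 2 days uses whole days, >= 2 hours uses whole hours,
// anything shorter uses whole minutes.
func bucketTTL(ttl time.Duration) time.Duration {
	if ttl <= 0 {
		return 0
	}
	granularity := time.Minute
	switch {
	case ttl >= 2*day:
		granularity = day
	case ttl >= 2*time.Hour:
		granularity = time.Hour
	}
	// Round up so that a file is never expired earlier than requested.
	return (ttl + granularity - 1) / granularity * granularity
}

func main() {
	for _, ttl := range []time.Duration{61 * time.Second, 190 * time.Minute, 50 * time.Hour} {
		fmt.Println(ttl, "->", bucketTTL(ttl))
	}
}
```

Rounding up rather than down trades a little extra retention for the guarantee that no file disappears before its requested TTL.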

So although 3 is nice, 1 looks better to me.

Original comment by hansboot...@gmail.com on 5 Sep 2014 at 2:45

GoogleCodeExporter commented 9 years ago
Implemented in v0.64

See https://code.google.com/p/weed-fs/wiki/FileWithTimeToLive

Original comment by chris...@gmail.com on 21 Sep 2014 at 6:21

GoogleCodeExporter commented 9 years ago
One question about your implementation: the documentation says "When assigning 
file key, the master would pick one TTL volume with matching TTL". Can you 
elaborate a bit? 
Example: I ask for a file ID with TTL=1w, and 1 minute later for another file 
ID with TTL=1w.
Would this result in systematic creation of new containers or is putting them 
in the same container still possible if spec permits? I sincerely hope the 
latter is true, otherwise you must also support absolute TTL (epoch time would 
be a trivial fit), in order to avoid the number of containers getting out of 
hand.

Original comment by hansboot...@gmail.com on 21 Sep 2014 at 2:00

GoogleCodeExporter commented 9 years ago
Each container (volume) has a specific TTL, not an expiry time.

In your case, there would only be one group of volumes with TTL=1w.
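
A small Go sketch of why two requests with the same TTL do not multiply volumes. `master` and `assign` are hypothetical names, not the weed-fs implementation, and the sketch simplifies by creating a single volume per TTL and ignoring volume size limits.

```go
package main

import (
	"fmt"
	"time"
)

// master is a hypothetical stand-in that groups volume IDs by volume TTL.
type master struct {
	nextID int
	groups map[time.Duration][]int
}

func newMaster() *master {
	return &master{nextID: 1, groups: make(map[time.Duration][]int)}
}

// assign returns a volume ID for the given TTL. A volume is created only when
// the group for that TTL is still empty; the time of the request plays no role.
func (m *master) assign(ttl time.Duration) int {
	ids := m.groups[ttl]
	if len(ids) == 0 {
		id := m.nextID
		m.nextID++
		m.groups[ttl] = append(ids, id)
		return id
	}
	return ids[0]
}

func main() {
	m := newMaster()
	week := 7 * 24 * time.Hour
	first := m.assign(week)
	second := m.assign(week) // requested one minute later in the example above
	fmt.Println(first == second, len(m.groups))
}
```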

Original comment by chris...@gmail.com on 21 Sep 2014 at 4:34