haiwen / seafile

High-performance file syncing and sharing, with Markdown WYSIWYG editing, wiki, file labels, and other knowledge management features.
http://seafile.com/

Questions on library history, quota and space #1204

Closed qnxor closed 8 years ago

qnxor commented 9 years ago

These are mostly questions which I couldn't answer myself using the manual.

How does the file/library history work for encrypted libraries? As far as I can see, it seems to take a snapshot of the entire library whenever a file in it is modified, but it's smart enough to mirror (hard links?) all other files that were unchanged, so basically the size of a snapshot is equal to the size of only the file that changed.

Suppose I have not set any limit for the history. Does the quota affect the library history size as well, or just the size of the library? If yes, how exactly (e.g. are the oldest snapshots deleted to fit the quota)? If not, is there any mechanism to confine the library and its history/snapshots within a certain amount of space, instead of number of days? How about max number of snapshots?

Thanks.

qnxor commented 9 years ago

Bump, anyone?

qnxor commented 9 years ago

Hello? If I missed this info in the manual then I'd also appreciate a pointer (and will accept the bollocking)

lins05 commented 9 years ago

so basically the size of a snapshot is equal to the size of only the file that changed.

Yes.

Does the quota affect the library history size as well?

No. The quota only concerns the current size of the library.

If not, is there any mechanism to confine the library and its history/snapshots within a certain amount of space, instead of number of days?

Not yet.

qnxor commented 9 years ago

Right, so effectively Seafile can grow to be very large. How does it handle the disk running out of space? Does it delete old snapshots or just fail to write and complain?

Can it be confined to a maximum number of snapshots per file?

Last question: is there a safe way to manually (externally) delete snapshots on the server without breaking the Seafile file structure? I could implement a cron job to delete old snapshots when free space approaches 0.

shoeper commented 9 years ago

Currently it is only possible to limit the age of a library's history. There is also an option to set a hard limit on the maximum age; once a hard limit has been set, users won't be able to configure a higher age than that.
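For reference, the server-wide default for history retention is configured in seafile.conf (per the Seafile manual; the 30-day value below is only an example):

```ini
[history]
keep_days = 30
```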

qnxor commented 9 years ago

What about my other question: is there a safe way to manually (externally) delete snapshots on the server without breaking the Seafile file structure? I could then implement a cron job to delete old snapshots when free space approaches 0.

shoeper commented 9 years ago

No. But if you don't have the Pro version, you can run seaf-gc every night (stop Seafile, run seaf-gc, start Seafile) to make sure old data is removed.
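A minimal sketch of that nightly sequence, assuming a standard installation with the server scripts under /opt/seafile/seafile-server-latest (the path is an assumption; adjust it to your setup):

```sh
# Stop the web front end and the file server, garbage-collect, restart.
# /opt/seafile/seafile-server-latest is an assumed install location.
cd /opt/seafile/seafile-server-latest
./seahub.sh stop
./seafile.sh stop
./seaf-gc.sh        # frees blocks that fall outside the history limit
./seafile.sh start
./seahub.sh start
```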

qnxor commented 9 years ago

What do you mean by "old data"? Would it delete the entire history?

Also, can you please consider adding the ability to limit the history by size as well, not just by time? Otherwise libraries can grow out of control. I can open a new issue for this.

shoeper commented 9 years ago

Old data means only data that is no longer needed (deleted libraries, blocks no longer referenced by any library, ...).

Because Seafile stores data as deduplicated blocks, the history does not grow out of control in almost all cases. The only issue I see is that it is not possible to remove specific files from the history (this has advantages as well as disadvantages).

qnxor commented 9 years ago

Thanks for the info. Help me understand something, though. If my library is 10 GB, I limit the history to 30 days, and I change all library files every day, would it not grow to 300 GB?

Do you mean that Seafile saves to history only the modified blocks instead of whole modified files? That would indeed reduce the history size a lot.

What about uploads? Does Seafile upload only modified blocks (instead of modified files in full)?

shoeper commented 9 years ago

Seafile only saves blocks, and the desktop client only uploads changed blocks, so in your example each day adds only the blocks that actually changed, not a full 10 GB copy. Your library wouldn't grow that much. I have been running a Seafile server for quite a while now and it does not need that much space. If you want, I can take a closer look at my space consumption tomorrow.

qnxor commented 9 years ago

If the server only saves changed blocks to the history then I'm more relieved. I'll watch the space myself, obviously, but it would still be nice to have the option of a history size quota as well, so that the server can automatically prune the oldest history entries if the history size limit is hit.

shoeper commented 9 years ago

Yeah of course more options regarding the history could be helpful.

Just one small hint: Seafile doesn't delete anything until you run the Seafile GC (garbage collector). Until you do, the size of your data folder won't decrease.

qnxor commented 9 years ago

Wait... then what does the max-days history option do? Do I have to stop Seafile, run the GC, then restart Seafile in order to get rid of the history that is older than the limit set in the options?

shoeper commented 9 years ago

Yes. The only exception is the Pro version, which has a live GC (I do not have it). But you could create a small script that stops the server at night, runs the GC, and starts the Seafile server again (this won't take much time as long as you don't have hundreds or maybe thousands of gigabytes of data).
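A sketch of such a setup: wrap the stop/GC/start sequence shown earlier in a script and schedule it with cron (the script path /opt/seafile/run-gc.sh and the 03:00 time are assumptions):

```
# Hypothetical crontab entry (edit with `crontab -e`):
# run the GC wrapper script every night at 03:00.
0 3 * * * /opt/seafile/run-gc.sh >> /var/log/seaf-gc.log 2>&1
```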

qnxor commented 9 years ago

So do I take it that the max-days history setting doesn't do anything by itself? Why is the option there then? Does Seafile create snapshots forever? Something doesn't make sense here...

I intend to have a few users from many parts of the world, so stopping the service whenever I see fit sounds quite awkward. I already have close to 100 GB and it's only been a couple of weeks.

shoeper commented 9 years ago

The history limit is an instruction to the garbage collector: it tells the GC which blocks are old enough to delete. Run the GC once a week and everything will be fine.

If live GC is important for you, you could buy the Pro version.

qnxor commented 9 years ago

A quota feature that requires the entire service to be interrupted for all users does not sound fine in our case - we are a big team of academics interested in deploying Seafile and we will all have quite big libraries, together totaling 4+ TB. We were looking at Seafile to avoid paying for cloud storage.

Do the desktop clients recover seamlessly when the server goes down and then comes back up, or do they have to be manually restarted too? Any idea how long, roughly, the GC (downtime) would take if the data on the server from all users totals 4-8 TB?

I quite appreciate the time you took answering my ongoing questions! Thanks again.

shoeper commented 8 years ago

Do the desktop clients recover seamlessly when the server goes down and then comes back up, or do they have to be manually restarted too?

Yes, they recover seamlessly.

Any idea how long, roughly, the GC (downtime) would take if the data on the server from all users totals 4-8 TB?

It is hard to say. Here it takes about 45 minutes for 800 GiB.