mnutt / davros

Personal file storage server
Apache License 2.0

Grain size incorrect again #49

Open ovizii opened 8 years ago

ovizii commented 8 years ago

I urgently needed to transfer an ISO from one PC to another, so I thought I'd give Davros a try. I created a grain, downloaded the client, and configured ownCloud. I then started copying the 2.9 GB ISO into the ownCloud folder and was quite impressed that the upload began immediately, while the file was still copying into the folder.

Once the upload was finished, my grain looked like this (compare the 3.13 GB vs. the 2.0 GB): [screenshot: davros12]

And on the target machine, when I installed ownCloud and downloaded the file, I ended up with a 169 MB file, even though ownCloud reported it as successfully synced.

All this doesn't inspire much confidence in entrusting my files to Davros :-(

mnutt commented 8 years ago

That is pretty concerning. Could you try downloading a grain backup to see if your file is intact there? I suspect it isn't. The next thing to check would be the owncloud client logs on the sending machine. Uploading large files involves sending 5 MB chunks and concatenating them, and there may be an issue there.
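To make the failure mode concrete, here is a rough sketch of what chunk assembly looks like in general; the function, parameter, and file names are hypothetical, not Davros's actual code. If a part is missing or short and the total size is never verified, the result is a silently truncated file.

```js
// Hypothetical sketch (not Davros's actual code): each 5 MB part is written
// to its own tempfile, then the parts are streamed into the final file.
// Verifying the total byte count at the end catches truncated uploads.
const fs = require('fs');
const path = require('path');

function assembleChunks(tmpDir, chunkNames, target, expectedSize, cb) {
  const out = fs.createWriteStream(target);
  let written = 0;

  function appendNext(i) {
    if (i >= chunkNames.length) {
      return out.end(() => {
        if (written !== expectedSize) {
          return cb(new Error('size mismatch: got ' + written +
                              ' bytes, expected ' + expectedSize));
        }
        cb(null);
      });
    }
    const part = fs.createReadStream(path.join(tmpDir, chunkNames[i]));
    part.on('data', (buf) => { written += buf.length; });
    part.on('error', cb);
    part.on('end', () => appendNext(i + 1));
    part.pipe(out, { end: false });   // keep the target open for the next part
  }

  appendNext(0);
}
```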

ovizii commented 8 years ago

So when I click the download icon, it spins for a long while and then eventually starts downloading a 2.8 GB zip file, which could indicate it contains the 2.9 GB ISO, but the download always gets corrupted, so I cannot check. The 2.0 GB size shown in the browser might just be cosmetic?

The client log is attached. owncloudsync.log.txt

ocdtrekkie commented 8 years ago

I was wondering why my Oasis account went from around 250 MB to 1.03 GB of usage. When I opened the one Davros grain I have photo uploads sent to, it said it held 103 MB of files, yet it accounted for 700 MB of space in Oasis' storage calculation.

mnutt commented 8 years ago

@ocdtrekkie I believe that issue occurs when the connection is broken and owncloud-client decides to re-upload; currently we upload to a series of tempfiles and then concatenate them into place when the upload is complete, but we don't do a good job of eventually deleting them if the upload fails.

digitalcircuit commented 8 years ago

@mnutt Ran into this issue while trying to synchronize 11 GB or so of data, including several large files (1 GB+); Sandstorm's supervisor keeps shutting down the grain despite ongoing network activity. I've filed a bug report for the Sandstorm supervisor here.

For Davros and cleaning up, does this bug report cover grain shutdown and disconnects, or should I file a new bug about managing temporary files?

Also, in the meantime, is there a safe way to clean up leftover temporary files? I've got about 2 GB of extra space used so far. I do have administrator access to the server.

mnutt commented 8 years ago

I'm trying to figure out a way to clear out stale temporary files. The main challenge is finding a heuristic that distinguishes stale files from those that just haven't finished uploading yet.

Unfortunately at the moment the only way I know of to clear out the extra space is to download a backup of the grain, delete the tmp directory, and restore it.
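For reference, here is a minimal sketch of the kind of heuristic sweep under discussion, assuming tempfiles live under /var/davros/tmp (the path mentioned later in this thread) and treating anything untouched for 24 hours as stale. The threshold is an arbitrary assumption, and its weakness is exactly the problem above: a very slow but still-active upload could look stale.

```js
// Minimal sketch of an mtime-based sweep. Assumptions: tempfiles live under
// /var/davros/tmp, and an in-progress upload keeps receiving writes, so its
// mtime stays fresh while a stale file's mtime keeps aging.
const fs = require('fs');
const path = require('path');

const TMP_DIR = '/var/davros/tmp';        // assumed location
const MAX_AGE_MS = 24 * 60 * 60 * 1000;   // assumed staleness threshold

function sweepStaleTempfiles() {
  const now = Date.now();
  fs.readdir(TMP_DIR, (err, names) => {
    if (err) return;
    names.forEach((name) => {
      const file = path.join(TMP_DIR, name);
      fs.stat(file, (err, st) => {
        if (err || !st.isFile()) return;
        if (now - st.mtime.getTime() > MAX_AGE_MS) {
          fs.unlink(file, () => {});      // best-effort delete
        }
      });
    });
  });
}
```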

paulproteus commented 8 years ago

Here's one thing you could do.

Goal: Allow Davros, while running, to periodically clear its temp directory

General strategy: hold an advisory lock on each temp file while Davros has it open, and have a periodic cleanup pass delete any temp file that isn't currently locked.

Details & references

You can use Linux "advisory locks". More info about them lives in Beej's Guide to Unix IPC.

With an advisory lock system, processes can still read from and write to a file while it's locked. Useless? Not quite, since there is a way for a process to check for the existence of a lock before a read or write. See, it's a kind of cooperative locking system. This is easily sufficient for almost all cases where file locking is necessary.

Interestingly, these locks go away when you close() the file descriptor. Per the close function's man page:

Any record locks (see fcntl(2)) held on the file it was associated with, and owned by the process, are removed (regardless of the file descriptor that was used to obtain the lock).

Therefore Davros could lock the files when it opens them, and then it wouldn't have to do any further bookkeeping down the road. (Note that you have to be sure you're OK with these "remove lock on close() semantics"!)

This has the advantage that, if Davros is killed, etc., the lock is automatically removed by the kernel, so on a fresh start Davros can accurately see that these temp files are not currently in use. I'm assuming here that this is desired behavior!

To use Linux advisory locking from nodejs, look into node-fs-ext and its support for flock().
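A rough sketch of how that could look, assuming fs-ext exposes flock(fd, flags, callback) with 'ex' (exclusive) and 'exnb' (exclusive, non-blocking) string flags; the helper names here are made up for illustration, not an existing Davros API.

```js
// Sketch of lock-guarded tempfiles (assumes the fs-ext module's flock()).
const fs = require('fs');
const fsExt = require('fs-ext');

// Writer side: take an exclusive lock for as long as the upload holds the
// tempfile open. The kernel drops the lock on close() or if the process dies.
function openLockedTempfile(file, cb) {
  fs.open(file, 'w', (err, fd) => {
    if (err) return cb(err);
    fsExt.flock(fd, 'ex', (err) => cb(err, fd));
  });
}

// Cleanup side: try a non-blocking exclusive lock. If it succeeds, no other
// process holds the file, so it's safe to treat as stale and delete it.
function deleteIfUnlocked(file, cb) {
  fs.open(file, 'r', (err, fd) => {
    if (err) return cb(err);
    fsExt.flock(fd, 'exnb', (lockErr) => {
      if (lockErr) {        // still locked by an active upload; leave it alone
        return fs.close(fd, () => cb(null, false));
      }
      // Unlink while still holding the fd, then close (which releases the lock).
      fs.unlink(file, (unlinkErr) => fs.close(fd, () => cb(unlinkErr, true)));
    });
  });
}
```

A periodic sweep in the Node process could then call deleteIfUnlocked() over everything in the temp directory, which lines up with unifying the cleanup into the nodejs process as suggested below.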

This should allow Davros to stop doing rm -rf /var/davros/tmp from the launcher.sh script. Instead, Davros can unify all filesystem cleanup into the nodejs process. IMHO this would be an improvement in clarity.

For backwards compatibility with older versions of Davros, you could switch to a new temp directory so that old-Davros's temp files are auto-removed without regard for lock files, etc. I'm not sure that's needed, but it's something you can consider.