mnutt / davros

Personal file storage server
Apache License 2.0
298 stars 35 forks source link

WARNING: etag of Apr is ` `. This must not happen. #22

Closed paulproteus closed 8 years ago

paulproteus commented 8 years ago

When synchronizing, I get a message like this sometimes:

!!! OCC::LsColJob created for "https://api.rose.sandcats.io/" + "//////2008"
WARNING: etag of Apr is   This must not happen.
WARNING: etag of Aug is   This must not happen.
WARNING: etag of Dec is   This must not happen.
WARNING: etag of Feb is   This must not happen.
WARNING: etag of Jan is   This must not happen.
WARNING: etag of Jul is   This must not happen.
WARNING: etag of Jun is   This must not happen.
WARNING: etag of Mar is   This must not happen.
WARNING: etag of May is   This must not happen.
WARNING: etag of Nov is   This must not happen.
WARNING: etag of Oct is   This must not happen.
WARNING: etag of Sep is   This must not happen.
void OCC::DiscoveryMainThread::singleDirectoryJobResultSlot(const QList<OCC::FileStatPointer>&) Have 13 results for  "//////2008"
static csync_vio_handle_t* OCC::DiscoveryJob::remote_vio_opendir_hook(const char*, void*) OCC::DiscoveryJob(0x55a02ff733d0) /////2008 ...Returned from main thread
[12/19 17:19:10.503436, 9] csync_walker:  directory: ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav//////2008/Apr [file_id=]
[12/19 17:19:10.503493, 8] sqlite_profile:  _SQL_ SELECT * FROM metadata WHERE phash=?1: 0
[12/19 17:19:10.503508, 8] _csync_detect_update:  file: /////2008/Apr, instruction: INSTRUCTION_NEW <<=
[12/19 17:19:10.503512, 4] csync_ftw:  URI without fuzz for ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav//////2008/Apr is "/////2008/Apr"
static csync_vio_handle_t* OCC::DiscoveryJob::remote_vio_opendir_hook(const char*, void*) OCC::DiscoveryJob(0x55a02ff733d0) /////2008/Apr Calling into main thread...
!!! OCC::LsColJob created for "https://api.rose.sandcats.io/" + "//////2008/Apr"
WARNING: etag of 22 is   This must not happen.
void OCC::DiscoveryMainThread::singleDirectoryJobResultSlot(const QList<OCC::FileStatPointer>&) Have 2 results for  "//////2008/Apr"
static csync_vio_handle_t* OCC::DiscoveryJob::remote_vio_opendir_hook(const char*, void*) OCC::DiscoveryJob(0x55a02ff733d0) /////2008/Apr ...Returned from main thread
[12/19 17:19:10.699804, 9] csync_walker:  directory: ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav//////2008/Apr/22 [file_id=]

Note the etag warning. Should I be worried?

Also, here's a read-only sharing link to my grain: https://rose.sandcats.io/shared/qO_9RFvfldvC46x4Mj1DGuOxrnHtB3HYRuuhk9tUrBo

Of note:

paulproteus commented 8 years ago

P.S. Thanks so much for davros. FWIW this is w/r/t the Davros grain that powers the static publishing behind asheesh.org. This shows up in my yak-shave queue because I'm trying to figure out why a new directory I added, html.flav, isn't getting synchronized up to the server, which is where my website's CSS is.

paulproteus commented 8 years ago

Here is the transition from no-duplicate-slashes to duplicate-slashes in the log, fwiw:

[12/19 17:48:25.391856, 8] sqlite_profile:  _SQL_ SELECT * FROM metadata WHERE phash=?1: 0
[12/19 17:48:25.391864, 9] _csync_detect_update:  Database entry found, compare: 1448602140 <-> 1448176660, etag: 9550e4766bafe2f8bd9466ea8584a668 <-> 9550e4766bafe2f8bd9466ea8584a668, inode: 0 <-> 330609, size: 187009 <-> 187009, perms:  <-> , ignore: 0
[12/19 17:48:25.391870, 8] _csync_detect_update:  file: 2015/Nov/index.html, instruction: INSTRUCTION_NONE <<=
static void OCC::DiscoveryJob::remote_vio_closedir_hook(csync_vio_handle_t*, void*) OCC::DiscoveryJob(0x562888667560) "/2015/Nov"
[12/19 17:48:25.391882, 9] csync_ftw:   <= Closing walk for ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav/2015/Nov with read_from_db 0
[12/19 17:48:25.391889, 9] csync_walker:  file: ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav/2015/index.html [file_id= size=16467]
[12/19 17:48:25.391914, 8] sqlite_profile:  _SQL_ SELECT * FROM metadata WHERE phash=?1: 0
[12/19 17:48:25.391921, 9] _csync_detect_update:  Database entry found, compare: 1448602139 <-> 1448601208, etag: 733c36ad909919d2e49585f9aa249286 <-> 733c36ad909919d2e49585f9aa249286, inode: 0 <-> 330610, size: 16467 <-> 16467, perms:  <-> , ignore: 0
[12/19 17:48:25.391927, 8] _csync_detect_update:  file: 2015/index.html, instruction: INSTRUCTION_NONE <<=
static void OCC::DiscoveryJob::remote_vio_closedir_hook(csync_vio_handle_t*, void*) OCC::DiscoveryJob(0x562888667560) "/2015"
[12/19 17:48:25.391939, 9] csync_ftw:   <= Closing walk for ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav/2015 with read_from_db 0
[12/19 17:48:25.391945, 9] csync_walker:  directory: ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav/ [file_id=]
[12/19 17:48:25.391962, 8] sqlite_profile:  _SQL_ SELECT * FROM metadata WHERE phash=?1: 0
[12/19 17:48:25.391969, 8] _csync_detect_update:  file: , instruction: INSTRUCTION_NEW <<=
[12/19 17:48:25.391973, 4] csync_ftw:  URI without fuzz for ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav/ is ""
static csync_vio_handle_t* OCC::DiscoveryJob::remote_vio_opendir_hook(const char*, void*) OCC::DiscoveryJob(0x562888667560)  Calling into main thread...
!!! OCC::LsColJob created for "https://api.rose.sandcats.io/" + ""
WARNING: etag of 2008 is   This must not happen.
WARNING: etag of 2009 is   This must not happen.
WARNING: etag of 2010 is   This must not happen.
WARNING: etag of 2011 is   This must not happen.
WARNING: etag of 2012 is   This must not happen.
WARNING: etag of 2013 is   This must not happen.
WARNING: etag of 2014 is   This must not happen.
WARNING: etag of 2015 is   This must not happen.
WARNING: etag of  is   This must not happen.
WARNING: etag of about is   This must not happen.
WARNING: etag of credits is   This must not happen.
WARNING: etag of dwad is   This must not happen.
WARNING: etag of gila is   This must not happen.
WARNING: etag of mmm is   This must not happen.
WARNING: etag of note is   This must not happen.
WARNING: etag of projects is   This must not happen.
WARNING: etag of pub is   This must not happen.
WARNING: etag of rmad is   This must not happen.
WARNING: etag of scribble is   This must not happen.
void OCC::DiscoveryMainThread::singleDirectoryJobResultSlot(const QList<OCC::FileStatPointer>&) Have 21 results for  ""
static csync_vio_handle_t* OCC::DiscoveryJob::remote_vio_opendir_hook(const char*, void*) OCC::DiscoveryJob(0x562888667560)  ...Returned from main thread
[12/19 17:48:25.625073, 9] csync_walker:  directory: ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav//2008 [file_id=]
[12/19 17:48:25.625124, 8] sqlite_profile:  _SQL_ SELECT * FROM metadata WHERE phash=?1: 0
[12/19 17:48:25.625136, 8] _csync_detect_update:  file: /2008, instruction: INSTRUCTION_NEW <<=
[12/19 17:48:25.625141, 4] csync_ftw:  URI without fuzz for ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav//2008 is "/2008"
static csync_vio_handle_t* OCC::DiscoveryJob::remote_vio_opendir_hook(const char*, void*) OCC::DiscoveryJob(0x562888667560) /2008 Calling into main thread...
!!! OCC::LsColJob created for "https://api.rose.sandcats.io/" + "//2008"
WARNING: etag of Apr is   This must not happen.
WARNING: etag of Aug is   This must not happen.
WARNING: etag of Dec is   This must not happen.
WARNING: etag of Feb is   This must not happen.
WARNING: etag of Jan is   This must not happen.
WARNING: etag of Jul is   This must not happen.
WARNING: etag of Jun is   This must not happen.
WARNING: etag of Mar is   This must not happen.
WARNING: etag of May is   This must not happen.
WARNING: etag of Nov is   This must not happen.
WARNING: etag of Oct is   This must not happen.
WARNING: etag of Sep is   This must not happen.
void OCC::DiscoveryMainThread::singleDirectoryJobResultSlot(const QList<OCC::FileStatPointer>&) Have 13 results for  "//2008"
static csync_vio_handle_t* OCC::DiscoveryJob::remote_vio_opendir_hook(const char*, void*) OCC::DiscoveryJob(0x562888667560) /2008 ...Returned from main thread
[12/19 17:48:25.832043, 9] csync_walker:  directory: ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav//2008/Apr [file_id=]
[12/19 17:48:25.832100, 8] sqlite_profile:  _SQL_ SELECT * FROM metadata WHERE phash=?1: 0
[12/19 17:48:25.832114, 8] _csync_detect_update:  file: /2008/Apr, instruction: INSTRUCTION_NEW <<=
[12/19 17:48:25.832118, 4] csync_ftw:  URI without fuzz for ownclouds://sandstorm:(removed)@api.rose.sandcats.io/remote.php/webdav//2008/Apr is "/2008/Apr"
static csync_vio_handle_t* OCC::DiscoveryJob::remote_vio_opendir_hook(const char*, void*) OCC::DiscoveryJob(0x562888667560) /2008/Apr Calling into main thread...
!!! OCC::LsColJob created for "https://api.rose.sandcats.io/" + "//2008/Apr"
mnutt commented 8 years ago

I believe OwnCloud maintains ETags of directories. These are hashes-of-hashes and work because OwnCloud maintains a mysql database of metadata, and pre-calculates ETags using md5 or sha1 or something on the file's contents. Any time a file changes, its etag gets updated and the etags of all of its ancestor directories also get recalculated.

Davros is a bit lazier, and computes etags very cheaply at access time by just looking at the filename + inode + mtime. Given a directory, it wouldn't be able to efficiently calculate a directory's etag since it would involve hashing potentially hundreds of large files.

In the future, if Davros gets a real metadata system other than the filesystem this should be possible. For now, I don't think this should hurt anything. It's just slightly less efficient.

paulproteus commented 8 years ago

Thanks for this reply! I guess it's sensible enough to keep this open for now? Up to you.

mnutt commented 8 years ago

I'm hoping https://github.com/mnutt/davros/pull/53 will fix this.

mnutt commented 8 years ago

I think this should be fixed too! Please let me know, if not.