mholt / photobak

Back up your content from Google Photos - DEPRECATED: use Timeliner
https://github.com/mholt/timeliner
309 stars 29 forks source link

Handle HTTP errors properly #8

Closed KonishchevDmitry closed 7 years ago

KonishchevDmitry commented 7 years ago

I've got a number of strange checksum mismatch, re-downloading errors on my repository and decided to investigate the reason. It turned out that current implementation of googlephotos module doesn't check HTTP status code of HTTP response and writes body of non-200 OK HTTP response in place of the photo to repository (!) + there is a bug in repo.go due to which all download errors are ignored because invalid err is used.

mholt commented 7 years ago

By the way, thanks so much for contributing this. I've been noticing behavior like this on my end once in a while too (but sparingly enough that I didn't worry about it). Maybe this will fix it!

KonishchevDmitry commented 7 years ago

You are welcome! Thank you for the project. :) By the way, I highly recommend you to fully erase your current repo and redownload it from scratch because there is a chance that you have broken files in it: you get checksum mismatch, re-downloading errors only in rare cases and most errors could be silently ignored and will never be detected.

mholt commented 7 years ago

Good point. So this is probably because there was an HTTP error during the first download of this file, so every subsequent successful download doesn't match because the wrong hash was stored.

I am at work at the moment so I don't have a chance to look in the code (and it's been a little while, so I've forgotten), but should we update the stored hash after a successful download? (If we don't already?) That way it should compare against the "latest" hash instead of always thinking it is corrupt.

KonishchevDmitry commented 7 years ago

The hash is updated, but we can't tell whether the file is corrupted until we can't compare our sha256 checksum with Google Photo's etag, which doesn't looks like a checksum.

mholt commented 7 years ago

LGTM. Thanks so much! I'll try this out and restore the state of my repository; instead of re-downloading everything, I'll probably just compile in a quick hack where if there is a checksum mismatch, it re-downloads (using this updated code) and then stores the correct checksum in the DB after a successful download. Then I'll revert to a version without that hack.