bbolli / tumblr-utils

Utilities for dealing with Tumblr blogs, Tumblr backup
GNU General Public License v3.0
667 stars, 124 forks

Incomplete download problem (found 702 posts, download only 111 of them) #151

Closed luboq closed 5 years ago

luboq commented 5 years ago

snip

cebtenzzre commented 5 years ago

@waldens Your "example post" gives a 404.

cebtenzzre commented 5 years ago

I was able to reproduce the issue with standard tumblr-utils. However, with the changes proposed by #114 (tested using a git clone of this), I was able to successfully download 319 posts. I'm still not sure why that's fewer than the expected 700+.

I compared the results to an equivalent download done by TumblThree and found that tumblr-utils did far better: not only did TumblThree find ~100 fewer likes, but it also just downloaded a dump of images and videos instead of a proper index. As far as I know, TumblThree scrapes the /liked/by page directly.