cebtenzzre / tumblr-utils

A fork of tumblr-utils with Python 3 support, bug fixes, and lots of features I found useful.
GNU General Public License v3.0

Only backs up from 2017 til now #6

Closed neftd closed 2 years ago

neftd commented 2 years ago

Hello, I hope I'm posting this in the right place. If not, please let me know.

I have the problem that my blog only gets backed up from 2017 to now, whether I run it with the incremental flag or not. When I try to back up a specific year before 2017, it doesn't work either.

The version: bc43e85, which I downloaded on the 11th of December.

The first time, which seemed to have gotten stuck at the time: tumblr_backup.py --tag-index --save-audio --save-video blogname

The second time, to make sure everything had been downloaded; it concluded pretty quickly, and then I noticed it only went up to 2017: tumblr_backup.py -i --tag-index --save-audio --save-video blogname

The attempt to get something before 2017: tumblr_backup.py -p 2015 --tag-index --save-audio --save-video blogname

It rolls back media but still counts up through the whole 59000 posts or so, and when it gets to the relevant posts (I figure) it stops. Then suddenly it goes BEYOND 59000, up to 99999, but it still hasn't downloaded anything before 2017. I've tried this several times.

See:

[screenshot]

[screenshot]

Nothing follows. I just press Ctrl-C to break out of it.

I'd be very grateful if you could take a look at this. My questions basically boil down to: what is causing this and how do I fix it?

Thank you very much.

cebtenzzre commented 2 years ago

When I originally implemented e7dffb1b back in 2020 I must not have tested it very well: although it isn't explicitly documented, the "before" and "offset" API parameters are incompatible for posts just as they are for likes ("before" takes priority).

This means -p doesn't work correctly right now: it only gets 50 posts and then gets stuck in a loop.

I should be able to fix that by walking posts by timestamp instead of by offset when -p is given, which we already do for likes. By the way, the post number in the status output always starts at zero; the only way to know how many posts come after a given timestamp is to start from the beginning.
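
Roughly the idea (a minimal sketch, not the actual tumblr_backup.py code; the endpoint and parameter names follow the public v2 posts API, but the api_key placeholder, page size, and missing error handling are simplifications):

import requests

API_KEY = "..."  # placeholder, not a real key
BLOG = "blogname.tumblr.com"
URL = "https://api.tumblr.com/v2/blog/%s/posts" % BLOG

def walk_posts_before(before_ts):
    # Yield posts older than before_ts, newest first, paging by timestamp
    # instead of by offset so the "before" parameter keeps working.
    while True:
        resp = requests.get(URL, params={
            "api_key": API_KEY,
            "limit": 20,          # per-request page size
            "before": before_ts,  # only posts published before this timestamp
        })
        posts = resp.json()["response"]["posts"]
        if not posts:
            return
        yield from posts
        # Next page: everything older than the oldest post we just saw.
        before_ts = min(p["timestamp"] for p in posts)

The key difference from offset-based paging is that each request derives its "before" value from the previous batch, so "offset" never has to appear in the same request.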

cebtenzzre commented 2 years ago

I just pushed 9ed65397. Check out the latest version and see if -p works for you. If it downloads no posts, then it's probably some API weirdness; I could suggest some modifications to aid in debugging that.

neftd commented 2 years ago

Thank you!! I will try it out.

I have tried it out.

What I did is this: I copied all the new files in the zip file to the folder that also contained my original backup (i.e. the folder called blogname) to replace the old ones. Then I ran:

tumblr_backup.py -p 2015 --tag-index --save-audio --save-video blogname

In the 'posts' folder there are now posts from 2015!! Hooray!! However, I cannot find 2015 pages in the archive folder, and 2015 has not been added to the index file. This might be because I fished the index file out of the 'posts' folder and pasted it a level higher so I could get to it easily.

I pasted the index back into 'posts' and ran the latest command again (with -i), and the index did not update.

I went to the folder containing the new version (which is called: tumblr-utils-9ed65397e52b18bde1e72bc708ce314fa10ec4f0, yes?) and ran: tumblr_backup.py -p 2015 --tag-index --save-audio --save-video blogname

This got stuck, but running an incremental version seemed to indicate it had everything. Unfortunately it only gave me 'posts' and 'media' folders instead of also the archive and tags (or theme, for that matter) folders like with the previous version.

Have I made a foolish mistake or does this version simply not produce these?

Thank you again for looking into it!

cebtenzzre commented 2 years ago

index.html should never have been in the "posts" folder; it's always (supposed to be) generated in the root of the backup.

When you only get "posts" and "media", that's because you killed the process before it could finish (which it won't, since it's the older version). The script skips generating the index in that case, but you can always regenerate it afterwards with -n 0, which means "back up no posts, just regenerate the index".
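
For example, something along these lines should rebuild it in place (using the blog name from your commands; including --tag-index here is just an illustration so the tag pages get rebuilt as well):

tumblr_backup.py -n 0 --tag-index blogname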

Are you sure the backup is exiting cleanly, with a "Stopping backup: Reached end of period" or similar followed by "N posts backed up"? Open a new issue if running with -n 0 still doesn't add those posts to the index, and include the full log of that command (redacted is fine) along with the exit status (after the script finishes, run echo %errorlevel% on Windows or echo $? on other platforms).