bitbybyte / fantiadl

Download posts and media from Fantia
MIT License

Populate database with existing posts #128

Closed. Suika closed this issue 3 months ago

Suika commented 4 months ago

Add an option to add already downloaded posts to the database. It seems that only newly scraped posts are added to the database, while already downloaded posts are reported as "Post appears to have been downloaded completely" but are never added. This in turn seems to cause the "too many requests" issue (#127), because those posts are not skipped early, unlike posts that are already in the db.

bitbybyte commented 4 months ago

Have you tried with --db-bypass-post-check? That output shows when the post is in the database, but by default we recheck all posts because contents can change/update/get added.

Suika commented 3 months ago

Sure, --db-bypass-post-check will skip posts that are in the DB, but what about posts that were downloaded before the db feature existed?

bitbybyte commented 3 months ago

That message only appears when a post exists in the database: https://github.com/bitbybyte/fantiadl/blob/master/models.py#L533

And the insert happens at L542 when it doesn't.
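
For readers following along, the check-then-insert flow described above can be sketched roughly as follows. This is a minimal illustration and not the actual code in models.py: the `posts` table, `open_db`, `process_post`, and the `download` callback are all hypothetical names chosen for the example, and only the `--db-bypass-post-check` behavior (skip posts already in the database) is taken from this thread.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open a database with a hypothetical single-column posts table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY)")
    return conn


def process_post(conn, post_id, download, bypass_post_check=False):
    """Handle one post and return a short status string.

    `download` is a hypothetical callback that fetches the post contents.
    """
    row = conn.execute(
        "SELECT 1 FROM posts WHERE id = ?", (post_id,)
    ).fetchone()

    if row is not None:
        # The post is already recorded in the database.
        if bypass_post_check:
            # Assumed equivalent of --db-bypass-post-check: skip entirely.
            return "skipped"
        # Default: recheck, since post contents can change or be added.
        download(post_id)
        return "rechecked"

    # The post is not in the database yet: download it, then record it.
    download(post_id)
    with conn:  # commits the insert on success
        conn.execute("INSERT INTO posts (id) VALUES (?)", (post_id,))
    return "inserted"
```

Under these assumptions, a post that was downloaded before the database existed would take the last branch on its next run and be inserted at that point, after which the bypass option could skip it.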

bitbybyte commented 3 months ago

@Suika If you believe there's a case where this doesn't apply let me know, otherwise I'll be closing this issue.