fake-name / xA-Scraper


[SUGGESTION] Allow different database for each art site #75

Closed by God-damnit-all 4 years ago

God-damnit-all commented 4 years ago

@fake-name As it stands, I really want to reset the database I have for Twitter since I accidentally have a bunch of filenames from when I was testing spliced in (I thought I'd made a backup before testing but apparently I only convinced myself I did).

I thought I could just drop all the tables related to Twitter, but that turns out to be no small feat because of all the relationships between the tables. Needless to say, it did not go well. (Luckily I did do a backup before I tried that, though.)

So I'm starting to think I might just rebuild my database from scratch, probably by running it behind a VPN on a VM. However, it made me realize that there's really no good reason for them to all be part of the same database when there's absolutely nothing linking their functionality together aside from some basic aggregation on the web interface's index page.

Given that platforms have been prone to making very sweeping changes that could render a lot of older data obsolete, could each platform get its own database? I don't really care about them sharing the same host/user/passwd, I just would like to have the ability to nuke one platform without affecting the rest.

fake-name commented 4 years ago

Oh gross, no. Multiple databases would be horribly messy.

Dropping all artists from one site isn't that hard. You'd need to do a few compound queries, but you can use something like DELETE FROM art_item WHERE artist_id IN (SELECT id FROM scrape_targets WHERE site_name='twit' AND artist_name='<name>'); or similar.

You'd need to further nest for deleting file references, but SQL isn't too hard to get working. I like to start with a SELECT query until I'm confident I'm getting just the db entries I want.
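
For anyone landing here later, here is a rough sketch of the "SELECT first, then DELETE" approach described above. It runs against a throwaway in-memory SQLite database, not the real one. Only art_item, scrape_targets, artist_id, id, site_name, artist_name and the 'twit' site name come from this thread; the art_file table, its item_id column, the 'other' site name and the function names are made up for illustration, so check the actual schema before adapting any of it.

```python
import sqlite3

# Hypothetical, simplified schema. art_file and item_id are assumptions;
# the real file-reference table will have a different name and columns.
SCHEMA = """
CREATE TABLE scrape_targets (
    id INTEGER PRIMARY KEY, site_name TEXT, artist_name TEXT);
CREATE TABLE art_item (
    id INTEGER PRIMARY KEY, artist_id INTEGER REFERENCES scrape_targets(id));
CREATE TABLE art_file (
    id INTEGER PRIMARY KEY, item_id INTEGER REFERENCES art_item(id));
"""

# Subquery reused by both the preview and the delete, so the dry run and the
# real run are guaranteed to target the same rows.
ITEMS_FOR_SITE = """
    SELECT id FROM art_item WHERE artist_id IN
        (SELECT id FROM scrape_targets WHERE site_name = ?)
"""


def preview_site(conn, site_name):
    """Dry run: count the rows a purge would remove, changing nothing."""
    items = conn.execute(
        f"SELECT COUNT(*) FROM ({ITEMS_FOR_SITE})", (site_name,)
    ).fetchone()[0]
    files = conn.execute(
        f"SELECT COUNT(*) FROM art_file WHERE item_id IN ({ITEMS_FOR_SITE})",
        (site_name,),
    ).fetchone()[0]
    return items, files


def purge_site(conn, site_name):
    """Delete file references first, then the items, in one transaction."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            f"DELETE FROM art_file WHERE item_id IN ({ITEMS_FOR_SITE})",
            (site_name,),
        )
        conn.execute(
            "DELETE FROM art_item WHERE artist_id IN "
            "(SELECT id FROM scrape_targets WHERE site_name = ?)",
            (site_name,),
        )


# Toy data: one 'twit' artist with two items, one artist on another site.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO scrape_targets VALUES (?, ?, ?)",
                 [(1, "twit", "alice"), (2, "other", "bob")])
conn.executemany("INSERT INTO art_item VALUES (?, ?)",
                 [(10, 1), (11, 1), (20, 2)])
conn.executemany("INSERT INTO art_file VALUES (?, ?)",
                 [(100, 10), (101, 11), (200, 20)])

print(preview_site(conn, "twit"))   # check the counts before deleting
purge_site(conn, "twit")
print(preview_site(conn, "other"))  # the other site should be untouched
```

The sketch leaves the scrape_targets rows in place, matching the query quoted above; removing the artists themselves would be one more DELETE at the end of the same transaction. Doing the child table first matters once foreign keys are enforced, which is presumably what made the table-dropping attempt go badly.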