datawhores / OF-Scraper

A completely revamped and redesigned fork, reimagined from scratch based on the original onlyfans-scraper
MIT License
699 stars, 59 forks

Feature Request: Execute command on scraping completion (not for each user) #429

Closed dunngitter closed 3 months ago

dunngitter commented 5 months ago

I see that there's already the capability to run a command after scraping each model, but as far as I can tell there is no option to run a given command after scraping finishes altogether. I guess it's only really useful in daemon mode (otherwise, you could just run ofscraper with && otherCommand to achieve this).

It would be super helpful if this could be added, so that I can write up a quick bash script to automatically trigger Stash to rescan my scraped dir upon scraping completion.

Jakan-Kink commented 3 months ago

You can have Stash scan the folder for each user, because the arguments sent to the given command are: 1 - username, 2 - user-id, 3 - media, 4 - posts, as can be seen in lines 59-67 of ofscraper/download/download.py: https://github.com/datawhores/OF-Scraper/blob/7e318a99c057471c59eb670d45b477fd2825b915/ofscraper/download/download.py#L59-L67

So with that in hand, your script could call something like stashapi.stashapp.metadata_scan with a one-element list containing the path to that user's directory: https://github.com/stg-annon/stashapi/blob/b1580be2afdbfe15b7d051311c1012dd81c158c2/stashapp.py#L294
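
A minimal sketch of such a per-user script, assuming the argument order listed above (username, user-id, media, posts) and a hypothetical library root where each user gets a directory named after the username. SCRAPE_ROOT, user_scan_paths, and main are illustrative names, not part of OF-Scraper or stashapi, and the scan itself is passed in as a callable so the sketch does not guess at any Stash client signature:

```python
import sys
from pathlib import Path
from typing import Callable, Sequence

# Hypothetical root of the scraped library; adjust to match your dir_format.
SCRAPE_ROOT = Path("/data/ofscraper")


def user_scan_paths(argv: Sequence[str], root: Path = SCRAPE_ROOT) -> list[str]:
    """Return the one-element path list for the user named in argv.

    Assumes argv follows the order described in the thread:
    username, user-id, media, posts.
    """
    if len(argv) < 2:
        raise ValueError("expected at least: username user-id")
    username = argv[0]
    return [str(root / username)]


def main(argv: Sequence[str], scan: Callable[[list[str]], object]) -> None:
    """Hand the user's directory to whatever triggers the scan."""
    scan(user_scan_paths(argv))


if __name__ == "__main__" and len(sys.argv) >= 3:
    # Placeholder: print stands in for a real scan call, such as the
    # stashapi metadata_scan method linked above.
    main(sys.argv[1:], scan=print)
```

Keeping the scan as an injected callable means the path logic can be checked without a running Stash instance. Note that a later comment in this thread reports the argument format changed in newer versions, so the positional layout here may need adjusting.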

Jakan-Kink commented 3 months ago

Or even better, as I am starting to do in the of-scraper-post subproject of my of-tools repo, have the post-download script trigger not just the scan but the full generate command, and then (not complete in my script as of this writing) even put the scraped metadata directly into Stash.

datawhores commented 3 months ago

This has been added, but it doesn't receive as much data as the post_download_script.

It gets just a simple JSON:

    out_dict = {
        "users": users,
        "dir_format": config_data.get_dirformat(),
        "file_format": config_data.get_fileformat(),
        "metadata": config_data.get_metadata(),
    }
Jakan-Kink commented 3 months ago

Just for clarity, this was added in 3.11.1, and the option is post_script, which can be in any of three locations in the config file: post_script, advanced_options.post_script, or script_options.post_script.

https://github.com/datawhores/OF-Scraper/blob/e59fe867ce149d7b5c3576f30a264e640ff25c83/ofscraper/utils/config/data.py#L283-L294

The confusing bit is the config updater moved post_download_script and post_script into

    "scripts": {
        "post_download_script": "",
        "post_script": ""
    },

so the configured script doesn't seem to be used.

Jakan-Kink commented 3 months ago

so after adding

    elif config.get("scripts", {}).get("post_download_script") is not None:
        val = config.get("scripts", {}).get("post_download_script")

into get_post_download_script and

    elif config.get("scripts", {}).get("post_script") is not None:
        val = config.get("scripts", {}).get("post_script")

into get_post_script, the scripts started being called. However, the format of the passed arguments is now completely different, and on top of that it looks like post_script is only called from ofscraper.final.final.final(), which is called by ofscraper.commands.commands.scraper.manager.execute.runner(), not by daemon mode.
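
The fallback idea behind that patch can be sketched generically as below. This is not the actual code from ofscraper/utils/config/data.py; lookup_script is a hypothetical helper that checks the three locations named earlier plus the "scripts" section written by the config updater, in that order. Unlike the `is not None` check in the patch, it treats an empty string as unset, since the updater writes "" by default:

```python
from typing import Any, Mapping, Optional


def lookup_script(config: Mapping[str, Any], key: str) -> Optional[str]:
    """Return the first non-empty value for key among the known locations."""
    candidates = (
        (key,),                      # top-level, e.g. post_script
        ("advanced_options", key),
        ("script_options", key),
        ("scripts", key),            # where the config updater writes
    )
    for path in candidates:
        node: Any = config
        for part in path:
            if not isinstance(node, Mapping) or part not in node:
                node = None
                break
            node = node[part]
        if node:  # skip missing values and empty strings
            return node
    return None
```

For example, lookup_script(config, "post_script") would return the top-level value if one is set and otherwise fall through to the nested sections.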

datawhores commented 3 months ago

> just for clarity, this was added to 3.11.1, and it is post_script which can be in either of 3 locations in the config file: post_script advanced_options.post_script script_options.post_script
>
> https://github.com/datawhores/OF-Scraper/blob/e59fe867ce149d7b5c3576f30a264e640ff25c83/ofscraper/utils/config/data.py#L283-L294
>
> The confusing bit is the config updater moved post_download_script and post_script into
>
>     "scripts": {
>         "post_download_script": "",
>         "post_script": ""
>     },

Some of that is just for backwards compatibility. The function there is responsible for finding where in the config the desired data is.

The final format will always follow the schema found in

ofscraper/utils/config/schema.py

So you can put it in any of those three locations, but it will always end up in the same place once the config finishes processing.

One of those locations is actually for the prompt menu, so that a flattened config can be processed before formatting. Otherwise I would have to write into the prompt menu where to update the config.

> so after adding
>
>     elif config.get("scripts", {}).get("post_download_script") is not None:
>         val = config.get("scripts", {}).get("post_download_script")
>
> into get_post_download_script and
>
>     elif config.get("scripts", {}).get("post_script") is not None:
>         val = config.get("scripts", {}).get("post_script")
>
> into get_post_script the scripts started being called, however the format of the passed arguments is now completely different; but on top of that it looks like post_script is only called by ofscraper.final.final.final() by ofscraper.commands.commands.scraper.manager.execute.runner() not by daemon mode.

I realized there were a lot of places where it wasn't being called, so I redid it for 3.11.2.

dunngitter commented 3 months ago

Nice, thanks! In my particular case I don't really need much info, I just need it to trigger a script upon scraping completion so I can trigger a scan in my stash instance that holds this content.

I haven't tried this yet but I will as soon as I get a chance, and I'll report back. Thanks for adding it!

datawhores commented 3 months ago

Closing because the feature was already added