stashapp / CommunityScripts

This is a public repository containing plugin and utility scripts created by the Stash Community.
https://docs.stashapp.cc/add-ons/
GNU Affero General Public License v3.0
180 stars 140 forks source link

[Feature] Batch Import of Scene Metadata from Stash-Box to Stash #407

Open 8ullyMaguire opened 4 months ago

8ullyMaguire commented 4 months ago

Is your feature request related to a problem? Please describe. I often struggle to identify a specific scene from stash when I only know the studio or performer and duration.

Describe the solution you'd like I would like to be able to batch import scene metadata from stash-box into stash, specifically for a given studio or performer. This would allow me to easily sort and filter scenes by duration, making it simpler to identify the scene I'm looking for.

Describe alternatives you've considered I have considered manually searching for each scene individually, but this is time-consuming and inefficient.

Additional context This feature would significantly enhance my workflow by providing a more efficient way to manage and identify scenes in stash.

8ullyMaguire commented 3 weeks ago

I have a script that solved this issue for studios. It just prints in the terminal all the scenes for a studio sorted by duration so I can find a scene quickly.

import requests
from typing import Dict, List, Optional, Union
import logging
from datetime import timedelta

STASHBOX_GRAPHQL_ENDPOINT = "https://stashdb.org/graphql"
STASHBOX_API_KEY = ""
STUDIO_ID = ""

# Configure logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')

def execute_graphql_query(query: str, variables: Optional[Dict] = None) -> Dict:
    headers = {
        "Content-Type": "application/json",
        "ApiKey": STASHBOX_API_KEY
    }
    payload = {"query": query}
    if variables:
        payload["variables"] = variables

    logging.debug(f"Executing GraphQL query: {query}")
    logging.debug(f"Variables: {variables}")

    try:
        response = requests.post(STASHBOX_GRAPHQL_ENDPOINT, json=payload, headers=headers)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        logging.error(f"Error executing GraphQL query: {e}")
        raise

    logging.debug(f"Response: {response.json()}")
    return response.json()["data"]

def get_studio_scenes(studio_id: str) -> List[Dict]:
    query = """
    query Scenes($input: SceneQueryInput!) {
        queryScenes(input: $input) {
            count
            scenes {
                id
                release_date
                title
                duration
            }
            __typename
        }
    }
    """

    page = 1
    total_scenes = float("inf")  # Initialize with a large value
    all_scenes: List[Dict] = []  # Create a list to store all fetched scenes

    while True:
        variables = {
            "input": {
                "direction": "DESC",
                "page": page,
                "parentStudio": studio_id,
                "per_page": 100,
                "sort": "DATE",
            }
        }

        logging.debug(f"Fetching scenes for page {page}")
        try:
            result = execute_graphql_query(query, variables)
        except Exception as e:
            logging.error(f"Error fetching scenes for page {page}: {e}")
            break

        scenes = result["queryScenes"]["scenes"]
        total_scenes = result["queryScenes"]["count"]
        logging.debug(f"Page {page}: Fetched {len(scenes)} scenes, total scenes: {total_scenes}")
        all_scenes.extend(scenes)  # Append the fetched scenes to the all_scenes list

        # Break if we have fetched all the scenes
        if len(all_scenes) >= total_scenes:
            break

        page += 1

    return all_scenes

def format_duration(duration: int) -> str:
    return str(timedelta(seconds=duration))

def main() -> None:
    scenes = list(get_studio_scenes(STUDIO_ID))

    if not scenes:
        logging.warning("No scenes fetched")
        return

    scenes.sort(key=lambda x: x['duration'] or 0, reverse=True)

    for scene in scenes:
        print(
            f"Duration: {format_duration(scene['duration'] or 0)}",
            f"Release Date: {scene['release_date']}, "
            f"Title: {scene['title']}, Scene ID: {scene['id']}"
        )

if __name__ == "__main__":
    main()
stg-annon commented 2 weeks ago

Identify is the task you're looking for to bulk match scenes from a stash-box it will match based on PHASH/duration

If you want to limit it to a specific studio you can create a scene filter for that studio or go to the studio page select all scenes click the ... menu again then select Identify this will pull down all the metadata you specify in the task

8ullyMaguire commented 2 weeks ago

This is still necessary for the scenes that aren't automatically matched when using the Identify task.

stg-annon commented 2 weeks ago

ahh I misunderstood you're looking to bulk create scenes without files then?

8ullyMaguire commented 2 weeks ago

Yes.