[Feature] Option to utilize JSON files metadata

cjemorton commented 1 year ago

Is your feature request related to a problem? Please describe. A checkbox option to search for and utilize a JSON file that contains metadata such as the title.

Describe the solution you'd like In each video, I'd like to see a separate option or checkbox that toggles the use of metadata from a JSON file on and off.

For example when a video file is downloaded with some tools there is an option to generate an info.json file containing metadata about the video file.

I'd like the option to toggle searching for a JSON metadata file on and off, and then to select which tags in the file to use for which field in the video. Once the data is imported into the database, a simple check on scan would see whether the data in the info.json file has changed and, if it has, update the database.

Describe alternatives you've considered

I have considered writing a scraper myself for this. I think metadata from a JSON file is common, and having it automatically detected when a checkbox is selected for that video should be core functionality rather than a separate scraper.

Off by default, but if selected, it opens a form where you can select the JSON file (if it's not automatically found with the same name as the video file) and then select which tags to use to populate which field.

This would even allow someone to write custom json files and scrapers and have a way to populate the data by simply importing a batch of JSON files.

cjemorton commented 1 year ago

Basically, the workflow I envision is:

Import video and json file into stash.
Scrape for new content.
Click on the content in stash, and select an option to use info.json metadata.
info.json files with the exact same name as the video file, minus the extensions, are found automatically; an option to browse for an info.json file is available if one is not found automatically.
A separate directory to search for info.json files matching the filenames can be set, with the default being info.json files in the same directory as their video files.
If an info.json metadata file is found, an info.json tag template can be applied to that file (one can be set as default, or custom ones can be added).
Then stash knows which tags in the info.json file to use for which fields.
Data from the info.json is populated into the stash database, and everything runs as normal.
A "Check JSON" button is available to check if the info.json file has changed when viewing the video file, or can be toggled as a batch option during scan.
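
The lookup and template steps above could be sketched roughly as follows. This is only an illustration, not how stash works: the function names, the `.info.json` / `.json` suffixes tried, and the template keys (`title`, `description`, `uploader`) are all assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical tag template: stash field name -> key in the info.json file.
# The keys on the right are assumptions; real files may use other names.
DEFAULT_TEMPLATE = {
    "title": "title",
    "details": "description",
    "studio": "uploader",
}


def find_info_json(video: Path, search_dir: Optional[Path] = None) -> Optional[Path]:
    """Find a sidecar named like the video minus its extension.

    Looks next to the video by default, or in search_dir if one is given.
    """
    directory = search_dir if search_dir is not None else video.parent
    for suffix in (".info.json", ".json"):
        candidate = directory / (video.stem + suffix)
        if candidate.is_file():
            return candidate
    return None


def apply_template(info: dict, template: dict) -> dict:
    """Copy only the templated keys that are present in the info.json data."""
    return {field: info[key] for field, key in template.items() if key in info}


def load_metadata(video: Path, search_dir: Optional[Path] = None,
                  template: dict = DEFAULT_TEMPLATE) -> dict:
    """Return mapped metadata, or an empty dict when no sidecar exists."""
    sidecar = find_info_json(video, search_dir)
    if sidecar is None:
        return {}  # nothing found: fall back to the current behaviour
    with sidecar.open(encoding="utf-8") as handle:
        return apply_template(json.load(handle), template)
```

Returning an empty dict when no sidecar exists mirrors the "nothing is populated" fallback described in this comment.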

This way stash can utilize the functionality from these tools that create info.json files automatically when the video file is obtained. If no info.json file is available, nothing is populated, and it defaults back to the way things are now.

By doing it this way, stash does not slow down from having to read the info.json file every time a video file is loaded. All the data is stored in the database, and stash only checks whether the info.json file has been updated on scan or when triggered manually per file. Additionally, tags like uploader fields and IDs, timestamps of when the file was added, like counts, etc. can be added and displayed in a frozen, display-only state. Update the JSON file by scraping for it again, and those counts and bits of information can update to show current information. Users and uploaders can be linked to studios etc.
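
One possible way to do the "has it changed?" check is to store a content hash next to the imported data and compare it on scan. A minimal sketch, assuming a SHA-256 digest is kept in the database; `file_digest` and `needs_update` are hypothetical names, not stash functions:

```python
import hashlib
from pathlib import Path
from typing import Optional


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def needs_update(path: Path, stored_digest: Optional[str]) -> bool:
    """True if the file was never imported or its contents have changed."""
    return stored_digest is None or file_digest(path) != stored_digest
```

Comparing modification times would be cheaper, but a content hash also catches files that were rewritten with identical timestamps.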

cjemorton commented 1 year ago

Additionally, this would allow a simple "launch custom script" button in stash for each video. This script could do whatever backend processing you need, written in whatever language you like. So, for example, clicking the option to launch a custom script would trigger a script that updates a specific JSON file by scraping a URL for new data.

Phasetime commented 1 year ago

Thanks for your suggestion! I think there is currently a guideline to keep the core functionality of stash as small as possible and let plugins take on anything extra. So here would be my suggestion for your idea:

Firstly, I think a JSON scraper would fill the role of metadata ingestion perfectly. The main pain points of this approach would be that it's not automated on scan and that you can't specify a custom directory. But I'd say both could be addressed with a plugin on the Scene.Create.Post hook that searches the directory of the scene's file and runs matching (JSON, XML, NFO) scrapers if it finds any. You could probably set any arbitrary path to search in that plugin's settings as well.
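
The directory search such a plugin would run could look roughly like this. It deliberately leaves out how stash invokes a plugin or passes the scene to it; the scraper names in the registry and the `plan_scrapes` function are made up for illustration.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# Hypothetical registry: sidecar extension -> name of the scraper to run.
SCRAPER_FOR_EXTENSION = {
    ".json": "json-scraper",
    ".xml": "xml-scraper",
    ".nfo": "nfo-scraper",
}


def plan_scrapes(scene_file: Path,
                 extra_dir: Optional[Path] = None) -> List[Tuple[str, Path]]:
    """List (scraper name, sidecar path) pairs for a newly created scene.

    Searches the scene's own directory first, then the optional extra
    directory that a plugin setting might provide.
    """
    directories = [scene_file.parent]
    if extra_dir is not None:
        directories.append(extra_dir)

    plan = []
    for directory in directories:
        for extension, scraper in SCRAPER_FOR_EXTENSION.items():
            candidate = directory / (scene_file.stem + extension)
            if candidate.is_file():
                plan.append((scraper, candidate))
    return plan
```

The hook handler would then call each listed scraper with its file; that part depends on the plugin interface and is not shown here.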

stg-annon commented 1 year ago

I think @Phasetime has the correct approach. Even if this were implemented as a core feature of stash, there won't be just one JSON format that metadata will be formatted in, so each format would require a different parser. At that point it is essentially a scraper, but in a separate system from scrapers, which would be another thing to maintain and would likely not get much attention after its initial implementation.

Curious as to what unchecking this checkbox would do in your mind? Would it delete the existing metadata?

scruffynerf commented 1 year ago

I've done half of this via plugins/scrapers now, and it would make sense to make a generic solution that could easily be reused, perhaps with a mapping approach (i.e. similar to the way JSON scrapers work; in fact, ideally exactly matching how JSON scrapers work: Name: data.name).

Now I'm wondering if a python-y scraper that basically could load up a JSON file and pass it to a scraper is the way to go... then the scraper is basically dual purpose: if you give it a URL-based scrape, it works, and if you give it a file-based scrape, it works. And URL-based scrapes could get cached/downloaded...
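
To make the `Name: data.name` mapping idea concrete, here is a small sketch of resolving dotted paths against a loaded JSON document. It is a simplified stand-in, not the path syntax the JSON scrapers actually implement; `resolve` and `map_fields` are hypothetical helpers.

```python
from typing import Any, Dict


def resolve(document: Any, path: str) -> Any:
    """Follow a dotted path such as 'data.name' through nested dicts.

    Returns None when any step of the path is missing.
    """
    current = document
    for part in path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return None
    return current


def map_fields(document: dict, mapping: Dict[str, str]) -> Dict[str, Any]:
    """Apply a {field: dotted path} mapping to a document."""
    return {field: resolve(document, path) for field, path in mapping.items()}


# Example: the mapping from the comment, applied to a made-up document.
example = map_fields({"data": {"name": "Some title"}}, {"Name": "data.name"})
```

A mapping like this could live in a config file, so supporting a new JSON layout would only mean writing a new mapping.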

scruffynerf commented 1 year ago

@WithoutPants reading the scraper dev docs, it looks like there is no easy way to send the JSON scraper a local file (you could massage the URL, but you don't get passed the full path, just the base filename, so even if a JSON file was stored next to the original, you couldn't easily get the file:// URL).

Proposed fix:

1) Add 'filepath' to the placeholder fields. Then a scraper could just handle this directly, nothing else needed: just regex massage the filepath to point to the JSON file, give it as the query URL (file://filepath.json), and scrape as usual.

2) Add an easy(ier?) method to send the JSON scraper a local JSON file. I think the file:// method is a bit clunky, and maybe a direct path would be cleaner? Limit it to within the library(ies)?

With the above, purely yml based json file handlers would be easy to write up.
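
For illustration, the path-to-URL step in option 1 amounts to something like the following, assuming the full path of the video were available to the scraper. `sidecar_url` is a hypothetical helper; `PurePosixPath` is used so the result does not depend on the platform running it.

```python
from pathlib import PurePosixPath


def sidecar_url(video_path: str, suffix: str = ".json") -> str:
    """Swap the video's extension for the sidecar suffix and return a file:// URL.

    The path must be absolute; as_uri() raises ValueError for relative paths.
    """
    return PurePosixPath(video_path).with_suffix(suffix).as_uri()


url = sidecar_url("/library/clip.mp4")  # "file:///library/clip.json"
```

Note that `as_uri()` percent-encodes characters such as spaces, which a plain regex substitution on the path would not do.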

cjemorton commented 10 months ago

Interesting approach guys.

I just pulled in the latest build, v0.24.3 Build hash: aeb68a58, and spun it up. The framework for community scanners and plugins looks interesting. I think you guys are probably right that this is the correct approach.

I'm still monitoring the progress on this!

stashapp / stash

[Feature] Option to utilize JSON files metadata #4265