stashapp / stash-box

Stash App's own OpenSource video indexing and Perceptual Hashing MetaData API
MIT License

[RFC] Scene import process #116

Open WithoutPants opened 3 years ago

WithoutPants commented 3 years ago

Problem

Need to provide a native data import process which handles mapping to other options. The process should allow for the creation of complete scene data without needing to perform post-import modifications.

Proposal

This RFC is based on the previous work in the branch at https://github.com/InfiniteTF/stash-box/tree/bulk-import

Here are the basic requirements I think are needed for the MVP import functionality:

Here is how I envision the import process working:

Other considerations

Implementation details

Database schema

Add new import_data table:

If we need to identify stale imports, then we'll need a separate table to identify pending imports, with a user and date.
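
As a rough illustration of the stale-import idea, the sketch below tracks pending imports with a user and a creation date, then lists those older than a cutoff. The table name `pending_imports`, its columns, and the helpers `add_pending` and `stale_imports` are hypothetical and not part of stash-box. SQLite is used here only so the example runs on its own.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema: one row per import that has been started but not completed.
SCHEMA = """
CREATE TABLE pending_imports (
    id         INTEGER PRIMARY KEY,
    user_id    TEXT    NOT NULL,
    created_at INTEGER NOT NULL  -- Unix timestamp in seconds (UTC)
);
"""


def add_pending(conn: sqlite3.Connection, user_id: str, created_at: datetime) -> int:
    """Record a pending import for a user and return its row id."""
    cur = conn.execute(
        "INSERT INTO pending_imports (user_id, created_at) VALUES (?, ?)",
        (user_id, int(created_at.timestamp())),
    )
    return cur.lastrowid


def stale_imports(conn: sqlite3.Connection, now: datetime, max_age: timedelta):
    """Return (id, user_id) for pending imports created more than max_age before now."""
    cutoff = int((now - max_age).timestamp())
    return conn.execute(
        "SELECT id, user_id FROM pending_imports WHERE created_at < ? ORDER BY id",
        (cutoff,),
    ).fetchall()
```

With a seven-day `max_age`, an import started ten days ago would be reported as stale and one started an hour ago would not. What to do with stale rows (delete them, notify the user) is left open here.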

Graphql schema

GUI changes

Extensions for future iterations

bnkai commented 3 years ago

A few thoughts I had concerning the import CSV functionality. (I didn't mention them in the PR since they might be out of scope.)

laurus-lx commented 3 years ago

Are there any mechanisms in place to handle deduplication?

If not, I would suggest using the canonical URL for duplicate detection on scene pages that are available on the web from the original publisher. For older content, use set/scene/movie links on the indexing sites (these sites could also be used to strengthen duplicate rejection when publishers recycle content).
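
A minimal sketch of URL-based duplicate detection, covering only the canonical URL part of the suggestion. The normalization rules (ignore scheme differences, lowercase the host, drop a leading `www.`, drop the query string and fragment, strip trailing slashes) are assumptions for illustration, and the names `canonical_url` and `split_duplicates` are hypothetical rather than anything in stash-box. Dropping the query string would be wrong for sites that put the scene ID there, so real rules would likely need to be per-site.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Normalize a scene URL so trivially different forms compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    # Assumption: http and https point at the same page, so force one scheme.
    return urlunsplit(("https", host, path, "", ""))


def split_duplicates(existing: dict, rows: list):
    """Split import rows into new scenes and duplicates.

    existing maps canonical URL -> identifier of a scene already stored.
    Each row is a dict with at least a "url" key.
    Returns (fresh_rows, [(duplicate_row, identifier_of_first_match), ...]).
    """
    seen = dict(existing)
    fresh, duplicates = [], []
    for number, row in enumerate(rows, start=1):
        key = canonical_url(row["url"])
        if key in seen:
            duplicates.append((row, seen[key]))
        else:
            # Remember this row so later rows in the same batch are also caught.
            seen[key] = f"import row {number}"
            fresh.append(row)
    return fresh, duplicates
```

Under these rules, `http://www.example.com/scenes/123/?ref=feed` and `https://example.com/scenes/123` are treated as the same scene. Duplicates are reported rather than silently dropped, so a reviewer could decide whether to merge or reject them.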