Open WithoutPants opened 3 years ago
A few thoughts I had concerning the import CSV functionality. (I didn't mention them in the PR since they might be out of scope.)
Plain users shouldn't be able to overload the server resources.
strings.HasPrefix to check for the [ and attempt to JSON-parse the record value automatically. CSV files like that are what Webscraper produces (I posted some actual files in the stash-box Discord channel around January). Scrappy, by the way (I haven't actually used it), seems to use the delimiter approach.
Is there any mechanism in place to handle deduplication?
If not, I would suggest using the canonical URL for duplicate detection for scenes whose pages are available on the web from the original publisher, and for older content, using set/scene/movie links on the indexing sites (these sites could also be used to strengthen duplicate rejection when publishers are recycling content).
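Going back to the strings.HasPrefix suggestion in the first comment, here is a minimal sketch of how a CSV field could be checked for a leading [ and JSON-parsed, falling back to the raw text otherwise. The helper name parseListValue is hypothetical, and it assumes the arrays contain plain strings, which may not match what Webscraper actually emits.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseListValue is a hypothetical helper, not part of stash-box.
// If the field looks like a JSON array it tries to decode it as a list
// of strings; if decoding fails, or the field is not an array, the raw
// text is returned as a single-element list.
func parseListValue(raw string) []string {
	trimmed := strings.TrimSpace(raw)
	if trimmed == "" {
		return nil
	}
	if strings.HasPrefix(trimmed, "[") {
		var items []string
		if err := json.Unmarshal([]byte(trimmed), &items); err == nil {
			return items
		}
		// Not valid JSON, or not an array of strings: keep the raw text.
	}
	return []string{trimmed}
}

func main() {
	fmt.Println(len(parseListValue(`["outdoor","interview"]`)))
	fmt.Println(len(parseListValue("outdoor")))
}
```

A delimiter-based split could sit in the fallback branch instead of returning the raw value, if that approach is preferred.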
Problem
Need to provide a native data import process which handles mapping to other options. The process should allow for the creation of complete scene data without needing to perform post-import modifications.
Proposal
This RFC is based on the previous work in the branch at https://github.com/InfiniteTF/stash-box/tree/bulk-import
Here are the basic requirements I think are needed for the MVP import functionality:
Here is how I envision the import process to work:
Other considerations
Implementation details
Database schema
Add new import_data table:
user_id - assuming a single import per user, this plus row should be a sufficient composite key
row - index within the CSV or JSON input
data - a jsonb encoding of the raw row data
If we need to identify stale imports, then we'll need a separate table to identify pending imports, with a user and date.
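To make the import_data shape a bit more concrete, here is a rough sketch of turning an uploaded CSV into one record per row, with the raw row stored as JSON. The Go type and function names (ImportRow, rowsFromCSV, rowsFromCSVString) are made up for illustration, and it assumes a header line, a string user ID, and 0-based row indexes, none of which the RFC specifies.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// ImportRow mirrors the proposed import_data columns.
// Field names and Go types are assumptions for this sketch.
type ImportRow struct {
	UserID string          // user_id; together with Row forms the composite key
	Row    int             // index of the record within the uploaded file
	Data   json.RawMessage // raw row data, suitable for a jsonb column
}

// rowsFromCSV reads a CSV with a header line and returns one ImportRow
// per record, keyed by the header names.
func rowsFromCSV(userID string, r io.Reader) ([]ImportRow, error) {
	reader := csv.NewReader(r)
	header, err := reader.Read()
	if err != nil {
		return nil, fmt.Errorf("reading header: %w", err)
	}
	var rows []ImportRow
	for i := 0; ; i++ {
		// csv.Reader rejects records whose field count differs from the header.
		record, err := reader.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, fmt.Errorf("reading record %d: %w", i, err)
		}
		obj := make(map[string]string, len(header))
		for col, name := range header {
			obj[name] = record[col]
		}
		data, err := json.Marshal(obj)
		if err != nil {
			return nil, fmt.Errorf("encoding record %d: %w", i, err)
		}
		rows = append(rows, ImportRow{UserID: userID, Row: i, Data: data})
	}
	return rows, nil
}

// rowsFromCSVString is a convenience wrapper for in-memory input.
func rowsFromCSVString(userID, content string) ([]ImportRow, error) {
	return rowsFromCSV(userID, strings.NewReader(content))
}

func main() {
	rows, err := rowsFromCSVString("user-1", "title,url\nScene A,https://example.com/a\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(rows[0].Row, string(rows[0].Data))
}
```

Keeping the raw row untouched in data means field mappings can be changed and re-applied without re-uploading the file.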
GraphQL schema
submitImport mutation
  data file upload, a data filetype, and a list of fieldMapping objects
  ImportFieldMappingInput includes outputField, optional inputField, optional fixedValue, optional list of regex replacements
queryImportData query, accepting page size and page. Returns parsed import data (does not translate tag/studio/performers).
importMappings query.
  ImportMapping object, which contains studios, performers, tags fields, which are lists of name/id pairs
completeImport mutation
  ImportMappingInput object which is basically the same structure as ImportMapping
abortImport mutation
GUI changes
Import top-level menu item for allowed users
Extensions for future iterations
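Referring back to ImportFieldMappingInput in the GraphQL schema section (this is not a proposed extension), here is a rough sketch of how one raw row might be translated using outputField, inputField, fixedValue and the regex replacements. The Go types, the shape of a regex replacement (pattern plus replacement string), and the rule that fixedValue wins over inputField are all assumptions, since the RFC does not define them.

```go
package main

import (
	"fmt"
	"regexp"
)

// RegexReplacement is a hypothetical shape; the RFC only mentions
// "a list of regex replacements".
type RegexReplacement struct {
	Pattern string
	Replace string
}

// FieldMapping loosely mirrors ImportFieldMappingInput.
type FieldMapping struct {
	OutputField  string
	InputField   *string // optional: column to read from the raw row
	FixedValue   *string // optional: constant used instead of a column
	Replacements []RegexReplacement
}

// applyMappings builds the output record for one raw row.
// Assumption: FixedValue takes precedence and is used as-is, while
// replacements are applied, in order, to values read via InputField.
func applyMappings(row map[string]string, mappings []FieldMapping) (map[string]string, error) {
	out := make(map[string]string, len(mappings))
	for _, m := range mappings {
		switch {
		case m.FixedValue != nil:
			out[m.OutputField] = *m.FixedValue
		case m.InputField != nil:
			value := row[*m.InputField]
			for _, rr := range m.Replacements {
				re, err := regexp.Compile(rr.Pattern)
				if err != nil {
					return nil, fmt.Errorf("mapping for %q: %w", m.OutputField, err)
				}
				value = re.ReplaceAllString(value, rr.Replace)
			}
			out[m.OutputField] = value
		}
		// Mappings with neither source are skipped.
	}
	return out, nil
}

func main() {
	in, fixed := "name", "Example Studio"
	out, err := applyMappings(
		map[string]string{"name": "Scene A (HD)"},
		[]FieldMapping{
			{OutputField: "title", InputField: &in, Replacements: []RegexReplacement{{Pattern: `\s*\(HD\)$`, Replace: ""}}},
			{OutputField: "studio", FixedValue: &fixed},
		},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(out["title"], "/", out["studio"])
}
```

In a real implementation the patterns would probably be compiled once per import rather than once per row, and user-supplied patterns would need limits so plain users can't overload the server.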