Closed 0x000011b closed 1 year ago
Will try to get started on this today. I've never used SQL before, though, so someone else may need to handle the searching and fetching.
I've opened PR #8 to handle this, pending review. Because the function to grab personas from VNDB is not yet implemented, I'm keeping this task under "in progress" rather than moving it to "under review".
Very old issue, inactive - closing for now.
Summary
The scope of this task is to implement support for Visual Novel data in the `data-toolbox`, augmenting it with external information sourced from VNDB.
Source file formats
Each VN will consist of one or two files. Assuming `{title}` is the VN's title, there will be a mandatory `{title}.txt` file which contains the actual script text. Here's a made-up example:
The sequence of `===` characters separates episodes from each other.
The VN might optionally also have a `{title}.chars.json` file, where each key is the name of a character seen in the `.txt` file and each value is that character's VNDB character ID. An example:
Implementation details
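To make the two source file formats concrete, here is a minimal Python sketch of how a `{title}.txt` script and its optional `{title}.chars.json` companion might be read. It is only an illustration under assumptions: the `VNEpisode` container and the `split_episodes` and `load_vn` helpers are hypothetical names, not part of the toolbox, and the sketch assumes the separator appears on a line of its own and that the JSON values are strings.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Iterator

EPISODE_SEPARATOR = "==="  # separator between episodes, as described in the issue


@dataclass
class VNEpisode:
    """Hypothetical container for one episode; not a toolbox class."""
    title: str
    lines: list[str]
    # Maps character name -> VNDB character ID (assumed to be strings).
    characters: dict[str, str] = field(default_factory=dict)


def split_episodes(script_text: str) -> list[list[str]]:
    """Split raw script text into episodes on lines that are exactly `===`."""
    episodes: list[list[str]] = []
    current: list[str] = []
    for raw_line in script_text.splitlines():
        line = raw_line.strip()
        if line == EPISODE_SEPARATOR:
            if current:  # ignore empty episodes (e.g. a leading separator)
                episodes.append(current)
            current = []
        elif line:  # skip blank lines
            current.append(line)
    if current:
        episodes.append(current)
    return episodes


def load_vn(txt_path: Path) -> Iterator[VNEpisode]:
    """Yield the episodes of one VN, with character IDs when available."""
    title = txt_path.stem
    chars_path = txt_path.with_name(f"{title}.chars.json")
    characters: dict[str, str] = {}
    if chars_path.exists():  # the .chars.json file is optional
        characters = json.loads(chars_path.read_text(encoding="utf-8"))
    for lines in split_episodes(txt_path.read_text(encoding="utf-8")):
        yield VNEpisode(title=title, lines=lines, characters=characters)
```

A real `VisualNovelDataset` would presumably wrap logic like this in whatever interface the other datasets in the toolbox follow; that interface is not shown in this issue, so it is left out here.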
A `VisualNovelDataset` class should be implemented under `toolbox/datasets/visual_novels.py`, following the general format of the other datasets. It should yield individual episodes (i.e. sequences of dialog separated by the `===` lines), accompanied by the relevant characters if a matching `.chars.json` is found. Feel free to structure this however you think is best, but I recommend basing the implementation on one of the other datasets in that folder.
A `VisualNovelPDM` should then be implemented under `toolbox/modules/visual_novel_pdm.py`. Again, basing it on an existing PDM is likely a good call; I'd suggest looking at `LightPDM`. The catch here is that the generated `Episode`s should contain persona data whenever possible. This should be done by using the VNDB character IDs specified in the matching `.chars.json` file to look up character information in the VNDB databases. These are made available for download here.
What specific character data to include is still undecided; we can discuss this here or in the Matrix.
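As a rough illustration of the persona lookup step, the sketch below resolves a name-to-ID mapping (as loaded from a `.chars.json` file) against a SQL table. Everything about the database here is an assumption made for the example: the `chars` table, its `id`, `name` and `description` columns, the sample rows, and the `build_demo_db` and `fetch_personas` helpers are all hypothetical, and the actual VNDB dump layout may well differ. It uses an in-memory SQLite database purely as a stand-in, and treats a free-text description as the persona even though the choice of character data is still undecided.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for a character table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chars (id TEXT PRIMARY KEY, name TEXT, description TEXT)"
    )
    # Made-up rows for illustration only; not real VNDB data.
    conn.executemany(
        "INSERT INTO chars VALUES (?, ?, ?)",
        [
            ("c1", "Alice", "A cheerful student."),
            ("c2", "Bob", "A quiet librarian."),
        ],
    )
    return conn


def fetch_personas(
    conn: sqlite3.Connection, name_to_id: dict[str, str]
) -> dict[str, str]:
    """Map script character names to persona text, skipping unknown IDs."""
    if not name_to_id:
        return {}
    ids = list(set(name_to_id.values()))
    # One "?" placeholder per ID keeps the query parameterized.
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, description FROM chars WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = dict(rows)
    return {
        name: by_id[char_id]
        for name, char_id in name_to_id.items()
        if by_id.get(char_id)  # drop characters with no row or empty text
    }
```

Characters whose IDs are missing from the table are simply omitted, which matches the "whenever possible" wording: an episode can still be generated with partial or no persona data.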