seannyD / VideoGameDialogueCorpusPublic


Stardew Valley parsing is messy #28

Open imlabormitlea-code opened 2 months ago

imlabormitlea-code commented 2 months ago

Hi, thanks for providing this great resource! I am currently writing a bachelor thesis on the social networks of NPCs in video games. I was going to choose Stardew Valley for the social network analysis (SNA) because its NPCs are so interesting. However, I saw that you parsed some previously publicly available data, whereas I only have the game data (version 1.6). I ran into several problems with your parser. First, I got the following error:

PARSING Stardew Valley
Traceback (most recent call last):
  File "../VideoGameDialogueCorpusPublic/processing/parseRawData.py", line 119, in <module>
    out += parseMethod(folder+"raw/"+rawFile,pp)
  File "../VideoGameDialogueCorpusPublic/processing/parsers/StardewValleyParser.py", line 98, in parseFile
    out = parseDialogue(txt,fileName,parameters)
  File "../VideoGameDialogueCorpusPublic/processing/parsers/StardewValleyParser.py", line 36, in parseDialogue
    parts = [x for x in dialogue.split("#") if x.count("_")==0]
AttributeError: 'dict' object has no attribute 'split'
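
The traceback indicates that parseDialogue received a dict where it expected a string, which can happen when a converted YAML file nests a mapping under a key instead of holding a plain dialogue string. Below is a minimal sketch of a guard that walks such a structure and only hands string leaves to the splitting step. It assumes the converted files load as nested dicts and lists; `iter_dialogue_strings` and the sample data are hypothetical and not part of the repository's parser.

```python
from typing import Any, Iterator, Tuple

KeyPath = Tuple[str, ...]


def iter_dialogue_strings(node: Any, path: KeyPath = ()) -> Iterator[Tuple[KeyPath, str]]:
    """Yield (key_path, text) for every string leaf in a nested structure."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_dialogue_strings(value, path + (str(key),))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_dialogue_strings(value, path + (str(index),))
    # Other leaf types (numbers, booleans, None) are skipped.


# Hypothetical data shaped like a converted file with one nested entry.
sample = {
    "Intro": "Hello there.#Welcome to town.",
    "Nested": {"Name": "Example", "Lines": ["First line.", "Second line."]},
    "Count": 3,
}

for key_path, text in iter_dialogue_strings(sample):
    parts = text.split("#")  # safe: text is always a str here
    print("/".join(key_path), parts)
```

Whether nested entries such as those in Characters.xnb should be parsed at all or simply skipped is a separate decision; the sketch only avoids the crash.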

How to reproduce:

  1. Clone a fresh version of the repo.
  2. Create a 'raw' folder in the Stardew Valley folder.
  3. Copy 'ExtraDialogue.xnb', 'Characters.xnb', 'Events.xnb', 'Speechbubbles.xnb' from the local game data.
  4. Convert them to YAML via https://lybell-art.github.io/xnb-js/
  5. Run: python3 parseRawData.py ../data/StardewValley/StardewValley/

The problem was in Characters.xnb, which didn't seem to contain dialogue, so I excluded it. I then got further errors during post-processing:

PARSING Stardew Valley
    Post-processing ...
ERROR
You received 15 Parsnip Seeds!^^'Here's a little something to get you started.^-Mayor Lewis'
ERROR
. ^^Your maximum energy level has increased.
ERROR
Unop dunyuu doo pusutn snaus^Op hanp o toeday na doo smol^Vhu lonozol yenn huot olait tol
ERROR
Dear {0},^^If you're reading this, you must be in dire need of a change.^^The same thing happened to me, long ago. I'd lost sight of what mattered most in life... real connections with other people and nature. So I dropped everything and moved to the place I truly belong.^^^I've enclosed the deed to that place... my pride and joy: {1} Farm. It's located in Stardew Valley, on the southern coast. It's the perfect place to start your new life.^^This was my most precious gift of all, and now it's yours. I know you'll honor the family name, my boy. Good luck.^^Love, Grandpa^^P.S. If Lewis is still alive say hi to the old guy for me, will ya?
ERROR
Dear {0},^^If you're reading this, you must be in dire need of a change.^^The same thing happened to me, long ago. I'd lost sight of what mattered most in life... real connections with other people and nature. So I dropped everything and moved to the place I truly belong.^^^I've enclosed the deed to that place... my pride and joy: {1} Farm. It's located in Stardew Valley, on the southern coast. It's the perfect place to start your new life.^^This was my most precious gift of all, and now it's yours. I know you'll honor the family name, my dear. Good luck.^^Love, Grandpa^^P.S. If Lewis is still alive say hi to the old guy for me, will ya?
ERROR
The prismatic shard changes shape before your very eyes! This power is tremendous.^^ You've found the =Galaxy Sword= ^
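
Every string flagged in this log contains more than one ^. As far as the community modding wiki describes it, ^ separates a male and a female variant in NPC dialogue but acts as a line break in letters and similar on-screen text, so a parser that expects exactly two variants would trip over these strings. The following is a minimal sketch of a more tolerant interpretation, under that assumption; `interpret_carets` is a hypothetical helper, and the single-caret rule is a heuristic rather than the game's actual logic.

```python
import re
from typing import Dict


def interpret_carets(text: str) -> Dict[str, object]:
    """Classify a string by how it uses '^' instead of raising an error."""
    if "^" not in text:
        return {"kind": "plain", "text": text}
    left, _, right = text.partition("^")
    # Exactly one caret with text on both sides: treat as two variants.
    if text.count("^") == 1 and left.strip() and right.strip():
        return {"kind": "variants", "variants": [left.strip(), right.strip()]}
    # Otherwise treat every run of carets as a line or paragraph break.
    lines = [seg.strip() for seg in re.split(r"\^+", text) if seg.strip()]
    return {"kind": "multiline", "lines": lines}


# Strings taken from the log above, plus one invented two-variant example.
examples = [
    ". ^^Your maximum energy level has increased.",
    "You received 15 Parsnip Seeds!^^'Here's a little something to get you started.^-Mayor Lewis'",
    "Hi, sir.^Hi, ma'am.",  # hypothetical dialogue line
]
for example in examples:
    print(interpret_carets(example))
```

A letter that happens to contain a single ^ would be misread as two variants by this heuristic, so knowing which source file a string came from would be a more reliable signal than counting carets.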

These errors all involve strings that use the ^ character more than once, which seems to be a bigger problem. From the choice variation file I got the impression that there should be 144 choices, but I only got 15. There was no dialogue tree and no way to see the PC's possible answers.

I suspected this was because a lot of dialogue data was missing from the files. So I took all the .xnb files from the game data, converted them to YAML and ran the parser again. This gave me more text, but some of it was badly parsed, and some of it was not dialogue at all but other kinds of on-screen text. In particular, the heart events were badly parsed, and those are important to me. There was also no logic to handle the preconditions attached to certain NPC text. I ran into some minor errors with this approach as well, most of which I solved by excluding unimportant files or casting some variables.

I would love to hear your thoughts on these issues and whether you see any way to address them. If you have the game data you used somewhere, I would be very happy to use it for my thesis. Regards!
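
On the missing precondition handling: a minimal sketch of how event keys could be split into an ID and a list of preconditions, so that heart events can be tied to the NPC they require friendship with. It assumes the keys in the converted event files follow the slash-separated layout described on the community modding wiki (event ID first, then one precondition per segment, with `f` followed by name and friendship-point pairs). The example key is invented, version 1.6 may spell some preconditions differently, and `parse_event_key` and `friendship_requirements` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class EventKey(NamedTuple):
    event_id: str
    preconditions: List[List[str]]  # each precondition as [name, arg1, arg2, ...]


def parse_event_key(key: str) -> EventKey:
    """Split 'ID/cond a b/cond c' into the ID and tokenized preconditions."""
    event_id, *conditions = key.split("/")
    tokenized = [cond.split() for cond in conditions if cond.strip()]
    return EventKey(event_id.strip(), tokenized)


def friendship_requirements(event: EventKey) -> Dict[str, int]:
    """Collect NPC -> minimum friendship points from 'f' preconditions."""
    requirements: Dict[str, int] = {}
    for cond in event.preconditions:
        if cond[0] == "f":  # assumed friendship precondition
            args = cond[1:]
            # Arguments are assumed to come in (name, points) pairs.
            for name, points in zip(args[0::2], args[1::2]):
                requirements[name] = int(points)
    return requirements


# Invented key for illustration only.
event = parse_event_key("9001/f Abigail 500/t 900 1700")
print(event.event_id, friendship_requirements(event))
```

For a social network analysis, the NPC named in the friendship precondition together with the speakers in the event script could serve as one edge source, while preconditions the parser does not recognize stay available in `preconditions` for later inspection.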