RigsOfRods / rigs-of-rods

Main development repository for Rigs of Rods soft-body physics simulator
https://www.rigsofrods.org
GNU General Public License v3.0

Created unified text file tokenizer #2953

Closed: ohlidalp closed this 1 year ago

ohlidalp commented 1 year ago

We have multiple formats with very similar syntax: rig def (truck), odef, tobj, character (see #2942). This new parser supports all these formats and adds new features:

Update: I added AngelScript bindings and extended the bundled 'demo_script.as' to showcase them. There's a new button "View document" (changes to "Close document" after pressing) which opens a separate window with a syntax-highlighted truck file.

Update2: all glitches have been fixed. It's ready for review.

New script API:

To test, open the game console and enter loadscript demo_script.as.

ohlidalp commented 1 year ago

Looking at the serialization loop I wrote, I realize that the way you tell good code isn't by diving into the details, but by looking from a bird's-eye perspective where you can barely read the characters, then observing the overall shape and using just common sense to figure out what it probably does. If that journey is a pleasure and the conclusion you arrive at turns out to be pretty much correct, you've met good code. Yes, I've had a 🤓 moment, judge me all you want.

ohlidalp commented 1 year ago

Purpose: to be able to read, edit and export any file format (TRUCK DEF, ODEF, TOBJ, CHARACTER) directly in-game using AngelScript.

Estimate: Blocked by #2930. As soon as #2930 is done, this will take at most 1 man day to finish.

Work to be done:

CuriousMike56 commented 1 year ago

Viewing truck file of any 'complex' vehicle -> ~30 fps drop: [screenshot: RoR_2022-12-10_20-27-55]

No frame drops with the DAF Semi.

Also much of the truck file doesn't match the actual file: [screenshot: 2022-12-10_20-30-17]

ohlidalp commented 1 year ago

@CuriousMike56 Thanks for testing.

> Viewing truck file of any 'complex' vehicle -> ~30 fps drop:

Makes sense, it's because the script traverses the entire document every frame, so it's the combined overhead of Dear ImGui and string allocation by AngelScript. I'd have to optimize it like in the Console UI: artificially scale the scrollbar and only populate the visible part of the document. But I'd like to do it in another PR since it's just a demo.
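
For reference, the "only populate the visible part" idea can be sketched roughly as below. This is a language-neutral illustration in Python, not the Console UI code or the demo script: the function names and the fixed `line_height` are assumptions made for the example. The top and bottom padding stand in for the artificially scaled scrollbar.

```python
import math


def visible_line_range(scroll_y, viewport_height, line_height, total_lines):
    """Return (first, last_exclusive) indices of lines that intersect the viewport.

    Assumes every line has the same height (hypothetical simplification).
    """
    if total_lines <= 0 or line_height <= 0:
        return (0, 0)
    first = max(0, int(scroll_y // line_height))
    # Round up so a partially visible bottom line is still drawn.
    last = math.ceil((scroll_y + viewport_height) / line_height)
    last = min(total_lines, last)
    first = min(first, last)
    return (first, last)


def layout_visible(lines, scroll_y, viewport_height, line_height):
    """Return (top_padding, visible_lines, bottom_padding).

    The paddings are empty space that keeps the scrollbar sized as if the
    whole document were drawn, while only the visible slice is populated.
    """
    first, last = visible_line_range(
        scroll_y, viewport_height, line_height, len(lines)
    )
    top_padding = first * line_height
    bottom_padding = (len(lines) - last) * line_height
    return top_padding, lines[first:last], bottom_padding
```

With this approach the per-frame cost depends on the viewport size rather than on the document length, which is the property the complex vehicles are missing in the demo.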

> Also much of the truck file doesn't match the actual file:

Interesting, I see 3 problems:

ohlidalp commented 1 year ago

Truck definition format parsing is complete.

Note the option 'ALLOW_NAKED_STRINGS' must be on at all times, because otherwise the parser requires all strings to be in quotes. Also see these special cases:

  1. The first line is the actor name, with spaces and special characters, possibly including multiple quotes. I added an option 'FIRST_LINE_IS_TITLE' which parses it as one string.
  2. In forset, single node numbers are parsed as NUMBER. Node ranges are not valid numbers, so they decay to STRINGs.
  3. The axle and interaxle keywords have a quirky abc(1 2) syntax. I added an option 'PARENTHESES_CAPTURE_SPACES' which makes the whole expression parse as one string.
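
To make the three special cases concrete, here is a minimal sketch of how such options could behave. It is written in Python purely for illustration and is not the actual tokenizer: the `Options` fields mirror the option names above, but `tokenize`, `split_fields`, the separator set and the number pattern are all assumptions. Unquoted text is always accepted as a string, which corresponds to ALLOW_NAKED_STRINGS being on; quoted strings, comments and keywords are left out.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Options:
    # Hypothetical stand-ins for the options named in the comment above.
    first_line_is_title: bool = False
    parentheses_capture_spaces: bool = False


# Plain decimal numbers only; anything else decays to a string.
NUMBER_RE = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def classify(field):
    """A field is a NUMBER only if the whole field is numeric, e.g. '30-40' is not."""
    if NUMBER_RE.fullmatch(field):
        return ("NUMBER", float(field))
    return ("STRING", field)


def split_fields(line, opts):
    """Split on spaces, tabs and commas, optionally keeping '(...)' groups whole."""
    fields, current, depth = [], [], 0
    for ch in line:
        if ch == "(" and opts.parentheses_capture_spaces:
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        if ch in " \t," and depth == 0:
            if current:
                fields.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        fields.append("".join(current))
    return fields


def tokenize(text, opts=Options()):
    tokens = []
    for index, line in enumerate(text.splitlines()):
        if index == 0 and opts.first_line_is_title:
            # The whole first line becomes one string, quotes and all.
            tokens.append(("STRING", line.strip()))
            continue
        tokens.extend(classify(field) for field in split_fields(line, opts))
    return tokens


if __name__ == "__main__":
    sample = 'My "cool" truck (v2)\nforset 12, 30-40\nw1(1 2), w2(3 4)'
    opts = Options(first_line_is_title=True, parentheses_capture_spaces=True)
    for token in tokenize(sample, opts):
        print(token)
```

In this sketch the single node number 12 comes out as a NUMBER, the range 30-40 as a STRING, and each w1(1 2) style expression stays one STRING only when the parentheses option is enabled.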

ohlidalp commented 1 year ago

Originally I only planned to cover file formats which are similar to each other (truck, odef, tobj), but then I wanted the demo script to display TOBJ files too, without clogging the Terrain API with getFileWhatever() funcs. Instead, I wanted to use the GenericDocument to parse the TERRN2 file, get the TOBJ filenames from there and parse those as well. It turned out to be pretty straightforward: the tokenizer was already robust, so it could take a few extra options.
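
The TERRN2-to-TOBJ step could look roughly like the sketch below. It uses plain Python string handling as a stand-in and does not use GenericDocument or any script API. It assumes the TERRN2 file is INI-like and lists TOBJ files as keys in an [Objects] section; the function name and the sample file names are hypothetical.

```python
def tobj_filenames(terrn2_text):
    """Collect '*.tobj' keys from the [Objects] section of INI-like text.

    Assumption: entries look like 'some-file.tobj=' inside [Objects].
    """
    names = []
    section = None
    for raw_line in terrn2_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith((";", "#")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "Objects":
            key = line.split("=", 1)[0].strip()
            if key.lower().endswith(".tobj"):
                names.append(key)
    return names


if __name__ == "__main__":
    sample = """
[General]
Name = Example Terrain

[Objects]
example-objects.tobj=
example-roads.tobj=
"""
    for name in tobj_filenames(sample):
        print(name)
```

Each returned filename would then be handed to the same tokenizer with TOBJ-appropriate options.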