Mutagen-Modding / Mutagen

A .NET library for analyzing, creating, and manipulating Bethesda mods
GNU General Public License v3.0

ESP compiler/decompiler #281

Closed samdeane closed 1 year ago

samdeane commented 2 years ago

Just wanted to point you at this project.

It's not directly related, and I'm using my own code to parse ESPs.

However the idea of a compiler/decompiler is really independent of the ESP backend, and is something that would benefit from a bit of discussion and peer review. Is it worth pursuing? Does the proposed format make sense?

The ultimate goal is that an entire mod project (esp records, resources, and scripts) could be contained in a structured source directory, and tools could automatically package them up as BSAs and ESPs, including compiling the scripts.

Noggog commented 2 years ago

Hey! Sounds very similar to the thought processes I had when starting Mutagen. Way back in the day, I had planned to have conversion into json/xml format for the purposes of storing mods in a git repository be the introductory proof of concept project, before Synthesis and Analyzers cut in line and took over in the work queue.

I do think it's a worthwhile endeavor to get mods into source control. A big part of that would be two-way conversion of esps to a text format like json/xml/other. Being able to leverage Git would give developers a lot of power to experiment, branch, and potentially even merge work with other developers, which I imagine is very tough right now when the content is just stored and transferred as zip files.

Mutagen Fulfillment Musings

As I mentioned, I do/did plan on giving the concept a stab one of these days. As a short blurb on how I would go about it if the time came:

Mutagen exposes classes/interfaces like these:

public interface INpcGetter
{
    ITranslatedStringGetter? Name { get; }
    float Height { get; }
    float Width { get; }
    IVirtualMachineAdapterGetter? VirtualMachineAdapter { get; }
    IObjectBoundsGetter ObjectBounds { get; }
    // ...
}

The parsing needed to expose that interface is handled for the user. They just get an object with strongly typed fields to use, without any worry about how it's fulfilled. Notice, too, that concepts like FULL being associated with Name are abstracted away as esp format implementation details.
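As a rough illustration of that point, here is a minimal sketch of downstream code that only ever touches typed properties. `INpcLike`, `SampleNpc`, and `NpcReport` are hypothetical stand-ins written for this example, not Mutagen types, and a plain `string?` replaces `ITranslatedStringGetter?` to keep it self-contained.

```csharp
#nullable enable
using System;
using System.Globalization;

// Downstream code: no knowledge of subrecords or binary layout is needed.
INpcLike npc = new SampleNpc("Lydia", 1.0f);
Console.WriteLine(NpcReport.Describe(npc));

// Hypothetical stand-in for a getter interface such as INpcGetter.
public interface INpcLike
{
    string? Name { get; }   // in the binary format this lives in the FULL subrecord
    float Height { get; }
}

// Hypothetical in-memory implementation; a real library would fill this from parsed esp data.
public sealed record SampleNpc(string? Name, float Height) : INpcLike;

public static class NpcReport
{
    // Formats a one-line summary using only the typed properties.
    public static string Describe(INpcLike npc) =>
        (npc.Name ?? "<unnamed>") + " (height "
        + npc.Height.ToString("0.00", CultureInfo.InvariantCulture) + ")";
}
```

The caller would look the same whether the object came from a binary esp or from a text format, which is the property a compiler/decompiler would rely on.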

Obviously we wouldn't want to hand-write the conversion from these interfaces to json, so either:

Some other items to consider that Mutagen brings to the table for a project like this:

FormID Collisions

The talk so far has been about converting json<->esp, but I think there are a few more aspects to consider.

What happens when modder A adds 10 records and modder B adds 10 records? They each convert to json and commit into Git, then push up and merge to master. The merge might go cleanly according to Git as far as the text file goes (two sections of lines added without conflict), but now the esp will have duplicate FormIDs, as both sides probably allocated the same IDs to their new records.

As such, a tool that empowers mods to be stored in Git might also want to offer a Merge helper tool. This would essentially notice that the mods on branch A and branch B have FormID conflicts. It could then load in either A or B, and remap the conflicting FormIDs to new values that haven't been taken. This might be powered by the mod loading/inspection/remapping/persistence tools mentioned earlier.
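A minimal sketch of that remapping step, under heavy simplifying assumptions: FormIDs are plain `uint` values, each record is reduced to an EditorID string, and `FormIdMerge` is a hypothetical helper invented for this example rather than anything in Mutagen. A real tool would also have to rewrite every reference that points at a remapped record.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Both branches started from the same mod and each allocated the next free ID.
var branchA = new Dictionary<uint, string> { [0x800] = "BaseSword", [0x801] = "A_NewHelmet" };
var branchB = new Dictionary<uint, string> { [0x800] = "BaseSword", [0x801] = "B_NewShield" };

var (merged, remapped) = FormIdMerge.Merge(branchA, branchB);
foreach (var (oldId, newId) in remapped)
    Console.WriteLine($"remapped {oldId:X6} -> {newId:X6}");   // prints "remapped 000801 -> 000802"

public static class FormIdMerge
{
    // Keeps every record from A. A record from B whose ID is already used by a
    // different record is moved to the next ID that neither branch has taken.
    public static (SortedDictionary<uint, string> Merged, Dictionary<uint, uint> Remapped)
        Merge(Dictionary<uint, string> a, Dictionary<uint, string> b)
    {
        var merged = new SortedDictionary<uint, string>(a);
        var remapped = new Dictionary<uint, uint>();
        uint next = a.Keys.Concat(b.Keys).DefaultIfEmpty().Max() + 1;

        foreach (var (id, editorId) in b.OrderBy(p => p.Key))
        {
            if (!merged.TryGetValue(id, out var existing))
            {
                merged[id] = editorId;          // ID unused so far, keep it
            }
            else if (existing != editorId)
            {
                remapped[id] = next;            // collision: give B's record a fresh ID
                merged[next++] = editorId;
            }
            // Same ID and same record: shared base content, nothing to do.
        }
        return (merged, remapped);
    }
}
```

Comparing by EditorID is only a convenient way to tell "same record" from "different record" in this sketch; an actual merge helper would more likely compare against the common ancestor commit.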

There are probably a good many other similar "additional" checks/features that a tool like this might want to offer.

Trailing Blabber

I'm excited someone is interested in these concepts! I think it would be a large boon to the community to get a tool like this off the ground.

All I'll say on that front is that I did develop Mutagen with a lot of care towards not just making it contain features that suit my personal project needs, but attempting to make it a generic and accessible and powerful baseline for downstream projects of all kinds. The hope was that people could focus on cool downstream projects and make a quicker impact with a modern dev ecosystem without reinventing the wheel for the low level gritty details.

As you mentioned, serialization/processing isn't owned by anybody. Several libraries already exist (xEdit/SkyProc/Esper/I'm sure others). Mutagen is just another. If you love Swift and want to develop a tool on top of that, that's awesome as well. 8)

As far as my feedback on the end goal idea: I think it's super interesting, important, and likely doable one way or another.

samdeane commented 2 years ago

Thanks a lot for engaging! Likewise it's nice to know that I'm not the only one thinking about this stuff...

I think my SwiftESP library is analogous to Mutagen in that it aims to provide strongly typed data structures / language bindings for each of the record/field types. It uses Swift's native serialisation mechanisms to read/write these in binary format*.

(*It's a shame I guess that I'm writing in Swift, which is a bit esoteric from the point of view of the modding community. I'd be comfortable in C# or anything else, but Swift is what I do day in day out, and I'd rather focus my efforts on doing something, not on learning another ecosystem :).)

The conversion to/from JSON (or whatever I settle on) is conceptually a layer above SwiftESP, and part of the compiler itself. This is where in theory Mutagen would work just as well:

[Image: Compiler]

Currently the JSON conversion actually lives in SwiftESP, but it will be refactored. It also uses Swift's native serialisation mechanism to read/write the JSON, and like you I'm aiming to optimise the text version to be as compact as possible; simplifying away a lot of the gnarly legacy details of the binary format, and wherever possible providing default values so that you only have to explicitly define properties that differ from the default.
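The "only write what differs from the default" idea can be sketched as follows. This is an illustration in C# using `System.Text.Json` and reflection, not how SwiftESP does it; `NpcText`, its default values, and `SparseJson` are all hypothetical names made up for the example.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

var npc = new NpcText { Name = "Lydia", Height = 1.05f };
Console.WriteLine(SparseJson.Write(npc));   // Weight is still the default, so it is left out

// Hypothetical text-side record; the property initializers act as the defaults.
public sealed class NpcText
{
    public string Name { get; set; } = "";
    public float Height { get; set; } = 1.0f;
    public float Weight { get; set; } = 50f;
}

public static class SparseJson
{
    // Emits only the properties whose value differs from a freshly constructed instance.
    public static string Write<T>(T value) where T : class, new()
    {
        var defaults = new T();
        // Sorted keys keep the output stable, which helps with diffs in source control.
        var changed = new SortedDictionary<string, object?>(StringComparer.Ordinal);
        foreach (var prop in typeof(T).GetProperties())
        {
            var current = prop.GetValue(value);
            if (!object.Equals(current, prop.GetValue(defaults)))
                changed[prop.Name] = current;
        }
        return JsonSerializer.Serialize(changed);
    }

    // Starts from the defaults and overlays whatever the text file specifies.
    public static T Read<T>(string json) where T : class, new()
    {
        var result = new T();
        using var doc = JsonDocument.Parse(json);
        foreach (var field in doc.RootElement.EnumerateObject())
        {
            var prop = typeof(T).GetProperty(field.Name);
            if (prop is null) continue;     // unknown keys are ignored in this sketch
            prop.SetValue(result, field.Value.Deserialize(prop.PropertyType));
        }
        return result;
    }
}
```

One consequence of this design is that changing a default later silently changes the meaning of every file that omitted the property, so a real format would probably want to version its defaults.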

Most of the other features of Mutagen you mentioned are things I'd also like to add. Some logically live in SwiftESP, some in the compiler, but where possible I'd like to keep most of the linting/correctness style stuff at a higher level, which only operates on the text format. It would help to decouple everything and reduce the scope of those tools to something manageable:

[Image: Ecosystem]

In my mind all of this is a step towards a more complete "project" format which would encompass esp records, 3d resources, and scripts, and would build out to ESP/BSA/PEX etc.

[Image: Project]

You could imagine that sort of tool also having integrations into VSCode, so you can literally just hit Shift-B and have the whole chain kick off and spit out the built mod.

samdeane commented 2 years ago

> but where possible I'd like to keep most of the linting/correctness style stuff at a higher level

Having said that, of course, many of those tools require a rich understanding of the data graph, which is what Mutagen/SwiftESP gives access to, so maybe I'm barking up the wrong tree a bit there.

I'd still like to decouple them though from the main parser. Small and modular FTW...

samdeane commented 2 years ago

I hadn't looked at Synthesis and Analyzers - they both look interesting too.

On the topic of handling mod patches and collisions, and other things of that ilk, have you seen this? I came across it when looking for similar efforts to (de)compile ESPs, but it looks to be focused more on being a better way to combine small edits. Which is a worthy goal in itself, and also feels like it could be built on top of a common file format / framework.

samdeane commented 2 years ago

One of the things I struggle with is figuring out where to talk about this stuff. Discord is great, but the ability of everyone to make their own space means we've got potentially hundreds of little atomised communities.

I feel like it would be really useful if there was one place to go where you might be talking to an audience that included authors of key tools like xEdit, Bodyslide/Outfit Studio, FNIS/Nemesis, key enabling technologies like SKSE and the various body/skin frameworks, major mods, and also at least some of the main creators of clothing/armour and content. Also people working on stuff like OpenMW and Skyrim Platform.

Does anywhere like that exist?

It's clear that a bunch of people will have thought about this stuff over the years, and I'd like to tap into their wisdom. Even if we all end up doing different implementations, agreeing on the essential features that a text representation would need (if not an actual format itself) would be a valuable exercise in its own right.

Noggog commented 1 year ago

Sorry I didn't respond. I imagine I got distracted with other stuff, haha. Yeah, it's always hard to find the right community spaces that overlap in the right ways. I don't know of a single unified place either.

Anyway, I'm going to close this issue, as it's not something that needs work to be done within the Mutagen project. Feel free to swing by the Mutagen discord if you want to chat in a more back and forth manner. Cheers!

samdeane commented 1 year ago

No problem - I got distracted too 🙃. I am still theoretically interested in it all and might pick up again one day. Partly I'm waiting to see if Starfield advances the state of the art at all (though I am not holding my breath).