AnvilOnline / Anvil

Main project planning and layout, general topics about the project should be discussed here, no code should be pushed here.

Apex planning #2

Open kiwidoggie opened 8 years ago

kiwidoggie commented 8 years ago

Brain dump here. There are no wrong ideas; we will form our build and design around the requirements discussed here.

kiwidoggie commented 8 years ago

There should be a reasonable BSP viewer along with object viewer/editor. I personally want this to be a "what you see is what you get" style editor. Then that will pretty much remove the need for forge mode, even though it would be nice to have both :dancer:

kiwidoggie commented 8 years ago

I think for in-development mods there should be a git-style revision system. Not as complicated, but to at least track changes in a tag over time if needed.
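
One rough way to picture the "track changes in a tag over time" idea is a per-tag list of snapshots. The sketch below is only an illustration in C++; `TagRevision` and `TagHistory` are made-up names, and it stores full copies rather than deltas to keep things simple.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-tag history: every commit keeps a full snapshot plus a message.
struct TagRevision {
    std::string message;
    std::vector<std::uint8_t> data;
};

class TagHistory {
public:
    // Record a new snapshot and return its 0-based revision number.
    std::size_t Commit(std::vector<std::uint8_t> data, std::string message) {
        revisions_.push_back({std::move(message), std::move(data)});
        return revisions_.size() - 1;
    }

    // Fetch an older snapshot, e.g. to revert to it or compare against it.
    const TagRevision& At(std::size_t revision) const {
        assert(revision < revisions_.size());
        return revisions_[revision];
    }

    std::size_t Count() const { return revisions_.size(); }

private:
    std::vector<TagRevision> revisions_;
};
```

Storing deltas instead of full snapshots would save space, but that is a later optimization and not something the thread has decided on.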

camden-smallwood-zz commented 8 years ago

I think the editor should avoid editing cache/vanilla data at all costs. We should implement a virtual file system using the concept of tag files as a base, maybe adding editor functionality to create new objects based on existing tag files or vanilla cache data.
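
To make the "never touch vanilla data" idea concrete, here is a minimal sketch of an overlay, assuming tags can be treated as named byte blobs. `TagOverlay` and its methods are hypothetical names for illustration, not an existing Anvil API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using TagData = std::vector<std::uint8_t>;

// Hypothetical overlay: reads fall through to the read-only vanilla set,
// writes only ever land in the mod layer.
class TagOverlay {
public:
    explicit TagOverlay(std::map<std::string, TagData> vanilla)
        : vanilla_(std::move(vanilla)) {}

    // Store an edited or brand-new tag in the mod layer.
    void Write(const std::string& path, TagData data) {
        assert(!path.empty());
        mod_.insert_or_assign(path, std::move(data));
    }

    // The mod layer wins; otherwise fall back to the vanilla copy.
    std::optional<TagData> Read(const std::string& path) const {
        if (auto it = mod_.find(path); it != mod_.end()) return it->second;
        if (auto it = vanilla_.find(path); it != vanilla_.end()) return it->second;
        return std::nullopt;
    }

    // Create a new tag seeded from an existing tag file or vanilla tag.
    bool CloneFrom(const std::string& source, const std::string& dest) {
        std::optional<TagData> data = Read(source);
        if (!data) return false;
        Write(dest, std::move(*data));
        return true;
    }

    bool IsModified(const std::string& path) const { return mod_.count(path) != 0; }

private:
    const std::map<std::string, TagData> vanilla_; // never written after construction
    std::map<std::string, TagData> mod_;
};
```

Because the vanilla map is `const`, an edit can only shadow an original tag, never overwrite it, which is the property being asked for.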

camden-smallwood-zz commented 8 years ago

If we plan on using simple scripts for the approach of modding vanilla tags, we could add the ability to export a script based on the changes made to a cache/vanilla tag from the editor. This would be a simple approach, as it would not require modifying the data permanently and could be synced (in some way) at each match start.
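
As a rough illustration of exporting a script from editor changes, the sketch below diffs an edited tag against its vanilla bytes and emits one line per changed run. The `set <tag> <offset> <hex bytes>` syntax is invented for this example; no script format has been agreed on in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Emits one line per run of changed bytes: "set <tag> 0x<offset> <hex bytes>".
// The command name and layout are placeholders for whatever format gets chosen.
std::vector<std::string> ExportChanges(const std::string& tag,
                                       const std::vector<std::uint8_t>& vanilla,
                                       const std::vector<std::uint8_t>& edited) {
    assert(vanilla.size() == edited.size()); // sketch only handles in-place edits
    std::vector<std::string> script;
    for (std::size_t i = 0; i < edited.size();) {
        if (vanilla[i] == edited[i]) { ++i; continue; }
        char buf[32];
        std::snprintf(buf, sizeof buf, "%zX", i);
        std::string line = "set " + tag + " 0x" + buf + " ";
        // Append every consecutive changed byte to the same line.
        for (; i < edited.size() && vanilla[i] != edited[i]; ++i) {
            std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(edited[i]));
            line += buf;
        }
        script.push_back(line);
    }
    return script;
}
```

Replaying those lines at match start would leave the cache itself untouched, which is the point of the approach described above.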

camden-smallwood-zz commented 8 years ago

We should include straightforward facilities for asset compilation and manipulation: Bitmaps, Models (hlmt/mode/coll/phmo), Animations (jmad), Sounds, Structures (bsp/collision/lightmaps)

Ernegien commented 8 years ago

I absolutely despise using file-relative offsets for anything when dealing with memory, because I end up having to manually convert to and from them while researching, and it can kill productivity. A better approach, in my opinion, would be to use image addresses (as they appear in IDA) in the code-base, then translate them to virtual memory addresses behind the scenes via a utility class.

See the following link for my ideal implementation in C#; an object that gets initialized with a static image address, and then becomes a process address for all further access.

http://pastebin.com/NezyDSj7

The major downside that I can see with this particular implementation is that it needs to maintain global state (the image and process base addresses) in order to perform the translation. I think that's worth it: the alternative is passing that info into a utility function on every access, instead of treating the address like a regular value via operator overloads.
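
For readers who can't open the pastebin, here is a guess at what the same idea could look like in C++. It is not a port of the linked C# code; `ImageAddress`, `SetBases`, and `Process` are hypothetical names, and the base addresses in the usage note are arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper: constructed from an address as it appears in the
// disassembler, then usable as an address in the running process.
class ImageAddress {
public:
    // Global state, set once at start-up (the downside mentioned above).
    static void SetBases(std::uintptr_t imageBase, std::uintptr_t processBase) {
        s_imageBase = imageBase;
        s_processBase = processBase;
    }

    explicit ImageAddress(std::uintptr_t imageAddress) : m_image(imageAddress) {
        assert(imageAddress >= s_imageBase);
    }

    // Translate from image address to process address on access.
    std::uintptr_t Process() const { return m_image - s_imageBase + s_processBase; }

    // Explicit so that `addr + n` unambiguously picks the member operator below.
    explicit operator std::uintptr_t() const { return Process(); }

    // Offsetting stays in image space, so results are still translated lazily.
    ImageAddress operator+(std::uintptr_t offset) const {
        return ImageAddress(m_image + offset);
    }

private:
    inline static std::uintptr_t s_imageBase = 0;
    inline static std::uintptr_t s_processBase = 0;
    std::uintptr_t m_image;
};
```

With `ImageAddress::SetBases(0x400000, 0x7FF60000)`, `ImageAddress(0x401234).Process()` yields `0x7FF61234`, so call sites can keep quoting the addresses they see in the disassembler.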

Ernegien commented 8 years ago

For things that we can't easily pattern match on, we'll need a generic container to hold version-specific information. In its simplest form, I've come up with the following for my research test-bed in C#.

http://pastebin.com/j6DFidxw

The selling point in this implementation is the ability to pass in the currently-loaded game version during creation and keying off of that by default, which reads cleaner when accessed elsewhere.
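
A C++ take on that container might look roughly like the sketch below. It is an assumption about the shape of the idea, not a translation of the pastebin; `Versioned`, `Add`, and `Get` are hypothetical names, and the version strings in the usage note are placeholders.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical container: one value per game version, keyed off the
// currently-loaded version by default.
template <typename T>
class Versioned {
public:
    explicit Versioned(std::string currentVersion) : current_(std::move(currentVersion)) {
        assert(!current_.empty());
    }

    // Register the value for one version; returns *this so calls can be chained.
    Versioned& Add(const std::string& version, T value) {
        values_.insert_or_assign(version, std::move(value));
        return *this;
    }

    // Value for the loaded version; throws if none was registered.
    const T& Get() const { return Get(current_); }

    // Value for an explicitly named version.
    const T& Get(const std::string& version) const {
        auto it = values_.find(version);
        if (it == values_.end())
            throw std::out_of_range("no value registered for version " + version);
        return it->second;
    }

private:
    std::string current_;
    std::map<std::string, T> values_;
};
```

Usage would read something like `Versioned<unsigned> offset("build-B"); offset.Add("build-A", 0x1000).Add("build-B", 0x2000);`, after which `offset.Get()` returns the value for the loaded build without the caller naming it.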

Ernegien commented 8 years ago

It looks like the pattern-matching code is already pretty robust, but there's always room for improvement, and these suggestions below should handle the remaining 5% (bullshit statistic ^_^) use-cases that yours currently does not from what I can tell so far.

We'll need the ability to specify a mask byte array instead of 'x' characters so we can work with data at the bit level. This will allow matching opcodes that have register info combined with them, like "push ebx", which is represented by a single byte of 0x53 (push 0x50 | ebx 3). In the search pattern we could then pass a mask byte of 0xF8 to match a push instruction regardless of which register it actually uses (information that is likely to change between builds). Also, where masking isn't needed, we shouldn't make the caller pass a mask in anyway, as that just adds code clutter. We could substitute 0xFF for the mask on the fly during each byte's value comparison internally.

// where pattern and patternMask are byte[] arguments, and i is an offset relative to the pattern you're scanning for
byte maskByte = patternMask?[i] ?? byte.MaxValue;
if ((pattern[i] & maskByte) == (actual[i] & maskByte))
{
    // match!
}
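
Since the target code-base is C++, the same bit-masked comparison wrapped in a complete forward scan could look something like this. `MatchesAt`, `FindPattern`, and `kNotFound` are names chosen for the sketch and are not the existing Anvil signatures; an empty mask stands in for "compare every bit".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
constexpr std::size_t kNotFound = static_cast<std::size_t>(-1);

// True when `pattern` matches `data` at `offset` under the optional bit mask.
bool MatchesAt(const Bytes& data, std::size_t offset,
               const Bytes& pattern, const Bytes& mask = {}) {
    assert(mask.empty() || mask.size() == pattern.size());
    if (offset > data.size() || pattern.size() > data.size() - offset) return false;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        // No mask supplied: substitute 0xFF so every bit must match.
        const std::uint8_t m = mask.empty() ? 0xFF : mask[i];
        if ((pattern[i] & m) != (data[offset + i] & m)) return false;
    }
    return true;
}

// Forward scan; returns the offset of the first match or kNotFound.
std::size_t FindPattern(const Bytes& data, const Bytes& pattern, const Bytes& mask = {}) {
    if (pattern.empty() || pattern.size() > data.size()) return kNotFound;
    for (std::size_t at = 0; at + pattern.size() <= data.size(); ++at)
        if (MatchesAt(data, at, pattern, mask)) return at;
    return kNotFound;
}
```

For example, the pattern `8B EC 53` with mask `FF FF F8` matches `8B EC 56` (push esi) as well as `8B EC 53` (push ebx), because only the top five bits of the last byte are compared.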

There will be cases where a scan direction flag will become extremely useful. For instance, say I want to hook an entire function. I might be able to easily match inside of it, but don't know how to reliably get the entry point address due to the beginning portion varying too much between versions. I'll want to do a reverse scan from the match result mid-function to the prologue's pattern to get the function's base address, coupled with a max scan distance for optimization purposes so it's not wasting time scanning way past what it should in case it can't find what it's looking for.
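
A reverse scan with a distance cap could be sketched as below. It assumes an exact (unmasked) prologue pattern to stay short; `FindPatternReverse` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
constexpr std::size_t kNotFound = static_cast<std::size_t>(-1);

// Walks backwards from `from` (typically a mid-function match) looking for
// `pattern`, and gives up once it has moved back more than `maxDistance` bytes.
std::size_t FindPatternReverse(const Bytes& data, std::size_t from,
                               const Bytes& pattern, std::size_t maxDistance) {
    assert(from <= data.size());
    if (pattern.empty()) return kNotFound;
    const std::size_t lowest = from > maxDistance ? from - maxDistance : 0;
    // `at` visits from, from - 1, ..., lowest without underflowing.
    for (std::size_t at = from + 1; at-- > lowest;) {
        if (at + pattern.size() <= data.size() &&
            std::equal(pattern.begin(), pattern.end(),
                       data.begin() + static_cast<std::ptrdiff_t>(at)))
            return at;
    }
    return kNotFound;
}
```

Given a match somewhere inside a function, calling this with the prologue bytes (for instance `55 8B EC`) returns the nearest preceding candidate for the entry point, or `kNotFound` once the cap is hit.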

A range argument with min/max bounds would also be helpful. Specifying a list of ranges for multi-region scanning might also be useful, unless you want the caller to deal with this logic instead through multiple FindPattern calls.
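
If the scanner rather than the caller handles multi-region logic, one possible shape is a list of half-open ranges tried in order. `Range` and `FindPatternInRanges` are illustrative names only, and the sketch uses exact matching for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
constexpr std::size_t kNotFound = static_cast<std::size_t>(-1);

struct Range {
    std::size_t begin; // inclusive
    std::size_t end;   // exclusive
};

// Scans each range in order and returns the first match that lies entirely
// inside one of them, or kNotFound.
std::size_t FindPatternInRanges(const Bytes& data, const Bytes& pattern,
                                const std::vector<Range>& ranges) {
    if (pattern.empty()) return kNotFound;
    for (const Range& r : ranges) {
        assert(r.begin <= r.end);
        const std::size_t end = std::min(r.end, data.size()); // clamp to the buffer
        if (r.begin >= end) continue;
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(r.begin);
        const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
        const auto hit = std::search(first, last, pattern.begin(), pattern.end());
        if (hit != last) return static_cast<std::size_t>(hit - data.begin());
    }
    return kNotFound;
}
```

A single min/max bound is then just a one-element list, so both variants suggested above can share one entry point.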

Determining where to scan can be optimized a bit using a flags argument for predefined types of game memory. We know (or can easily learn) where the code/tags/globals/resources get loaded etc. There's no need to scan all of that in most cases if we just map them out on init and then step around them while scanning.
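
The flags idea might be expressed as a region map built once on init and then filtered per scan. The region kinds below simply mirror the code/tags/globals/resources list from the paragraph above; the type names and the addresses in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical kinds of game memory, usable as bit flags.
enum RegionKind : unsigned {
    kCode      = 1u << 0,
    kTags      = 1u << 1,
    kGlobals   = 1u << 2,
    kResources = 1u << 3,
};

struct Region {
    RegionKind  kind;
    std::size_t begin; // inclusive
    std::size_t end;   // exclusive
};

// Returns only the regions whose kind is set in `wanted`, preserving map order.
std::vector<Region> SelectRegions(const std::vector<Region>& map, unsigned wanted) {
    std::vector<Region> selected;
    std::copy_if(map.begin(), map.end(), std::back_inserter(selected),
                 [wanted](const Region& r) {
                     assert(r.begin <= r.end);
                     return (static_cast<unsigned>(r.kind) & wanted) != 0;
                 });
    return selected;
}
```

A call such as `SelectRegions(map, kCode | kGlobals)` yields just the spans worth scanning for that pattern, which can then be handed to whatever ranged scan exists.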

Having some containers to chain multiple FindPattern/PatchX calls as an ordered list of steps might also be useful eventually, similar to what emoose did in the recode with patch sets.
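
Without knowing the details of emoose's patch sets, here is one hedged guess at an ordered-steps container: resolve every step first, then write only if all of them were found. `PatchStep` and `ApplyPatchSet` are invented names, and all patterns are resolved against the unpatched bytes, so a later step cannot depend on an earlier write.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// One hypothetical step: find `pattern`, then overwrite bytes starting
// `offsetFromMatch` bytes after the match with `replacement`.
struct PatchStep {
    Bytes       pattern;
    std::size_t offsetFromMatch;
    Bytes       replacement;
};

// All-or-nothing: returns false and leaves `memory` untouched if any step
// cannot be resolved or would write out of bounds.
bool ApplyPatchSet(Bytes& memory, const std::vector<PatchStep>& steps) {
    std::vector<std::size_t> targets;
    for (const PatchStep& s : steps) {
        assert(!s.pattern.empty());
        const auto hit = std::search(memory.begin(), memory.end(),
                                     s.pattern.begin(), s.pattern.end());
        if (hit == memory.end()) return false;
        const std::size_t target =
            static_cast<std::size_t>(hit - memory.begin()) + s.offsetFromMatch;
        if (target > memory.size() || s.replacement.size() > memory.size() - target)
            return false;
        targets.push_back(target);
    }
    // Every step resolved, so it is now safe to write them in order.
    for (std::size_t i = 0; i < steps.size(); ++i)
        std::copy(steps[i].replacement.begin(), steps[i].replacement.end(),
                  memory.begin() + static_cast<std::ptrdiff_t>(targets[i]));
    return true;
}
```

Resolving before writing means a half-matching build never ends up partially patched, which seems like the main benefit of grouping steps at all.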

A built-in disassembler that could be consumed via standard regex would also be amazing (reaching pretty high here). We'd want to use it sparingly, as a last-ditch effort, for obvious performance/complexity reasons. Hopefully it is never needed, however. Multiple FindPattern calls relative to each other will be the likely substitute for this.

Ernegien commented 8 years ago

Rather than have a bunch of FindPattern calls scattered throughout, it would probably be best to batch these together as much as possible on init so we're not scanning through the same memory multiple times. We'd need a scanner class/utility function capable of processing multiple search patterns simultaneously. For things we know to be static we could even automate the creation of C++ code representing the version-specific containers referenced above. For example, dump these search patterns into an intermediate file and have some generator shit out a bunch of C++ code as a build step.
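
A naive version of the batched scanner walks memory once and tests every still-unresolved pattern at each offset. The sketch below assumes exact patterns and first-match semantics; `FindAllPatterns` is a hypothetical name, and a real implementation might prefer a proper multi-pattern algorithm such as Aho-Corasick.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
constexpr std::size_t kNotFound = static_cast<std::size_t>(-1);

// Single pass over `data`; results[i] is the first offset where patterns[i]
// matches, or kNotFound. Stops early once every pattern has been resolved.
std::vector<std::size_t> FindAllPatterns(const Bytes& data,
                                         const std::vector<Bytes>& patterns) {
    std::vector<std::size_t> results(patterns.size(), kNotFound);
    std::size_t remaining = patterns.size();
    for (std::size_t at = 0; at < data.size() && remaining > 0; ++at) {
        for (std::size_t p = 0; p < patterns.size(); ++p) {
            const Bytes& pat = patterns[p];
            assert(!pat.empty()); // empty patterns are treated as a caller error
            if (results[p] != kNotFound || pat.size() > data.size() - at) continue;
            if (std::equal(pat.begin(), pat.end(),
                           data.begin() + static_cast<std::ptrdiff_t>(at))) {
                results[p] = at;
                --remaining;
            }
        }
    }
    return results;
}
```

The results vector lines up index-for-index with the input patterns, so it could be dumped straight into the intermediate file that a code generator would consume.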