Allow acessing DynamicSprite data as array

adventuregamestudio / ags

AGS editor and engine source code

Other

695 stars 159 forks source link

Allow acessing DynamicSprite data as array #1997

Open ericoporto opened 1 year ago

ericoporto commented 1 year ago

Describe the problem I have previously had the need to access data of individual pixels in sprites in AGS, either because images are convenient when storing information (I rented a boat Height Maps or Galleys maps), or because I wanted some particular effect (normal maps). GetPixel and SetPixel methods are unfortunately slow for this - even if they didn't had boundary checks or vtables, it's still a per pixel call to an "external" function, so there will be some overhead - so you can only do few thousand (~2048) GetPixel per frame, without causing slowdowns, requiring one to divide the load. Writing pixel information is even slower. Pixel manipulation could also produce graphic effects (a better distfx)!

Suggested change It would be nice to retrieve the bitmap data from inside the DynamicSprite (or Surface?) directly as an array. Though, in AGS, dynamic arrays must have headers, and cannot point to any random memory.

If we want to have only reads, maybe as a copy, optionally copying them back on "Release" call, or some kind of SetPixelsBack call.
- this may be easier to code
if we want it to be read and written, and written fast, the array could be the data itself, with shared ownership.
- it is possible to avoid extra dereferencing, and possible to give out raw pointer to script vm too, as dynamic objects are already stored in pairs of data address / manager, and manager can be an object itself.
- there will be a little extra memory for allocation
- there may be ways of optimizing this by using memory pools or allocating dynamicobject class and array data on the same buffer.

The API for this could be, depending on either convenience or going for the more generic solution, something like

import char[] GetPixelArray(); // if we are talking about 32-bit color, this retrieves per color component (4 * w * h)
import short[] GetPixelArray16(); // classic ags color is 16-bit
import int[] GetPixelArray32();

I talked previously about t his with @ivan-mogilko , I hope I added the things mentioned correctly! It's possibly necessary to observe #1980 when implementing this.

ivan-mogilko commented 1 year ago

To clarify about DynamicArray a bit; I thought it's possible to make it wrap any random memory, and mark as "shared", so that it won't delete that data on disposal. The issue with DynamicArray currently is that it allocates meta data along with the elements on a single buffer. So it goes like [HEADER][ELEMENTS] in memory.

If we want it to be able to "wrap" another memory with it, it will have to be remade into a struct similar to the current ScriptUserObject, where meta data is one struct, and element container is a separately allocated buffer.

Technically, there is a way to avoid having to do extra dereferencing when reading and writing this mem, as we may register address to elements data as a "managed object" and DynamicArray struct as a "manager" pointer, this way we will have an immediate access to both.

The fact that there will be extra allocation still stands, but, as mentioned in the issue's description, there are potential solutions to optimize that too, if it is thought to be necessary. (Same could be done for ScriptUserObject struct, perhaps)

By the way, if this is done, then we may use same dynamic array wrap for any similar purposes. What first comes into head is reading/editing sound samples, for instance (generating sound waves in script).

ivan-mogilko commented 1 year ago

Crude experiment: https://github.com/ivan-mogilko/ags-refactoring/tree/experiment--scapi-pixelarray2

Script example:

function on_key_press(eKeyCode k)
{
    if (k == eKeySpace)
    {
        DrawingSurface *ds = Room.GetDrawingSurfaceForBackground();
        int pixels[] = ds.GetPixelArray32();
        for (int y = 0; y < ds.Height; y++)
        {
            for (int x = 0; x < ds.Width; x++)
            {
//                pixels[x + y * ds.Width] = Game.GetColorFromRGB(Random(255), Random(255), Random(255));
                pixels[x + y * ds.Width] = ((Random(255) & 0xFF) << 16) | ((Random(255) & 0xFF) << 8) | ((Random(255) & 0xFF));
            }
        }
        ds.Release();
    }
}

ISSUES: DrawingSurface is marked as "modified" upon GetPixels call, as it cannot know what you are going to do with it. If AGS syntax supported "const array", we could have 2 separate functions for getting mutable and immutable array.

This whole thing is actually unsafe, because you may keep array after disposing off DrawingSurface, and cause weird things. If this belonged to a DynamicSprite, you may cause memory access exceptions. Same will happen if you store pixel array from room background in a global variable, and do something with it after changing rooms.

Because of how compiler works, there's still an extra CHECKBOUNDS instruction on every array access.

Shared array serialization does not work, as script object does not know how to retrieve the array data again. I see two options:

Identify the source somehow, similar to how DrawingSurface records where it got the bitmap from.
Don't care and require user to get array again after restoring a save.

Because AGS stores colors in a weird 16-bit way, this will work best after #1980. Right now, apparently, you'll have to make color values yourself with bitwise operations (above script does not produce correct pixel values).

ivan-mogilko commented 1 year ago

Ah, but of course. The DrawingSurface might keep the reference to this array, and invalidate it as it gets disposed or invalidated itself (on dynamic sprite deletion).

@ericoporto but I think at this point most interesting question is how much faster working with this array really is. Because if it turns out to be not too fast either, the whole thing would be of dubious value.

ericoporto commented 1 year ago

Edit: Need to redo all using only int, this would eliminate api function calls, I forgot those conversions were API. 😬

ArrayMode7.zip

@ivan-mogilko I attached a quick project above, I cobbled up together a mode7 implementation that uses either Get/Set Pixel or the Array access.... I also added both variants pre-compiled in an ags file.

most interesting question is how much faster working with this array really is

so... This is a bit frustrating. Unless I am doing something wrong there is no fps difference at all.

Edit: forgot float to int and int to float were API calls, need to eliminate those in the code later when I am on a PC. They are the main cost of the calculations.

Because AGS stores colors in a weird 16-bit way,

Colors are 32-bit in the array. Or I am wrong in how to understand things here?

Need to figure out something else to test this - but I didn't thought that the calculation would be that expensive it looks really minimal.

``` void render_mode7(){ z = FloatToInt(F_HEIGHT_CENTER* -1.0); float cos = Maths.Cos(Maths.DegreesToRadians(angle)); float sin = Maths.Sin(Maths.DegreesToRadians(angle)); for (float y = 0.0; y < F_SCR_HEIGHT; y+=1.0) { for (float x = 0.0; x < F_SCR_WIDTH; x+=1.0) { // Get x and y in the texture if(z==0) z++; _y = (((F_SCR_WIDTH - x) * cos - (x) * sin)) / IntToFloat(z); _x = (((F_SCR_WIDTH - x) * sin + (x) * cos)) / IntToFloat(z); if (z < 0) { _x -= moveX; _y -= moveY; } else { _x += moveX; _y += moveY; } // x and y in texture always positive if (_y < 0.0) _y *= -1.0; if (_x < 0.0) _x *= -1.0; // Repeat the pixel on the texture int itx_x = FloatToInt(_x) % I_TEXTURE_WIDTH; int itx_y = FloatToInt(_y) % I_TEXTURE_HEIGHT; int isc_x = FloatToInt(x); int isc_y = FloatToInt(y); // actually draw pixel from texture on screen dyn sprite arr_scr[isc_x + isc_y*I_SCR_WIDTH] = arr_texture[itx_x + itx_y*I_TEXTURE_WIDTH]; // surf_scr.DrawingColor = surf_texture.GetPixel(itx_x, itx_y); // surf_scr.DrawPixel(isc_x, isc_y); } z++; } } ``` Maybe the replacement is just ``` oid render_mode7(){ z = FloatToInt(F_HEIGHT_CENTER* -1.0); int cos = FloatToInt(1024.0* Maths.Cos(Maths.DegreesToRadians(angle))); int sin = FloatToInt(1024.0* Maths.Sin(Maths.DegreesToRadians(angle))); for (int y = 0; y < I_SCR_HEIGHT; y++) { for (int x = 0; x < I_SCR_WIDTH; x++) { // Get x and y in the texture if(z==0) z++; _y = (((I_SCR_WIDTH - x) * cos - (x) * sin)) / z; _x = (((I_SCR_WIDTH - x) * sin + (x) * cos)) / z; if (z < 0) { _x -= moveX; _y -= moveY; } else { _x += moveX; _y += moveY; } // x and y in texture always positive if (_y < 0) _y *= -1; if (_x < 0) _x *= -1; // Repeat the pixel on the texture int itx_x = (_x/1024) % I_TEXTURE_WIDTH; int itx_y = (_y/1024) % I_TEXTURE_HEIGHT; int isc_x = x; int isc_y = y; // actually draw pixel from texture on screen dyn sprite arr_scr[isc_x + isc_y*I_SCR_WIDTH] = arr_texture[itx_x + itx_y*I_TEXTURE_WIDTH]; // surf_scr.DrawingColor = surf_texture.GetPixel(itx_x, itx_y); // surf_scr.DrawPixel(isc_x, isc_y); } z++; } } ```

ivan-mogilko commented 1 year ago

Hm, perhaps a simpler test, that would only rely on setting or getting pixels, would show the diff better.

ericoporto commented 1 year ago

ArrayMode7.zip

Made a more optimized version, hit space to alternate between both approaches, its around 6 fps difference (20->26) on my computer (erh, a different one from my regular) between get/setpixel and the array method.

Also I have a build of your branch here: https://cirrus-ci.com/task/4594423196024832

Because of how compiler works, there's still an extra CHECKBOUNDS instruction on every array access.

This may not be terrible, say, if when the dynspr is deleted, if there’s a way to make the array size 0, the checkbounds could emit an error, mostly to prevent the game to be coded to depend on “undefined behavior”.

ivan-mogilko commented 1 year ago

20->26 fps is practically a 30% improvement I believe? I think the array access may be sped up slightly, I had one idea to try...

EDIT: tried, but it did not add a single fps, so it's not that important, at least not in this context.

EDIT2: as a quick test disabled bounds check, but that still did not change fps even a bit, meaning there's a serious bottleneck elsewhere, so other "optimizations" do not affect the situation much.

ivan-mogilko commented 1 year ago

I looked into the above solution again and, unfortunately, it won't work.

The current standard for ags script is to provide direct access to the memory buffer when the object is registered and passed around. This is how it's done with arrays, strings, and also structs that have actual fields (member variables, as opposed to e.g. properties). On the other hand, if the object is passed into a function as a parameter, there has to be a way to get meta data from its address. Currently, for arrays, this is achieved by prepending elements with "header". But if we "wrap" arbitrary buffer ptr, then this can no longer be done. If instead we register the "dynamic array" class object itself, and pass that around, that will only work in particular interpreter. But it won't work with plugins, which would use this API, or any other interpreter that does not have this special rule (e.g. Nick Sonneveld's interpreter rewrite). Because the plugin receiving this pointer will be assuming it's already the elements buffer (which will be wrong).

In fact, I only recently realized that UserObjects also work incorrectly in this regard, they need to be allocated similar to dynamic arrays, with header prepended to same buffer, and have buffer address registered instead of UserObject class. This is because if a plugin will use any API that returns a managed struct wrapped into UserObject (like functions returning Point*), then it must be able to access the struct's member. With current UserObject it won't be able to do so. So things should be fixed other way around, at least in ags3 branch (not make DynamicArray like UserObject, but instead make UserObject like DynamicArray).

What other potential solutions are there?

Resort to array copying. That is - copy sprite data to the managed array, work with that array, then copy it back. Or, in simpler situations, it may be only one way copy: from the array to sprite, if you dont need to know which pixels were on sprite previously.
Support pure memory pointers. I recall rofl0r added this support in his compiler. We might too, although we have to remember that this is a path into unsafe memory access, with all the nuances.
Implement some alternate "script array" structure (in both compiler and engine) which assumes wrapped pointer, has a similar syntax ([] operator), but a different rules of use... I do not have a clear picture of how this could be done atm.
Redesign the standard of accessing structs and arrays. Change how plugins work with these. For example, they might have to additionally call an interface function "GetRawData" on received object pointers.

ericoporto commented 1 year ago

Resort to array copying. That is - copy sprite data to the managed array, work with that array, then copy it back. Or, in simpler situations, it may be only one way copy: from the array to sprite, if you dont need to know which pixels were on sprite previously.

This is possibly the easiest solution here. This was what I had in mind actually, because the copy is done in a single API function call, it should not be very expensive I imagine.

ivan-mogilko commented 7 months ago

Just wanted to make a note for the record sake, elaborating on these two options that I mentioned in my comment above:

Implement some alternate "script array" structure (in both compiler and engine) which assumes wrapped pointer, has a similar syntax ([] operator), but a different rules of use... I do not have a clear picture of how this could be done atm.

Redesign the standard of accessing structs and arrays. Change how plugins work with these. For example, they might have to additionally call an interface function "GetRawData" on received object pointers.

These may be about same thing, the difference is that the first means a new and separate type(s) in the scripting language, and the second means changing how existing type(s) work.

As I mentioned above, right now the script vm and plugins expect that the pointer to any kind of a "buffer object" (array, string, managed user struct) is the direct pointer to an actual data. Interpreter secretly is aware that something may be prepended to it in memory, but that's implementation-specific and not a part of the "interface". When plugins receive the buffer object by calling an exported script function, they assume that this pointer may be accessed directly to retrieve data (array elements, string characters, etc).

With both mentioned options this rule changes, and the "buffer object" pointer is considered to be a pointer to an opaque object instead. This means that this object's memory contents are undefined, and in order to get actual data pointer you have to do certain extra operation. Since these objects are created by script vm, the vm may cast the pointer to a known internal struct and read its contents directly. It's totally up to vm's implementation how to organize these in memory, and how to speed access up. (For example, script values may be stored in pairs of 2 pointers: a pointer to a object, and a direct pointer to buffer data, where a pointer to buffer is used immediately for any memory read/write operations, while a pointer to object is reserved for operations like retrieving meta info, changing ref count, etc.)

Plugins, on another hand, would have to call some function from the engine's API that would accept this "buffer object" and retrieve a pointer to buffer (like, call GetArrayDataPtr function, for example).

But this approach should be carefully thought through, because it may block certain vm implementations, or make them harder to accomplish. E.g. if you take classic AGS vm, or Nick Sonneveld's rewrite, where all data addresses are stored directly as numbers, without use of "RuntimeScriptValue", that speeds up any pointer arithmetic, but won't let support such buffer objects without some additional effort.