TASEmulators / fceux

FCEUX, a NES Emulator
http://fceux.com
GNU General Public License v2.0

HexEditor slows down when many addresses are being highlighted at the same time #267

Open AleFunky opened 3 years ago

AleFunky commented 3 years ago

Basically, when many addresses are being highlighted at the same time, FCEUX slows down.

https://user-images.githubusercontent.com/55950650/102698046-f1f78900-423a-11eb-8600-b175ecf47924.mp4

mjbudd77 commented 3 years ago

Which operating system are you running on?

AleFunky commented 3 years ago

Windows 10, 64-bit.

ClusterM commented 3 years ago

UpdateMemoryView() has a performance bottleneck:

    MoveToEx(HDataDC, 8 * MemFontWidth + (j * 3 * MemFontWidth), MemFontHeight * ((i - CurOffset) / 16),NULL);
    sprintf(str,"%X", (int)(byteValue / 16));
    TextOut(HDataDC, 0, 0, str, 1);
    MoveToEx(HDataDC, MemFontWidth + 8 * MemFontWidth + (j * 3 * MemFontWidth), MemFontHeight * ((i - CurOffset) / 16),NULL);
    sprintf(str,"%X", byteValue % 16);
    TextOut(HDataDC, 0, 0, str, 1);

    MoveToEx(HDataDC,(59+j)*MemFontWidth,MemFontHeight*((i-CurOffset)/16),NULL); //todo: try moving this above the for loop
    str[0] = chartable[byteValue];
    if((u8)str[0] < 0x20)str[0] = 0x2E;
    //if(str[0] > 0x7e)str[0] = 0x2E;
    str[1] = 0;
    TextOut(HDataDC,0,0,str,1);

Everything freezes when drawing a large amount of HEX data.

owomomo commented 3 years ago

I'm not sure which step slows the drawing down the most, but maybe caching some of the data in arrays and simply checking them while drawing would relieve some of the pressure on the bottleneck?

g0me3 commented 3 years ago

The bottleneck is actually in the color fading, not in the coloring itself (and not in the amount of hex being drawn). When highlighting is disabled, bytes are redrawn only if they changed.

To see this in the situation described above, set the MemView window start to 0x5000 (open bus area), where normally all bytes keep changing, and compare how it runs with highlighting enabled and disabled. With highlighting disabled, even if every single byte in the window changes between frames, each byte is redrawn only once per iteration, and TextOutA is called about 32,000 times per minute. With highlighting enabled, all bytes change and every changed byte is also faded out between changes, so the number of TextOutA calls rockets to as much as 3,200,000 per minute (100 times more).

Most probably you need another hex view drawing method, one that does not make so many slow GDI32 WinAPI calls per frame.
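The effect described above can be illustrated with a small model. This is only a sketch, not FCEUX's actual code: `HexViewModel`, `fadeSteps`, and `countRedraws` are hypothetical names, and the model assumes one redraw per cell per frame, so it does not try to reproduce the measured 100x figure. It only shows why a fade adds redraws for cells whose value did not change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a hex view: one cell per visible byte.
struct HexViewModel {
    std::vector<std::uint8_t> shown;  // value last drawn in each cell
    std::vector<int> fade;            // remaining fade steps per cell (0 = idle)
    int fadeSteps;                    // fade length in frames; 0 disables highlighting

    HexViewModel(std::size_t cells, int fadeStepsIn)
        : shown(cells, 0), fade(cells, 0), fadeSteps(fadeStepsIn) {
        assert(fadeStepsIn >= 0);
    }

    // Processes one frame and returns how many cells had to be redrawn.
    std::size_t update(const std::vector<std::uint8_t>& memory) {
        assert(memory.size() == shown.size());
        std::size_t redraws = 0;
        for (std::size_t i = 0; i < shown.size(); ++i) {
            if (memory[i] != shown[i]) {
                shown[i] = memory[i];
                fade[i] = fadeSteps;  // restart the highlight fade
                ++redraws;            // the value changed, so the cell is redrawn
            } else if (fade[i] > 0) {
                --fade[i];            // same value, but the colour moved one step
                ++redraws;
            }
        }
        return redraws;
    }
};

// Total redraws over a sequence of frames (each frame is a memory snapshot).
std::size_t countRedraws(HexViewModel& view,
                         const std::vector<std::vector<std::uint8_t>>& frames) {
    std::size_t total = 0;
    for (const auto& frame : frames) total += view.update(frame);
    return total;
}
```

In this model, a byte that changes once and then stays put costs one redraw without highlighting and `1 + fadeSteps` redraws with it, so the fade length directly multiplies the number of text-drawing calls.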

g0me3 commented 3 years ago

I also cleaned up most of the redundancy in MemView, so now you can clearly see the actual problem (not that this helps much with speed, though)...

owomomo commented 3 years ago

Since the bottleneck is that color fading calls TextOutA too many times, I have an idea: pre-draw all the possible colors and states to a so-called "hidden canvas" and only "paste" these already-prepared tiles when updating a value (sorry, I don't know how to describe it accurately in English...). Because all the tiles are already drawn, this avoids calling functions like TextOutA to create them repeatedly. I don't know how much memory this would cost or whether it would improve performance; I'm just brainstorming...
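One way to picture the "hidden canvas" is as a tile atlas with one row per fade color and one column per byte value. The sketch below only computes where a tile would sit in such an atlas. `TileAtlas`, `TileRect`, and the layout itself are assumptions made for illustration, not anything taken from FCEUX.

```cpp
#include <cassert>
#include <cstdint>

// Source rectangle of one pre-drawn tile inside the off-screen atlas.
struct TileRect {
    int x;
    int y;
    int width;
    int height;
};

// Hypothetical atlas layout: row = fade colour index, column = byte value.
struct TileAtlas {
    int tileWidth;   // pixels; wide enough for two hex digits
    int tileHeight;  // pixels; one text row
    int colorCount;  // number of pre-drawn fade colours

    // Returns the rectangle to copy for a given byte value and fade colour.
    // On Windows this rectangle would supply the source arguments of BitBlt.
    TileRect source(std::uint8_t value, int colorIndex) const {
        assert(colorIndex >= 0 && colorIndex < colorCount);
        return TileRect{value * tileWidth, colorIndex * tileHeight,
                        tileWidth, tileHeight};
    }
};
```

With this layout, updating a cell is a lookup plus one copy, and no text rendering happens after the atlas has been filled once.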

ClusterM commented 3 years ago

> The bottleneck is actually in the color fading, not in the coloring itself (and not in the amount of hex being drawn). When highlighting is disabled, bytes are redrawn only if they changed.

Without highlighting it's not so slow but pretty slow too.

g0me3 commented 3 years ago

For me, no highlighting means full speed all the time. With highlighting, it differs from game to game: even SMB3 sometimes runs at full speed, depending on the hex offset and the game state itself, and other games may run at full speed too. Most of the time you don't need to watch open bus areas, and other areas, even ones where at least 20% of the bytes change constantly, run at full speed with coloring all the time. So maybe this is not a real problem for real tasks.

As an easy workaround, I'd suggest returning a special value from GetMemView to signal that open bus is being read (either by checking whether the readfunc is the default one (ANull), or by continuously checking whether the value read equals the current X.DB). Then most people won't see any slowdowns at all ;)

> Without highlighting it's not so slow but pretty slow too.

Try again now, after my optimizations; I reduced the number of TextOutA calls by at least half.
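The open bus workaround suggested in this comment could look roughly like the sketch below. Everything here is a stand-in: `ReadHandler`, `defaultOpenBusRead`, `kOpenBusValue`, `peekForHexView`, and `needsHighlight` are hypothetical names, and FCEUX's real handler table, `GetMemView`, and `ANull` may have different signatures. Only the first of the two detection options (comparing against the default handler) is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical read-handler type; the real FCEUX signature may differ.
using ReadHandler = std::uint8_t (*)(std::uint32_t address);

// Stand-in for the default handler installed on unmapped (open bus) addresses.
inline std::uint8_t defaultOpenBusRead(std::uint32_t) { return 0xFF; }

// Illustrative sentinel handed to the hex view instead of a byte value.
constexpr int kOpenBusValue = -1;

// Returns the byte at `address`, or kOpenBusValue when the address is served
// by the default handler, so the view can skip highlighting and fading there.
inline int peekForHexView(const std::vector<ReadHandler>& handlers,
                          std::uint32_t address) {
    assert(address < handlers.size());
    ReadHandler handler = handlers[address];
    if (handler == defaultOpenBusRead) return kOpenBusValue;
    return handler(address);
}

// Only cells holding real data take part in change highlighting.
inline bool needsHighlight(int previous, int current) {
    return current != kOpenBusValue && previous != current;
}
```

The view would then draw open bus cells in a fixed color and never start a fade for them, which removes the case where every visible byte is fading at once.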

owomomo commented 3 years ago

Pre-drawing all the characters to a background bitmap and using BitBlt() with offsets and coordinates does seem to speed things up. The problem is that, to reach an acceptable speed, we must cache every value from 0x00 to 0xFF multiplied by all the fading colors, which probably costs a lot of memory. If we only cache 0-F, the total number of BitBlt calls is still high and it still lags, though it is significantly faster than before.

I even thought of multithreading, with a spare thread that only refreshes the hex characters, or of changing the whole hex editor to DirectX rendering, but that would require a lot of work and drive up the code complexity.

Back in the first decade of the 21st century, when computers were not as fast as they are now, I remember FCEUX used to slow down even when the PPU Viewer and Nametable refresh rate was set to the fastest. As g0me3 mentioned, watching open bus is not that common in daily use, so I don't think this is that urgent. Still, I hope someone with a brilliant idea can solve this.
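The memory concern above can be put into rough numbers. The font cell size (8x16 pixels), the 16 fade colors, and the 32-bit pixel format used in the note below are assumptions chosen for illustration; the thread does not state the real values. `GlyphCacheSpec`, `cacheBytes`, and `blitsPerFullRedraw` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Parameters of a pre-drawn tile cache (all values are assumptions).
struct GlyphCacheSpec {
    std::size_t glyphCount;     // tiles per colour: 256 byte values or 16 digits
    std::size_t colorCount;     // number of fade colours
    std::size_t tileWidthPx;    // tile width in pixels
    std::size_t tileHeightPx;   // tile height in pixels
    std::size_t bytesPerPixel;  // 4 for a 32-bit bitmap
};

// Uncompressed size of the whole cache in bytes.
inline std::size_t cacheBytes(const GlyphCacheSpec& spec) {
    assert(spec.bytesPerPixel > 0);
    return spec.glyphCount * spec.colorCount * spec.tileWidthPx *
           spec.tileHeightPx * spec.bytesPerPixel;
}

// Copies needed to repaint every visible byte once:
// 1 tile per byte with a 0x00-0xFF cache, 2 tiles per byte with a 0-F cache.
inline std::size_t blitsPerFullRedraw(std::size_t visibleBytes,
                                      std::size_t tilesPerByte) {
    return visibleBytes * tilesPerByte;
}
```

Under these assumptions, a full 0x00-0xFF cache with tiles two characters wide is `cacheBytes({256, 16, 16, 16, 4})`, which is 4 MiB, while a 0-F cache is `cacheBytes({16, 16, 8, 16, 4})`, which is 128 KiB. The per-digit cache doubles the number of copies per repaint, which matches the observation that it is faster than before but still makes many BitBlt calls.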

g0me3 commented 3 years ago

I doubt that blitting pre-drawn bitmaps will cost less than calling the WinAPI text functions, even with a small data size, unless there is some other way to generate display lists with all the color info that can be drawn at once. The main problem with the GDI32 library is that you need to draw every single piece of text that uses a different color separately, with a different brush.
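A simple form of the display list idea is to merge neighbouring cells that share a color into one run, so that each run needs a single text-drawing call instead of one call per cell. The sketch below only builds the runs for one row. `ColorRun` and `buildColorRuns` are hypothetical names, and whether this helps in FCEUX has not been measured here: in the worst case, where every cell is at a different fade step, nothing is saved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A stretch of adjacent cells in one row that share the same text colour.
struct ColorRun {
    std::size_t firstCell;  // index of the first cell in the run
    std::size_t cellCount;  // number of cells covered by the run
    int color;              // colour identifier shared by the run
};

// Groups one row of per-cell colours into the fewest possible runs.
std::vector<ColorRun> buildColorRuns(const std::vector<int>& rowColors) {
    std::vector<ColorRun> runs;
    for (std::size_t i = 0; i < rowColors.size(); ++i) {
        if (!runs.empty() && runs.back().color == rowColors[i]) {
            ++runs.back().cellCount;  // extend the current run
        } else {
            runs.push_back(ColorRun{i, 1, rowColors[i]});  // start a new run
        }
    }
    // Sanity check: the runs must cover every cell exactly once.
    std::size_t covered = 0;
    for (const ColorRun& run : runs) covered += run.cellCount;
    assert(covered == rowColors.size());
    return runs;
}
```

With a fixed-width font, each run can be rendered as one string, so a row of 16 unchanged bytes needs one call, while a row with a single fading byte in the middle needs three.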

mjbudd77 commented 3 years ago

What rate does the Windows version update the hex editor display at? It seems to be excessively high. When I was writing the code for the Qt/SDL version, I settled on a 10 Hz update rate, as anything faster was unreadable to me. Also, the Qt version runs the emulation in a separate thread from the GUI, which allows the two to run in parallel (or at least mostly; there are a few critical sections that are protected by a mutex). Could the text update rate on the display be reduced or sub-banded? Can the display processing be done in parallel to allow the emulation to stay on time?
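Reducing the update rate can be sketched as a small throttle that the emulation loop asks before repainting the hex view. This is a minimal illustration and not the Qt/SDL implementation: `RedrawThrottle` and `shouldRedraw` are hypothetical names, and the 100 ms interval used in the note below simply corresponds to the 10 Hz rate mentioned above.

```cpp
#include <cassert>
#include <chrono>

// Allows at most one redraw per interval, however often it is asked.
class RedrawThrottle {
public:
    using Clock = std::chrono::steady_clock;

    explicit RedrawThrottle(std::chrono::milliseconds interval)
        : interval_(interval) {
        assert(interval.count() > 0);
    }

    // Returns true when enough time has passed since the last redraw.
    // The caller passes the current time, which keeps the class testable.
    bool shouldRedraw(Clock::time_point now) {
        if (now < next_) return false;  // too soon, skip this frame
        next_ = now + interval_;        // schedule the next allowed redraw
        return true;
    }

private:
    std::chrono::milliseconds interval_;
    Clock::time_point next_ = Clock::time_point::min();  // first call always redraws
};
```

Called once per emulated frame with `RedrawThrottle::Clock::now()`, a throttle built with `std::chrono::milliseconds(100)` lets roughly one frame in six repaint the view at 60 frames per second, so the text-drawing cost drops by about the same factor regardless of how many cells are fading.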

mjbudd77 commented 3 years ago

I recently got the Qt/SDL version of fceux to build on Windows, and it does not have this issue since it runs the emulation and GUI in separate threads. The Qt GUI can do most of what the Windows GUI does, so it may be useful for you. There is now a download link for it on the website.