lgblgblgb / xemu

Emulations (running on Linux/Unix/Windows/macOS, utilizing SDL2) of some - mainly - 8 bit machines, including the Commodore LCD, Commodore 65, and the MEGA65 as well.
https://github.com/lgblgblgb/xemu/wiki
GNU General Public License v2.0

MEGA65: Check if memory is initialized before used #405

Closed Rhialto closed 1 month ago

Rhialto commented 1 month ago

Is your feature request related to a problem? Please describe.
In https://github.com/MEGA65/mega65-rom-public/issues/153 it seems that at least the BASIC part of the ROM may depend on memory being zeroed at power-on. A reset doesn't wipe memory, and BASIC apparently doesn't wipe all the locations it is using. So if one of those gets an unexpected value, it may survive the reset and cause trouble.

Describe the solution you'd like
It could be useful to add a bit to every memory byte to remember whether it has been written to. If the memory is read before this bit is set, issue some kind of diagnostic.

Describe alternatives you've considered
Alternatively, the initial memory contents could be randomized, so as to trigger bugs sooner. This is probably a lot harder to use effectively to find bugs.

lgblgblgb commented 1 month ago

@Rhialto Hi, thanks for your suggestion. I am not sure how I can implement this in an efficient way. Adding this directly as is (which wouldn't be hard) would result in a huge slowdown of the emulation. Even if the feature is not used, there is at least one extra condition on every memory read. Even when just executing NOPs (which makes no sense), that means 40 million extra operations per second, and of course much more if "sane" code is running, which does memory operations of its own in addition to reading memory for opcode fetches. This is the code area of Xemu where a single "if" hinted with the gcc/clang-specific likely/unlikely can mean even a 5% performance difference without any actual code change, just by hinting the likelihood of taking or not taking the branch.
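To make the cost concrete, here is a minimal sketch of the "one bit per byte" idea from the request, with the extra branch on the read path. All names (`ram_read`, `ram_write`, `written`, `UNLIKELY`, the 128K size) are made up for illustration and are not Xemu's actual memory code; only `__builtin_expect` is the real gcc/clang builtin behind the usual likely/unlikely macros.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not Xemu's actual memory code. */
#define RAM_SIZE 0x20000u	/* 128K, chosen only for illustration */

#if defined(__GNUC__) || defined(__clang__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

static uint8_t ram[RAM_SIZE];
static uint8_t written[RAM_SIZE / 8];	/* one "has been written" bit per RAM byte */
static unsigned long uninit_reads;	/* number of diagnostics issued so far */

/* The caller must guarantee addr < RAM_SIZE in both functions. */
static void ram_write(uint32_t addr, uint8_t data)
{
	ram[addr] = data;
	written[addr >> 3] |= (uint8_t)(1u << (addr & 7));
}

static uint8_t ram_read(uint32_t addr)
{
	/* This is the extra condition that every single read has to pay for. */
	if (UNLIKELY(!(written[addr >> 3] & (1u << (addr & 7))))) {
		uninit_reads++;
		fprintf(stderr, "read of $%05X before any write\n", (unsigned)addr);
	}
	return ram[addr];
}
```

The hint only tells the compiler which side of the branch to favour; the test itself is still executed on every read, which is the overhead described above.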

As far as I can see, a feature like this can be implemented in the future with the dynamic memory resolving/re-mapping feature that I plan for debugging purposes. That means that if it is not used, it does not consume any emulation resources at all. However, it will take a while until it's ready, and it's really hard to do in general: it is one of the greatest "mega projects" (hehe, not in the sense that mega = MEGA ...) within the Xemu project. Only some hints:

For more information: https://github.com/lgblgblgb/xemu/issues/209 Also: https://github.com/lgblgblgb/xemu/issues/378 ... and: https://github.com/lgblgblgb/xemu/issues/11

Unfortunately, the weakest point of Xemu currently is probably its debug features in general, so it's no wonder that this is problematic to do.

The other part of the suggestion (using randomly initialized content) is easier of course, and in fact quite trivial, but not as effective for bug-hunting, as you also stated.
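A rough sketch of the randomized variant, only to show how little code it needs. The function name and the seed parameter are assumptions for illustration, not anything that exists in Xemu; using a caller-supplied seed is a design choice of this sketch, so that a run which triggers a bug can be repeated with the same memory contents.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: fill emulated RAM with pseudo-random bytes at power-on.
 * The same seed always produces the same contents. */
static void fill_ram_random(uint8_t *ram, size_t size, unsigned int seed)
{
	srand(seed);
	for (size_t i = 0; i < size; i++)
		ram[i] = (uint8_t)(rand() & 0xFF);
}
```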

On the other hand, it's still possible to have the first approach, just not in the default build: if someone wants to build Xemu on their own, they can set some option to build a debug-aware version of Xemu with this feature enabled, though the emulation will then always be slower, of course. It can still be useful for people who have deeper knowledge of compiling software and are hunting for MEGA65 software bugs.
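One way such a build-time option could look, as a sketch only: the switch name `RAMCHECK_BUILD`, the macros and the functions are all hypothetical and do not correspond to any real Xemu build option. In the default build both hooks expand to nothing, so the normal read path carries no extra condition.

```c
#include <stdint.h>
#include <stdio.h>

#define DBG_RAM_SIZE 0x20000u	/* illustrative size only */

static uint8_t dbg_ram[DBG_RAM_SIZE];

#ifdef RAMCHECK_BUILD	/* hypothetical switch, e.g. cc -DRAMCHECK_BUILD ... */
static uint8_t was_written[DBG_RAM_SIZE];	/* one flag byte per RAM byte, for simplicity */
#define MARK_WRITTEN(a)	(was_written[(a)] = 1)
#define CHECK_READ(a)	do { \
		if (!was_written[(a)]) \
			fprintf(stderr, "uninitialized read at $%05X\n", (unsigned)(a)); \
	} while (0)
#else
/* Default build: both hooks compile to nothing, so there is no run-time cost. */
#define MARK_WRITTEN(a)	((void)0)
#define CHECK_READ(a)	((void)0)
#endif

/* The caller must guarantee addr < DBG_RAM_SIZE in both functions. */
static void mem_write(uint32_t addr, uint8_t data)
{
	MARK_WRITTEN(addr);
	dbg_ram[addr] = data;
}

static uint8_t mem_read(uint32_t addr)
{
	CHECK_READ(addr);
	return dbg_ram[addr];
}
```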

Conclusion after the endless flow of my mega blah-blah above:

  1. The "random memory content reset" is quite trivial to do.
  2. The conditional "by recompiling Xemu" support is a bit harder but can be done, though to use it, users must recompile Xemu on their own.
  3. The dynamic-reconfiguration super-debug (etc etc etc, other superlatives still to be inserted) may allow doing it without any effort from the user, but it will take time (maybe a long time) before it's ready ...

lgblgblgb commented 1 month ago

An early experiment with this idea that may (or may not ...) help to solve that ROM issue: https://github.com/MEGA65/mega65-rom-public/issues/153#issuecomment-2267662764

lgblgblgb commented 1 month ago

@Rhialto

OK, it seems I can probably add this (to the next branch of Xemu at least) in a relatively easy way that I hadn't thought of before. This is because next already uses the new memory decoder. The limitation is that, without further complications, I can only monitor such accesses for the first 126K of "fast RAM" (not 128K, since the last 2K of the 128K is the C65 colour RAM). But I guess for ROM development/debugging (which is what it's most useful for in its current form) it should be enough. The next 128K is the ROM itself, so checking it does not make sense: even if the user asks hyppo for writable ROM and re-uses the ROM space, hyppo has already filled in all of the ROM, so every byte there has been pre-written and there would be no warning at all! The last 128K after the ROM seems to be used by the ROM to test things first, walking through the memory, so it always gives tons of warnings without real meaning. Also, that area is mostly used by the MEGA65 ROM to allocate graphics things. So I guess the first 126K makes sense and is "safe" for catching bugs since the last reset.
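In numbers, 126K is 126 * 1024 = 129024 bytes, so the monitored range is linear $00000-$1F7FF, and the 2K of colour RAM starts at $1F800. A tiny sketch of that boundary check follows; the macro and function names are made up for illustration and are not taken from the Xemu source.

```c
#include <stdbool.h>
#include <stdint.h>

/* First 126K of fast RAM: 126 * 1024 = 0x1F800 bytes (hypothetical macro name). */
#define CHECKED_RAM_SIZE (126u * 1024u)

/* True if a linear address falls inside the monitored part of RAM. */
static bool is_read_checked(uint32_t linear_addr)
{
	return linear_addr < CHECKED_RAM_SIZE;
}
```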

Btw, this was the method I used to generate the result with BOOT that I quoted in the link in my previous comment above.

Also, next will be merged into master (aka "stable") "soon", so it's not a big issue that only next will get it.

lgblgblgb commented 1 month ago

Done (currently - as I write this - only in the next branch) in https://github.com/lgblgblgb/xemu/commit/bd47ac995d3c9c80e6d20412eb6d9c90180e2237

Rules:

  1. Only works on the first 126K of physical RAM
  2. Only works from command line, start emulation with parameter: -ramcheckread
  3. You need to watch the output of the emulator, so on UNIX-like/Linux/Mac systems the emulator should be started from a terminal window; on Windows, the Xemu console must be open (also start with the parameter: -syscon)
  4. Check the output of the emulator; you can find lines like:
     MEM: DEBUG: main RAM at linear address $101A0 has been read without prior write; PC=$8613 [$20613]
     The PC value can be off by some bytes, because it may have been incremented already during opcode emulation. The [...] is the linear address for the mentioned CPU PC value.
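If there are many such lines, it may be handy to pull the three numbers out of each one automatically. The following is a sketch under the assumption that every diagnostic has exactly the format of the example line quoted above; the real output may differ or change later, and the struct and function names are hypothetical.

```c
#include <stdio.h>

/* Fields of one diagnostic line; struct and function names are hypothetical. */
struct ramcheck_hit {
	unsigned int ram_addr;	/* linear address that was read before being written */
	unsigned int pc;	/* CPU PC as printed (may be off by some bytes) */
	unsigned int pc_linear;	/* linear address belonging to that PC */
};

/* Returns 1 if the line matched the assumed format, 0 otherwise. */
static int parse_ramcheck_line(const char *line, struct ramcheck_hit *hit)
{
	return sscanf(line,
		"MEM: DEBUG: main RAM at linear address $%x has been read without prior write; PC=$%x [$%x]",
		&hit->ram_addr, &hit->pc, &hit->pc_linear) == 3;
}
```

Feeding it each line of a saved copy of the emulator output gives the addresses as plain integers, which can then be sorted or counted per address.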

This is a bit of "hacky" stuff, and not an integral part of the always-mentioned-and-over-discussed future plan of the new debug architecture. So this will probably be re-implemented later in a saner form.