L-Spiro / BeesNES

A sub–cycle-accurate Nintendo Entertainment System emulator.
MIT License

BeesNES

Shawn (L. Spiro) Wilcoxen

Description

A “sub–cycle-accurate” Nintendo Entertainment System emulator with the goal of being as authentic an experience as possible. It should look, sound, and feel like real hardware, with convincing visuals, clean and accurate audio, and real-time input response, with no visual or audible delays. BeesNES also represents the under-served regions with support for a wide range of console variants, currently including NTSC, PAL, Dendy, PAL-M, and PAL-N.

Visual Samples


RF Cables:
image
Composite:
image
HDMI:
image
HDMI Mod:
image

PAL (Composite):
image
Dendy (Composite):
image
PAL-M (Brazil) (Composite):
image
PAL-N (Argentina) (Composite):
image

Audio Samples

image
The top is a hardware reference recording in a test ROM. The bottom is the BeesNES audio output for the same test ROM.
image
Zooming in on the highest frequencies reveals that BeesNES audio is crisp and clean, with high-frequency aliasing completely eliminated.
image
All 3 images showcase the accuracy of the audio.
Listen to MDFourier Test Audio

Videos

YouTube Video: Castlevania Demo Play (Low Noise)
Watch the video

YouTube Video: 1943: The Battle of Midway (RF Cables)
Watch the video

YouTube Video: Probotector (PAL, Composite Cables)
Watch the video

YouTube Video: Battletoads Opening (Extreme Noise)
Watch the video

YouTube Video: Double Dragon (Composite Cables)
Watch the video

YouTube Video: Akira Opening (Extreme Noise)
Watch the video

YouTube Video: Battletoads (RF Cables)
Watch the video

NTSC-CRT library: https://github.com/LMP88959/NTSC-CRT
PAL-CRT library: https://github.com/LMP88959/PAL-CRT
Persune palgen: https://github.com/Gumball2415/palgen-persune

Accuracy

We are aiming for “Sub-Cycle Accuracy”: https://emulation.gametechwiki.com/index.php/Emulation_accuracy#Subcycle_accuracy

This means that multi-byte writes are correctly partitioned across cycles and partial data updates are possible, allowing the more esoteric features of the system to be accurately emulated. As a result, we should be able to support interrupt hijacking and any other cases that rely heavily on the cycle timing of the system.
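To illustrate what a partitioned multi-byte write looks like, here is a minimal sketch of a 16-bit stack push (as performed by JSR or interrupt entry on the 6502) split into one bus write per cycle. All names (`MiniCpu`, `beginPush16`, `tick`) are hypothetical and chosen for illustration; this is not BeesNES's actual code, and it uses a plain `switch` for readability.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified CPU core used only to illustrate the idea.
struct MiniCpu {
    std::array<uint8_t, 0x10000> ram{};  // Flat 64 KB address space (assumption).
    uint8_t  sp = 0xFD;                  // Stack pointer; the stack lives at $0100-$01FF.
    uint16_t pending = 0;                // 16-bit value currently being pushed.
    int      step = 0;                   // Next cycle of the push sequence (0 = idle).

    // Start a two-byte push. No memory is touched until tick() is called.
    void beginPush16(uint16_t value) {
        assert(step == 0);
        pending = value;
        step = 1;
    }

    // Advance exactly one CPU cycle: at most one bus write happens per call.
    void tick() {
        switch (step) {
        case 1:  // Cycle 1: push the high byte.
            ram[0x0100 | sp--] = static_cast<uint8_t>(pending >> 8);
            step = 2;
            break;
        case 2:  // Cycle 2: push the low byte.
            ram[0x0100 | sp--] = static_cast<uint8_t>(pending & 0xFF);
            step = 0;
            break;
        default:  // Idle: nothing to do.
            break;
        }
    }

    bool busy() const { return step != 0; }
};
```

After `beginPush16(0xC123)` and a single `tick()`, only the high byte has reached memory; anything that inspects the bus or raises an interrupt at that point sees the partially updated state, which is the window that timing-sensitive behavior depends on.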

Additional options/features to facilitate accurate emulation:

If behavior differs from the actual hardware result, it is considered a bug. Hacks are to be avoided as much as possible.

The CPU should be completely sub–cycle-accurate, as every individual cycle is documented there. The same should apply to the PPU (questions surround PAL differences at the cycle level) and the APU.

Timing is not based on audio or monitor refresh rates, as is done in many emulators. We use a real clock (with at least microsecond resolution) and match real timings to real time units, which we can speed up and slow down as options. The NTSC version’s CPU will need to pump out ~29,780.506887 cycles per frame at 60.098814 FPS, while the PAL version’s will need to pump out ~33,247.485977 cycles per frame at 50.006979 FPS. This means there is no noticeable visual delay (rendered frames are presented essentially immediately, rather than waiting for a monitor refresh, doing a frame’s worth of work, and then providing the visible frame after a delay) and that input is polled with exactly the same timing as in a real console, eliminating all input lag. It should both look and feel like a real console, with responsive controls that feel identical to how they do on real machines.
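The clock-driven pacing described above can be sketched as follows. The constants are the figures quoted in the text; `cpuHz`, `targetCycles`, and `Pacer` are hypothetical names introduced for illustration, and the sketch assumes `std::chrono::steady_clock` as the real-time source rather than whatever clock BeesNES actually uses.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Figures quoted in the text above.
constexpr double kNtscCyclesPerFrame = 29780.506887;
constexpr double kNtscFps            = 60.098814;
constexpr double kPalCyclesPerFrame  = 33247.485977;
constexpr double kPalFps             = 50.006979;

// CPU cycles per second implied by a cycles-per-frame and frames-per-second pair.
constexpr double cpuHz(double cyclesPerFrame, double fps) {
    return cyclesPerFrame * fps;
}

// Total CPU cycles that should have executed after `elapsed` real time.
// `speed` scales emulated time (1.0 = real time); hypothetical parameter.
inline uint64_t targetCycles(std::chrono::nanoseconds elapsed, double hz,
                             double speed = 1.0) {
    assert(elapsed.count() >= 0 && speed > 0.0);
    const double seconds = std::chrono::duration<double>(elapsed).count();
    return static_cast<uint64_t>(seconds * hz * speed);
}

// Hypothetical pacer: reports how many cycles the core still owes "right now".
struct Pacer {
    using Clock = std::chrono::steady_clock;
    Clock::time_point start;
    double hz;
    uint64_t executed = 0;  // Cycles already handed to the core.

    Pacer(Clock::time_point s, double h) : start(s), hz(h) {}

    uint64_t cyclesDue(Clock::time_point now) {
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::nanoseconds>(now - start);
        const uint64_t target = targetCycles(elapsed, hz);
        const uint64_t due = target > executed ? target - executed : 0;
        executed += due;
        return due;
    }
};
```

A main loop built on this would repeatedly call `cyclesDue(Pacer::Clock::now())` and tick the emulated CPU that many times, so emulated time follows the wall clock rather than the monitor refresh or the audio buffer.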

Performance

There were initially some concerns that being sub–cycle-accurate would mean extra overhead. Other emulators may skip useless redundant opcode fetches, but not here: each fetch is accompanied by an entire CPU tick and all the work that goes into updating the CPU state, etc. For this reason, most systems were implemented in an entirely branchless fashion. There are no “if”/“else” statements, “%” operations, “&” operations, “>=”/“<” checks, etc. when accessing memory; address mirroring, address mapping to registers, etc., is all handled entirely without branching, and most CPU, PPU, and APU cycles are branchless as well. This more than made up for the cycle-accuracy overhead.
My custom filters and custom image-resizing routines are AVX/SSE-enhanced, and AVX/SSE is also used to put heavy work into audio processing while remaining blazingly fast. On my laptop, the authentic CRT filters with 100% clean audio can run at 90 FPS, while the L. Spiro filters can run at 120-144 FPS. Even though max settings are still able to cleanly maintain 60.098… FPS, both audio and video can be reduced in quality to run even faster. Low-power machines should have no problems running BeesNES, and this is all still running in software. GPU support will eventually make everything even faster.
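One common way to get the kind of branchless memory access described above is a per-address dispatch table that is filled in once at start-up, so mirroring and register mapping are resolved before any read happens. The sketch below is an assumption about how such a scheme can look, not BeesNES's actual implementation: `Bus`, `Entry`, and the handler names are hypothetical, and unmapped addresses simply return 0 here instead of modeling open-bus behavior.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bus with one pre-resolved entry per CPU address.
struct Bus {
    using ReadFn = uint8_t (*)(Bus&, uint16_t slot);
    struct Entry { ReadFn fn; uint16_t slot; };

    std::array<uint8_t, 0x800> ram{};   // 2 KB internal RAM.
    std::array<uint8_t, 8> ppuRegs{};   // Stand-in for the 8 PPU registers.
    std::vector<Entry> table;           // 65,536 entries, built once.

    static uint8_t readRam(Bus& b, uint16_t slot)  { return b.ram[slot]; }
    static uint8_t readPpu(Bus& b, uint16_t slot)  { return b.ppuRegs[slot]; }
    static uint8_t readOpen(Bus&, uint16_t)        { return 0; }  // Simplification.

    Bus() : table(0x10000, Entry{&readOpen, 0}) {
        // $0000-$1FFF: internal RAM, mirrored every $800 bytes.
        for (uint32_t a = 0x0000; a < 0x2000; ++a)
            table[a] = {&readRam, static_cast<uint16_t>(a % 0x800)};
        // $2000-$3FFF: PPU registers, mirrored every 8 bytes.
        for (uint32_t a = 0x2000; a < 0x4000; ++a)
            table[a] = {&readPpu, static_cast<uint16_t>(a % 8)};
    }

    // Hot path: one table lookup and one indirect call, with no range
    // checks, masking, or modulo performed at access time.
    uint8_t read(uint16_t addr) {
        const Entry& e = table[addr];
        return e.fn(*this, e.slot);
    }
};
```

The `%` operations appear only in the constructor; `read()` itself performs no conditional tests. The trade-off is memory for the table in exchange for a uniform access path for RAM, mirrors, and registers.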

Other Features

Other features will include:

Building

BeesNES does not use any 3rd-party libraries other than OpenAL. Simply install the OpenAL SDK and BeesNES should build without a problem.
Microsoft Visual Studio Community 2022 (64-bit), Version 17.4.4