BinBashBanana / webretro

RetroArch in your browser
https://binbashbanana.github.io/webretro/
MIT License

Angrylion RDP N64 and general libretro builds #9

Open thelamer opened 3 years ago

thelamer commented 3 years ago

I got to about where you are building mupen64 for web. Another project you might be interested in is https://github.com/nbarkhina/N64Wasm/ they took the old parallel libretro code and used SDL openGL baked into emsdk.

My question is if it is possible to basically go back in time just forget about Glide and use CPU cycles to get something working for N64? It looks like in the most current versions everything is tied to threading, and as far as I can tell from all the building I have done, enabling PTHREAD in RetroArch and using workers is a big overhaul of the codebase to enable shared memory arrays. I got the interface to load using proxy-to-worker, but any core launched from there just crashes.

I have not had any luck getting the parallel core building and working with angrylion RDP on my own, just wondering if it is something you have looked into? I am mostly interested to see if it produces something playable with a modern CPU despite the lack of threading.

Also unrelated to this issue, it looks like to get mobile compatibility the missing piece is overlay support in: https://github.com/libretro/RetroArch/blob/master/input/drivers/rwebinput_input.c It needs touch mapped like they do for mouse: https://emscripten.org/docs/api_reference/html5.h.html#touch https://github.com/libretro/RetroArch/blob/master/input/drivers/rwebinput_input.c#L261-L279 Then referenced properly in input_overlay. Alternatively elements can be rendered in the page separate from the canvas.

Basically, I have gotten as far as I believe I am technically capable of in porting cores over for self-hosted web use, and I am wondering if you have any more insight on forking some of these cores into something usable for the web.

BinBashBanana commented 3 years ago

Sorry it took me so long to get to this issue.

> My question is if it is possible to basically go back in time just forget about Glide and use CPU cycles to get something working for N64?

Possibly, but I really wouldn't want to (try to) do this.

Workers are a new concept to me, they seem like they would be useful, I can look into them.

As @ethanaobrien suggested in #6, I am going to try using the GLupeN64 core. It is a little out of date, but I should be able to update it as needed.

As for touch support, thanks for the information, I will play around with it and see how it works. However, I'm a little afraid that the RetroArch touch controls are too lacking, and the screen may be shrunk by the aspect ratio, but that can be fixed (see #7). I was considering making custom controllers in the DOM that would trigger keyboard events to simulate inputs.
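
The DOM-controller idea mentioned above could be sketched roughly as follows. This is only an illustration: the `BUTTON_KEYS` table, the key choices, and the helper names are hypothetical, and whether synthetic keyboard events reach the core depends on how the RetroArch web input driver registers its listeners.

```javascript
// Hypothetical mapping from on-screen button names to keyboard keys.
// The key choices are assumptions; match them to your own input bindings.
const BUTTON_KEYS = {
  up: { key: "ArrowUp", code: "ArrowUp" },
  down: { key: "ArrowDown", code: "ArrowDown" },
  a: { key: "x", code: "KeyX" },
  b: { key: "z", code: "KeyZ" },
};

// Build the (type, init) pair for a synthetic keyboard event.
// Returns null for buttons that have no mapping.
function keyEventFor(button, pressed) {
  const mapping = BUTTON_KEYS[button];
  if (!mapping) return null;
  return {
    type: pressed ? "keydown" : "keyup",
    init: { ...mapping, bubbles: true, cancelable: true },
  };
}

// Wire a DOM element so that touching it presses the mapped key on `target`.
// Browser-only: `element` and `target` are supplied by the page.
function bindTouchButton(element, button, target) {
  const send = (pressed) => (event) => {
    event.preventDefault(); // keep the touch from scrolling or zooming the page
    const desc = keyEventFor(button, pressed);
    if (desc) target.dispatchEvent(new KeyboardEvent(desc.type, desc.init));
  };
  element.addEventListener("touchstart", send(true), { passive: false });
  element.addEventListener("touchend", send(false));
  element.addEventListener("touchcancel", send(false));
}
```

In a page this might be used as `bindTouchButton(document.getElementById("btn-a"), "a", document)`, where the element id is again just a placeholder.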

I will start work on these as soon as 6.4 is released, which is soon, I promise. I want to start focusing more on getting more cores supported now.

thelamer commented 3 years ago

Here is reference code for the cores I built. https://github.com/linuxserver/libretro-cores Released as tarball here: https://github.com/linuxserver/libretrojs Outside of the data directory change for the wasm file location they should be plug and play. Source snapshot: https://github.com/thelamer/retrostash

They have the missing assets stanzas removed. I built using emsdk 1.39.5. I have been working lately to get Safari 15 support for iPhones/iPads without luck: the code runs, but nothing renders to the canvas.

This all plugs into this: https://github.com/linuxserver/docker-emulatorjs If you want to try it.

thelamer commented 3 years ago

This is as far as I could get with Glupen: https://ipfs.infura.io/ipfs/QmfWEGQFoBxi4YQ49KCimTAK9jpgdUf1HvxgoqxjXVXMtf?filename=glupen64.zip It builds and loads, but complains about the functions defined in glsm.c and crashes:

[screenshot: error]

Source: (built with emsdk 1.39.7 for fiber support) https://github.com/thelamer/GLupeN64

thelamer commented 3 years ago

So I did not get anywhere with Glupen, but I did find a change that actually modified the output and got a bit further. I was interested in whether it made a difference in your mupen64 core, and surprisingly it did. If you change ctx=canvas.getContext("webgl" to ctx=canvas.getContext("webgl2" in the mupen64plus_next_libretro.js file, a bunch of the texture errors disappear. For example:

Without: [screenshot]

With webgl2: [screenshot]

Not really an encompassing fix for anything, but maybe could lead to something.

Edit: It seems like the way forward is to build cores around GLES3 with WebGL 2.0, as it supports many more GL calls.
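
Rather than hand-editing the generated file, the same idea can be expressed as a small helper that asks for a WebGL 2 context first and falls back to WebGL 1. This is a sketch only: `getGLContext` is a hypothetical name, and how the returned context would be handed to an Emscripten-built core is not covered here.

```javascript
// Try WebGL 2 first, then fall back to WebGL 1.
// `canvas` is anything with a getContext(name, attributes) method.
function getGLContext(canvas, attributes = {}) {
  for (const name of ["webgl2", "webgl"]) {
    const ctx = canvas.getContext(name, attributes);
    if (ctx) return { ctx, name };
  }
  return null; // neither context type is available
}
```

Checking the returned `name` makes it easy to log which path was taken when comparing rendering errors between the two context types.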

BinBashBanana commented 3 years ago

I tried it out, it fixes a few errors, but not all.

I think that switching to GLES3 is a wise idea, that's what m4xw was testing. Once this next update is finished, I'll see what I can do.

ethanaobrien commented 3 years ago

@BinBashBanana I agree. Another example:

webgl: [Screenshot 2021-12-03 9 10 06 AM]

webgl 2: [Screenshot 2021-12-03 9 10 44 AM]

This is when only changing that one line of code

thelamer commented 2 years ago

So I guess goal achieved, I went back in time and got an angrylion emscripten build. Forked here: https://github.com/thelamer/parallel-n64/tree/angrylion

Honestly, it is not far off on modern systems, especially an Xbox Series console, despite only having one core to work with and using old code. I have not had any luck forcing Angrylion into a 320x240 low-res mode, though, and I feel like that may be the only path to this being playable.

PreBuilts here: https://ipfs.infura.io/ipfs/QmczcVnKehE4xfsZJ2KcGH1Q9Jz7Nq4rGa8u5p9e35ecvS?filename=parallel_n64.zip

thelamer commented 2 years ago

I got threading to work with the latest emsdk (3.1.0); this allows building against the head of https://github.com/libretro/parallel-n64:

https://user-images.githubusercontent.com/1852688/147839990-2b1d8cb3-8fc5-4d17-8f6d-7971159d213c.mp4

It only takes pretty basic changes to the make section of the core (also change -Ofast to -O3 and disable -fcommon):

# emscripten
else ifeq ($(platform), emscripten)
   TARGET := $(TARGET_NAME)_libretro_$(platform).bc
   GLES = 0
   HAVE_OPENGL = 0
   WITH_DYNAREC :=
   HAVE_PARALLEL = 0
   CPUFLAGS += -DNOSSE -DEMSCRIPTEN -DNO_ASM -DNO_LIBCO -s USE_ZLIB=1 -s PRECISE_F32=1
   WITH_DYNAREC =
   CC = emcc
   CXX = em++
   HAVE_NEON = 0
   PLATFORM_EXT := unix
   STATIC_LINKING = 1
   SOURCES_C += $(CORE_DIR)/src/r4300/empty_dynarec.c
   HAVE_THR_AL = 1
   CFLAGS += -pthread
   LDFLAGS += -pthread
   CXXFLAGS += -pthread

Then for Retroarch itself: (Makefile.emscripten)

MEMORY ?= 536870912
CFLAGS += -pthread
CXXFLAGS += -pthread
LIBS    := -s USE_ZLIB=1
LDFLAGS := -L. --no-heap-copy -s $(LIBS) -s TOTAL_MEMORY=$(MEMORY) -s NO_EXIT_RUNTIME=0 -s FULL_ES2=1 -s "EXTRA_EXPORTED_RUNTIME_METHODS=['callMain']" \
           -s EXPORTED_FUNCTIONS="['_main', '_malloc', '_cmd_savefiles', '_cmd_save_state', '_cmd_load_state', '_cmd_take_screenshot']" \
           --js-library emscripten/library_rwebaudio.js \
           --js-library emscripten/library_rwebcam.js \
           --js-library emscripten/library_errno_codes.js \
           -pthread -s PTHREAD_POOL_SIZE=4 

This will also produce a parallel_n64_libretro.worker.js that needs to be copied over to your project.

For this to work in Chrome you also need to serve the content from a webserver with the Cross-Origin-Embedder-Policy and Cross-Origin-Opener-Policy headers set. For example, to achieve this with an Express server:

app.use(function(req, res, next) {
  res.header("Cross-Origin-Embedder-Policy", "require-corp");
  res.header("Cross-Origin-Opener-Policy", "same-origin");
  next();
});

Or from an Nginx webserver:

        add_header 'Cross-Origin-Embedder-Policy' 'require-corp';
        add_header 'Cross-Origin-Opener-Policy' 'same-origin';  
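
On the page side, it may help to check whether those headers actually took effect before loading a threaded build. Browsers expose this through the `crossOriginIsolated` global. The sketch below assumes a hypothetical naming scheme in which a `_threaded` build sits next to the regular one; the helper names are also made up for illustration.

```javascript
// True when SharedArrayBuffer exists and the page is cross-origin isolated,
// i.e. the COOP/COEP headers above are in effect.
function canUseThreads(scope = globalThis) {
  const hasSharedMemory = typeof scope.SharedArrayBuffer === "function";
  const isolated = scope.crossOriginIsolated === true;
  return hasSharedMemory && isolated;
}

// Pick which core script to load. The "_threaded" suffix is a hypothetical
// convention, not something the builds in this thread produce.
function pickCoreScript(coreName, scope = globalThis) {
  const suffix = canUseThreads(scope) ? "_threaded" : "";
  return `${coreName}_libretro${suffix}.js`;
}
```

A host that cannot send the headers would then fall back to the single-threaded script instead of failing when the worker tries to share memory.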

I have kind of a shitty PC (i7-4790), so I would expect better on more modern chips.

One thing really stands out when playing with the thread count: I do not think going past 4-8 threads helps Angrylion. I can get more CPU usage by going to 16, 32, etc., but it actually hurts performance, and it looks like the limit is (possibly) on the main thread. I suspect the cxd4 RSP, but I do not have the knowledge to really dig in past this. In general it is not there yet and not "playable". I wonder if any other tweaks could be applied to get it on par with the desktop version.
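
The observation about thread counts could be turned into a simple rule of thumb: use the reported core count, but cap it. The function below is a sketch under that assumption. The cap of 8 and the fallback of 4 come from the numbers in this comment, not from any measurement beyond it, and how the value would be fed into a build that fixes `PTHREAD_POOL_SIZE` at link time is left open.

```javascript
// Suggest a worker thread count from the reported logical core count.
// Falls back to `fallback` when the count is missing or invalid,
// and never exceeds `max`.
function suggestThreadCount(hardwareConcurrency, { max = 8, fallback = 4 } = {}) {
  const valid = Number.isInteger(hardwareConcurrency) && hardwareConcurrency > 0;
  const cores = valid ? hardwareConcurrency : fallback;
  return Math.min(max, cores);
}
```

In a browser the argument would typically be `navigator.hardwareConcurrency`.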

Pre-Built bins: https://ipfs.infura.io/ipfs/QmcHKT7PM2tFTyqkaZUsUJ8ATHgojLRL1zav5ibvAmB2wk?filename=multi-angry.zip

thelamer commented 2 years ago

Bumped down to 320x240 for the resolution, this is really close to being playable.

https://user-images.githubusercontent.com/1852688/147858795-4fdc885d-3fa6-497b-8225-28e629f38d00.mp4

BinBashBanana commented 2 years ago

Awesome! Sorry I haven't been keeping up with this issue. I'll try implementing ToadKing's emscripten libco fix, just to see if that works better than PTHREAD. (it probably doesn't)

thelamer commented 2 years ago

Threads open up a bunch of options here. Check out DS emulation:

https://user-images.githubusercontent.com/1852688/147860005-425405aa-3c2b-422a-aec6-1125705e6aea.mp4

https://ipfs.infura.io/ipfs/QmPCcMAR3bSyWzGBgZCJKHo78eXpMh9PFqwteC1vrUBB36?filename=melon.zip

I would bet this makes stuff like Saturn playable as well, though I can't figure out the proper symbol declaration in yabause/src/thr-rthreads.c for slock and sthread stuff when building with HAVE_PTHREADS

thelamer commented 2 years ago

As suspected, Saturn runs really well with threading (full speed on an Xbox):

https://user-images.githubusercontent.com/1852688/148463500-e2321890-1b7a-4640-988d-c3628d458f77.mp4

BinBashBanana commented 2 years ago

Hi @thelamer, sorry for the long period of inactivity, I have been working hard on v6.5 and it is finally released! It addresses several things in this issue:

  1. All of the WebGL draw errors were caused by emscripten issue 4214. Implementing one of the patches fixes the problem completely.
  2. Mupen64Plus-Next has been updated to the latest version from GitHub, and is now using GLideN64 GLES3 (WebGL 2). A different bug is still here: see #4.
  3. ParaLLEl N64 has been added with GLide64 and glN64 (both GLES2 renderers on WebGL 2, as using WebGL 2 seems to fix a minor color depth problem.) Libco/fiber has been added back as well.
  4. melonDS and Yabause have both been added without threads since github.io cannot host with special headers.
  5. Beetle PSX HW has been added, I worked on GLES 3.0 support for the HW renderer (see pull #856).

Additionally, these 5 cores have been built with autovectorization which can improve performance slightly (these flags added to the compile flags: -msimd128 -ftree-vectorize)
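
Since a module compiled with `-msimd128` will not validate on an engine without WebAssembly SIMD, a page could probe for support before choosing a build. The sketch below validates a tiny hand-assembled module whose single function returns a `v128` and uses `i8x16.splat` and `i8x16.popcnt`. Both the probe and the function name are illustrative assumptions, not part of webretro.

```javascript
// Minimal WebAssembly module: one function () -> v128 whose body is
// i32.const 0; i8x16.splat; i8x16.popcnt; end.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,       // magic + version
  1, 5, 1, 96, 0, 1, 123,            // type section: () -> v128
  3, 2, 1, 0,                        // function section: one function, type 0
  10, 10, 1, 8, 0,                   // code section: one body, no locals
  65, 0, 253, 15, 253, 98, 11,       // i32.const 0, i8x16.splat, i8x16.popcnt, end
]);

// Returns true when the engine accepts SIMD instructions.
function supportsWasmSimd() {
  return typeof WebAssembly === "object" && WebAssembly.validate(SIMD_PROBE);
}
```

A loader could call `supportsWasmSimd()` once at startup and fall back to a non-SIMD build when it returns false, assuming such a build is also published.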

Threaded Angrylion is not out of the question, I'd like to get this working on the Mupen64Plus-Next core as well. Another thing - what was your solution to getting Yabause to build with threads? I couldn't figure it out.

At some point in the future I also want to port melonDS's OpenGL renderer to GLES 3.0, as well as get YabaSanshiro (an outdated Saturn core with GLES 3.0 rendering) or Kronos (a more up-to-date Saturn core that requires either GLES 3.0 or 3.1, I'm not sure which) to run on a single thread.

Threaded OpenGL rendering may also be possible at least for Mupen64Plus-Next, m4xw said that he had had threaded GLideN64 working at some point but not anymore. He also said that he found an old mailing list saying that offscreen framebuffers only worked on emscripten's SDL backend (not on EGL) but it could be outdated.

Speaking of this, it seems I'm restricted to emscripten 2.0.15 as any higher than that prevents any core from loading. I haven't actually tested this recently because it is a pain to test, and I may have already fixed it.

I have also transitioned to a much better system for core patches, see Building from source. I tried my best to be as thorough as possible when making this, so let me know if you run into any problems.

Here's some screenshots of Mupen64Plus-Next:

![Ocarina of Time](https://cdn.discordapp.com/attachments/832671361184825414/981718447119560834/ocarinaoftime.png)
![Paper Mario](https://cdn.discordapp.com/attachments/832671361184825414/981718447455084604/papermario.png)
![Super Mario 64](https://cdn.discordapp.com/attachments/832671361184825414/981718447761264640/supermario64.png)
![Mario Kart 64](https://cdn.discordapp.com/attachments/832671361184825414/981718448109404170/mariokart64.png)

thelamer commented 2 years ago

Holy cow @BinBashBanana, this is amazing, you are a genius. Correct me if I'm wrong here, but this should add capabilities to anything that doesn't leverage SIMD, right? Shouldn't stuff like DuckStation and PS2 be plausible now?

BinBashBanana commented 2 years ago

Probably. I don't expect particularly good performance at all, but I can try also using SIMD in cores that use SIMD instructions (emscripten support). I tried compiling GLideN64's NEON stuff in Mupen64Plus-Next, but there were some unimplemented symbols so I gave up. Citra, Dolphin, PCSX2, PPSSPP, and Flycast are the cores that I haven't built yet for this reason. (A few of them also use CMake which could also be annoying.)