libretro / beetle-pce-fast-libretro

Standalone fork of Mednafen PCE Fast to libretro
GNU General Public License v2.0

Mednafen accurate pce module #19

Closed OmegaXII closed 3 years ago

OmegaXII commented 9 years ago

It would be nice if the "accurate" Mednafen pce module was ported to libretro so it could be an option for faster hardware alongside the existing pce-fast module

eadmaster commented 9 years ago

I definitely second the request! Note that Mednafen defaults to the "accurate" core and the "fast" version is considered experimental.

ghost commented 8 years ago

I third the request times a thousand!

I would rather have seen the accurate core/module ported than the fast one, and I'm all in for accuracy no matter the cost.

It would easily have been my PC-Engine emulator of choice like the Mednafen PSX libretro core is for PlayStation.

Oh well, maybe one day...

SedrynTyros commented 7 years ago

I think this is long overdue

DukeSkinny commented 7 years ago

Indeed this would be the optimal scenario, unless a technical explanation is given as to why it would be unfeasible.

sljunkie commented 6 years ago

So far mednafen's Accurate PCE core is the best PCE emulator out there so yes I fifth / sixth the request.

ToniBC commented 6 years ago

Several people have asked for the accurate core. Some replies claimed that the fast core is the same as the accurate one, but it is not: the fast core has several failures.

Other answers in the forums were that duplicate cores are not added, yet there are the bsnes and higan cores with different profiles.

The fast core has problems in some games. A clear example is the intro of the CD game Popful Mail: in the fast core it shows corrupt graphics, while in the accurate core it works well.

Having one more option is always welcome for improving the emulation.

ghost commented 5 years ago

Without libretro being able to handle *multi-width modes within each frame, having an accurate core wouldn't make a significant difference...

*(forgot the exact term, but the PCE is able to render, for example, the main game screen at a different width than the status bar)

andres-asm commented 5 years ago

I figure the end result is a standard width though, right?

ghost commented 5 years ago

I thought it was a pain to get it ported. Built on top of Beetle PCE Fast as it was easier than doing the brute force approach from scratch. Didn't see anything useful added up to 1.22.1 other than unwanted new api overhead.

Mednafen 0.9.48 (accurate) core https://github.com/trinemark/beetle-pce-fast-libretro/tree/pce

It lacks some (debugger code) removal cleanups, uses owl resampler (adpcm, psg), misses some options like multiple par, hq adpcm, disable multitap, dshadoff cdrom fixes (https://github.com/dshadoff/mednafen-debug), correct 565 color expansion, framebuffer reduction, and other misc stuff I wanted to do.

Tested on several games:

No plans to work on this for a while. If Team Libretro takes over core ownership and future development, that'd be even better.

hizzlekizzle commented 5 years ago

@trinemark well that's awesome. Thanks for this!

Seems there's a bounty for that. Are you interested in collecting on it?

Tatsuya79 commented 5 years ago

I fail to compile it with MSYS2 gcc7.3 on win7 x64:

In file included from mednafen/pce/huc.cpp:28:0:
mednafen/pce/mcgenjin.h:169:7: error: 'unique_ptr' in namespace 'std' does not name a template type
  std::unique_ptr<MCGenjin_CS_Device> cs[2];
       ^~~~~~~~~~
mednafen/pce/mcgenjin.h: In member function 'uint8 MCGenjin::ReadTP(int32, uint32)':
mednafen/pce/mcgenjin.h:89:18: error: 'cs' was not declared in this scope
    case 2: ret = cs[0]->Read(timestamp, A & 0x3FFFF); break;
                  ^~
mednafen/pce/mcgenjin.h: In member function 'void MCGenjin::WriteTP(int32, uint32, uint8)':
mednafen/pce/mcgenjin.h:136:19: error: 'cs' was not declared in this scope
    case 2: return cs[0]->Write(timestamp, A & 0x3FFFF, V);
                   ^~
mednafen/pce/mcgenjin.h:136:57: error: return-statement with a value, in function returning 'void' [-fpermissive]
    case 2: return cs[0]->Write(timestamp, A & 0x3FFFF, V);
                                                         ^
mednafen/pce/mcgenjin.h:137:57: error: return-statement with a value, in function returning 'void' [-fpermissive]
    case 3: return cs[1]->Write(timestamp, A & 0x3FFFF, V);
                                                         ^
make: *** [Makefile:600: mednafen/pce/huc.o] Error 1
make: *** Waiting for unfinished jobs....
In file included from ./mednafen/hw_video/huc6270/vdc.h:4:0,
                 from mednafen/pce/vce.h:22,
                 from libretro.cpp:17:
./mednafen/lepacker.h: In member function 'void MDFN::LEPacker::operator^(T&)':
./mednafen/lepacker.h:51:16: error: 'out_of_range' is not a member of 'std'
     throw(std::out_of_range("LEPacker::operator^"));
                ^~~~~~~~~~~~
make: *** [Makefile:600: libretro.o] Error 1

Is this the SNES timing?

ghost commented 5 years ago

@Tatsuya79 I was fishing around on how to compute the exact system fps using snes timing notes but forgot to replace the numbers. Don't have the exact number yet so I'll revert that back to what it was ~59.98.

unique_ptr = C++11. I was aiming to remove all C++11 and beyond stuff, including that throw. MSVC 2017 auto-compiles with C++14 or C++17 I guess.
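
For reference, `std::unique_ptr` was introduced in C++11 (`std::make_unique` followed in C++14). GCC usually reports errors like the two in the log above when the compiler runs in a pre-C++11 mode or when `<memory>` and `<stdexcept>` are not included. Which of the two applied to this build is not stated in the thread. A minimal sketch of the same pattern that builds as plain C++11; `Device` and `Cart` are hypothetical stand-ins, not the Mednafen classes:

```cpp
#include <memory>     // std::unique_ptr
#include <stdexcept>  // std::out_of_range

// Hypothetical stand-in for a chip-select device.
struct Device
{
    int read(int addr) const { return addr & 0xFF; }
};

// Hypothetical stand-in for a cartridge holding two optional devices.
struct Cart
{
    std::unique_ptr<Device> cs[2];

    int read(unsigned slot, int addr) const
    {
        // Reject slots that do not exist or are not populated.
        if (slot >= 2 || !cs[slot])
            throw std::out_of_range("Cart::read");
        return cs[slot]->read(addr);
    }
};
```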

Thanks. I'll get up and start working on this now. Maybe get some more stuff done first before hanging it up again.

@hizzlekizzle If the bounty owners wish to donate directly to libretro or any emu author, that'd be preferred. I myself won't be collecting on it.

ghost commented 5 years ago

@Tatsuya79 Try new commit. Should build on Linux too. https://github.com/trinemark/beetle-pce-fast-libretro/tree/pce

Will add more commits as I get to them.

Tatsuya79 commented 5 years ago

Thanks it compiles fine now! :+1: :smiley:

Tatsuya79 commented 5 years ago

It works quite decently already. Some issues I noticed:
- PSG volume not working
- horizontal overscan not working (games like Ninja Spirit stay at 341 pixels width)
- overclock not working (sprite limit too?)
- some garbage in the overscan on the bottom (especially noticeable in vertical shmups, when setting max 242px, or redraws in Sylphia)
- Order of the Griffon is setting the picture to 341 width, so the top picture is squished (noticeable on fonts during dialogues)
- sound clicks at the end of a sample (like the upgraded sword hit effect in Ninja Spirit)

ghost commented 5 years ago

Do you have a small list where to check overclock, sprite limit, psg volume, etc? Testing multitap disable and 12-bit adpcm. Have to check Mednafen options and compare to Beetle core.

Tatsuya79 commented 5 years ago

PSG volume (only for cd games), I go into Sylphia sound test. But any CD game basic sound effects are PSG. You can test with pce_fast/sgx 1st to make sure.

For overclock I usually test with ninja spirit. Stage 1 end boss when ninjas jump from both sides and when the boss comes out of the ground it slows down a lot when you've got 2 "shadow ninjas" with you and upgraded sword.

For Sprite limit I've just tried with Darius Alpha and I can see the 1st boss stops flickering after a reboot. So it's working fine.

For the overscan range we changed some stuff in pce_fast to display the full range and have PAR by default, centred picture and width settings for games in 341 mode (ninja spirit, aoi blink, r-type (japan only)...). But those changes can break some things (like positioning of small pics on screen in Final Blaster intro, that's why we added a hack for that).

ghost commented 5 years ago

Thanks for explaining. H-Overscan is now hooked up.

Although now I understand. 341 is the supposed traditional / nominal width (per Mednafen docs). But you can adjust up (352) and down (300) the overscan clipping range. I might revisit this again later for more alterations. I can imagine some people wanting 1:1 PAR option, whereas core currently stretches to locked 341 width.

Tatsuya79 commented 5 years ago

H-overscan is just cutting in the right side, left is always the same.

And it should probably be split in left and right overscan now as the picture will be uncentred. (if we keep it that way to not break the "horizontal timing registers positioning", yeah I'm borrowing vocabulary from here :sweat_smile: )

hizzlekizzle commented 5 years ago

Re: bounty, I believe we can transfer it over to another existing bounty without too much trouble.

ghost commented 5 years ago

@Tatsuya79 I think overclocking (cpu, cd) are now fixed. Sprite limit can be done in real-time. 5-player Bomberman works. Tested CD PSG slider and volume goes up / down.

Ninja Spirit sound clicks. Can you make a Retroarch audio recording? I can't hear this. Sounds (about the) same to me with Beetle PCE Fast so I'm missing something.

https://pineight.com/mw/index.php?title=Dot_clock_rates 8:7 (256), 6:7 (352), 4:7 (512).

So this says 8/6 = 4/3 pixel stretch, which is where the 341 - 341.3333333 - 342 cropping comes from. But it can overscan to 352 while keeping the same PAR, and our DAR would get thrown off. Plus there are TVs that can show / hide the extra overscan pixels, so a left - right crop is needed. Needs more thought; CPU overclocking required fractional cycle counting for PCE accurate to avoid event de-syncing.
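
The stretch arithmetic can be checked with a few lines. This is only a sketch using the pixel aspect ratios quoted above (8:7, 6:7, 4:7); `Ratio` and `equivalent_width` are hypothetical names, not code from the core:

```cpp
// Pixel aspect ratio kept as an exact fraction.
struct Ratio
{
    int num;
    int den;
};

// Width that `width` pixels of one mode occupy when measured in pixels
// of another mode: width * (par_from / par_to).
static double equivalent_width(int width, Ratio par_from, Ratio par_to)
{
    return static_cast<double>(width * par_from.num * par_to.den) /
           static_cast<double>(par_from.den * par_to.num);
}
```

With these inputs, `equivalent_width(256, {8, 7}, {6, 7})` and `equivalent_width(512, {4, 7}, {6, 7})` both come out at 1024/3, about 341.33, which is why neither 341 nor 342 is an exact fit.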

Haven't checked 240+ overscan area yet.

edit: Please retry the Order of the Griffon dialog 341 text again.

Tatsuya79 commented 5 years ago

Overclocking is working now, nice!

Order of the Griffon is still wrong. What happens is that the top of the picture, showing the main game in pseudo 3D, is 256px wide, while the bottom with the character portraits and stats uses the 341 pixels mode. It's changing the resolution mid-frame, and I think the way mednafen handles it is by rendering at a higher horizontal res? (PR adding a hack in pce_fast with a link to another method which I'm really not sure can work here?) edit: I just tried it and it just shows a part of the screen zoomed in. It's bad.

About the sound clicks, it's in every game more or less, but I found a good example using Cyber Core sound test with SE 09:

cyber core samples.zip

ghost commented 5 years ago

Mednafen 1.22.1. Order of the Griffon looks roughly the same as Beetle PCE atm. 256 view stretching. Or is there a hidden option? Using vanilla defaults.

I do see what Beetle PCE Fast does with the pillarbox. It probably can be done since it should tie into 352 cropping.

But does anyone have a screenshot of real hardware + CRT TV?

edit: No pillarbox like PCE Fast. https://www.mobygames.com/game/turbo-grafx/order-of-the-griffon/cover-art/gameCoverId,96568/

edit2: Clicky: I'm betting it's Mednafen's Owl resampler. It has that sharp, blocky, crisp edge-y sound. Old core uses Blip and has gentler falloff and rises.

Tatsuya79 commented 5 years ago

You can see some capture videos on youtube here or here. It's like mednafen stand-alone for the aspect ratio of each part of the screen, the difference is that if you render it in 342 pixels width the 256 width part on top will have noticeable uneven pixels. A crt doesn't work the same and won't do that.

But here with RA we have to decide on a way to handle that and specify a width. If it's 2048 or something, it will be close to an integer multiple for both parts, so that it's not too visible. Then it's about what's acceptable or not. I'm not sure what mednafen chooses, 1365 pixels I saw somewhere?

Some other games do that mid-frame res change like Asuka 120% in its cutscenes.

ghost commented 5 years ago

Uneven hard pixels. Large width for nice integer scaling. Okay, I'm with you now.

Mednafen has a 1365 width buffer. There's PCE TA_AwesomeMode which expands everything to 1024 using real-time dot clock drawing, but it creates glitchy scanlines towards the bottom.

I will create a fork for this and give it some more thought.

ghost commented 5 years ago

1365 dot clocks per scanline. But that's a messy number to work with.

Mednafen (internally) draws a nominal 1024 due to scaling math:

256 512 768 1024 1280 1536 1792
341   682   1023   1364  1705
342   684   1026   1368  1710

I'd agree that 342 is the nominal width. Games seem to draw 342 pixels over 341. Although this may matter less once adjustable cropping kicks in.
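
The table of multiples above can be reproduced with a small helper. This is a sketch to illustrate why 1024 is a convenient common width; `nearest_multiple` is a hypothetical name and is not taken from Mednafen:

```cpp
// Multiple of `base` that lies closest to `target`
// (an exact tie resolves to the lower multiple).
static int nearest_multiple(int base, int target)
{
    int lower = (target / base) * base;
    int upper = lower + base;
    return (target - lower <= upper - target) ? lower : upper;
}
```

For a target of 1024 this gives 1024 for 256 and 512, 1023 for 341 and 1026 for 342, so every mode lands within two pixels of a whole-number scale factor.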

Might also try to include a core option to render like PCE Fast with pillarbox.

I'll work on this first and post a test branch.

Tatsuya79 commented 5 years ago

I found a good example of the bad refresh in the vertical overscan, it's easy to see with PC Genjin if you display 242 lines on the bottom of the screen. I don't think mednafen goes as far upstream and we changed that to get the full signal (as it was displayed on Turbo GT and because many games have valid signal there so why not).

Some games had their picture not centred and cut too before we changed it like Super Darius (CD game). Super Darius (J)-190417-174649

Tatsuya79 commented 5 years ago

Ah I found what's causing the audio clicks. There's a setting called pce.psgrevision that is probably forcing the SGX audio chip and causing clicking as explained in this page.

I tried to force "return 1" on this line and the clicking is gone. I'm not sure what I'm sending back with that though...

That's a setting that wasn't in pce_fast. There's pce.resamp_quality too and perhaps some more.

ghost commented 5 years ago

Thanks for finding that. Although now I'm wondering a bit before I get a chance to test.

    enum
    {
        REVISION_HUC6280 = 0,
        REVISION_HUC6280A,
        _REVISION_COUNT
    };

By default, it's _REVISION_COUNT which auto-selects.

psgrevision = IsSGX ? PCE_PSG::REVISION_HUC6280A : PCE_PSG::REVISION_HUC6280;

A value of 1 would always force 6280A (or at least it's supposed to do that). I'll debug check this to be sure.
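
The auto-select behaviour described above could be sketched as follows. The names `PsgRevision`, `Auto` and `resolve_psg_revision` are hypothetical; `Auto` plays the role that `_REVISION_COUNT` has in the quoted enum:

```cpp
// Hypothetical mirror of the quoted enum, with an explicit "auto" value.
enum class PsgRevision
{
    HuC6280,
    HuC6280A,
    Auto
};

// An explicit setting wins; "auto" picks the chip from the system type,
// as in the quoted line: IsSGX ? REVISION_HUC6280A : REVISION_HUC6280.
static PsgRevision resolve_psg_revision(PsgRevision setting, bool is_sgx)
{
    if (setting != PsgRevision::Auto)
        return setting;
    return is_sgx ? PsgRevision::HuC6280A : PsgRevision::HuC6280;
}
```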

Reminds me of the two MT-32 revisions; some old Sierra games relied on Rev 0 for its sfx.

edit:

Will expose this in the options. I guess it'll start at "auto" for default hardware accuracy, and let user override if they want the chip with audio fix.

edit2: I'll likely expose the Owl resampler quality also. And maybe the Arcade Card switch if someone wants to play the non-enhanced version.

edit3: Feel stupid not checking the SuperGrafx core. libretro.cpp has some useful upgrades over PCE Fast core to backport.

Tatsuya79 commented 5 years ago

Well, something particular vs pce_fast once again is the pce.h_overscan setting that will enlarge the width following those values I assume. When enabled I can finally see the Ninja spirit logo completely and Super Darius won't cut the right of the picture anymore (but the image isn't centred). (while you can show the full range on original machines, I checked that again for Super Darius: https://www.youtube.com/watch?v=0CzO062iIt4 https://www.youtube.com/watch?v=wrEk3SybZGw and for Ninja Spirit: https://youtu.be/hl0IG8QAnrc?t=261 https://www.youtube.com/watch?v=hNWyydIglHA )

It messes up the image ratio though and the picture can be off-centred for the 256 width mode too (as in super darius). So I'm not sure we want to break mednafen rendering too much (as in changing what is defined for different width modes in vce.cpp, that's the "accuracy" version of the core here) or add many "crop left, crop right" options for each resolution...

So I wonder, the easiest solution could be:
- exposing pce.h_overscan as an option and try to adapt the screen ratio when enabled
- have another option called "Auto centre the picture (hack)" that would do something similar to pce_fast and force 256/352/512 with centred pic
- have the same horizontal cropping option as pce_fast that cut both sides then

mmm writing that down I'm not even sure that's the best way and can be easily done lol. But I post it anyway for discussion.

edit: perhaps just pce.h_overscan and independent crop left and right that would work for every mode. It will be a manual work with game overrides for people that want the full range centred then, but that's not for so many games.

Tatsuya79 commented 5 years ago

About the bottom lines refresh problem when showing 242, it can be fixed by replacing every mention of 240 with 243 in vce.cpp (there are 3 spots). Yeah I know, after advocating for not changing values too much I suggest that. :sweat_smile: But I don't know why mednafen is limiting to 239 while 242 are valid in most games and displayed on the PC Engine GT.

Not sure about that comment here 263-14=249. (I blindly tried that but that crashed)

ghost commented 5 years ago

I'll just make a "Vertical overscan scanlines" = 240, 243 option. I'm also going to eventually put back the 341 numbers and work with expanded overscan math, which is going to be messy math for me.

Thanks for all the help btw. Making this unfamiliar stuff easier to handle.

edit: You'd have to bring it up with Mednafen author about 240 vs 243+. But another good find, easily overlooked if you don't know what's going on.


max_T<uint32>(240, MDFN_GetSettingUI("pce.slend") + 1)
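
The expression above takes the larger of 240 and the last-scanline setting plus one. A minimal sketch of the same clamp with the floor turned into a parameter; `visible_lines` is a hypothetical name, `slend` stands in for the value of the `pce.slend` setting, and whether 243 is the right floor is what is being discussed here:

```cpp
#include <algorithm>
#include <cstdint>

// Number of lines to cover: at least `floor_lines`, and enough to
// include scanline index `slend` (hence the + 1).
static uint32_t visible_lines(uint32_t slend, uint32_t floor_lines)
{
    return std::max<uint32_t>(floor_lines, slend + 1);
}
```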

ghost commented 5 years ago

https://forum.fobby.net/index.php?t=msg&goto=675

Once the screen width has been set, then the VDC's registers like HDR, HSR, etc. are set up to center the display. For example some games use the 352-pixel mode and center a 320-pixel screen in the middle.

So by default we're supposed to auto-center the picture somehow, and take into account any h-overscan. That could be brutal.

I'd vote for an "auto" type of horizontal overscan detection. Although this is getting stickier than I wanted.

edit: perhaps just pce.h_overscan and independent crop left and right that would work for every mode.

This'd be good too as a fallback, if the math magic starts to fail us. But this is all cloud thinking until someone pushes something through.

I think Asuka 120% uses 256 - 512 (no 341) multires, so I'll account for this also.

Tatsuya79 commented 5 years ago

I already brought the overscan shortcomings to mednafen author some years ago but it's probably not a priority for her. I can't blame her as she made a great Saturn emulator meanwhile, these kind of things happen.

Everything is rendered at a really large width right now, it's just temporary right?

ghost commented 5 years ago

The full accurate mode (hires dot clock renderer) has to be 1024 + 96 overscan. It has some nice simplicity to it since it can handle the extra crazy pixels: tomaitheous says lores = 274, midres = 368 (Tenchi o Kurau), hires = 548. Even 256px in midres mode for vertical shmups. And cleans up any multires overscan scaling problems. --- edit: I assume someone would want to keep this, so maybe have to option it also ---

The overscan stuff needs a working framework. I am writing an idea that might work much of the time for horizontal auto-cropping and centering mode. Then if we can get away with it, I'd like to reduce the buffer again. Even if it means de-scaling the data back down when we're not in midres multi-res mode (maybe just 1 game). --- edit: Damn overscan pixels. At least they work in blocks of 8 pixels. Let user crop past that if needed. ---

But well. I'd never expect the PC Engine to be so flexible for a 6502-era system. Beats the Atari 2600 multi-fps switching modes. If I worked even half a decade on this system, I'd get tired and drift elsewhere.

ghost commented 5 years ago

Mednafen uses some slightly non-standard overscan and resolution widths. Working out the math slowly. Super Darius is 256 cropped and centered. Final Blaster looks almost okay - need an adjustment somewhere. 352 mode is not so nice yet.

Also fixed par math. (float) width / (float) height. Else compiler goes strict with (int) width / (int) height * (float) par = off by enough pixels.
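
A small sketch of that pitfall, with hypothetical function names: when both operands are `int`, the division truncates before the PAR is applied.

```cpp
// Integer division happens first: 256 / 240 evaluates to 1.
static float aspect_int_first(int width, int height, float par)
{
    return width / height * par;
}

// Casting before dividing keeps the fractional part.
static float aspect_float(int width, int height, float par)
{
    return (float)width / (float)height * par;
}
```

For 256 x 240 with a PAR of 1.0, the first version returns exactly 1.0 while the second returns about 1.067.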

edit: Okay. Mednafen chose specific values for clean dot clock scaling. 256 and 512 have been nearly math'd out. 352 is yuck.

ghost commented 5 years ago

Auto-crop horizontal overscan is now in. This will need lots of testing. Note that some games seem to enable overdraw with either blank dummy or garbage tiles.

I can't blindly clip this out and I suppose we'll need another h-crop flag: 8px left, 8px right, 16px both. But you're the game experts and know what extra clipping we need, whether per line or such.
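
One way such a flag could look, purely as a sketch: the enum values, the `Crop` struct and `apply_hcrop` are all hypothetical and only illustrate the "8px left, 8px right, 16px both" idea.

```cpp
// Hypothetical crop flags; HCROP_BOTH is the two single flags combined.
enum HCrop
{
    HCROP_NONE  = 0,
    HCROP_LEFT  = 1,
    HCROP_RIGHT = 2,
    HCROP_BOTH  = HCROP_LEFT | HCROP_RIGHT
};

// Visible region after cropping: starting column and remaining width.
struct Crop
{
    int x;
    int width;
};

// Removes one 8-pixel tile column per requested side.
static Crop apply_hcrop(int width, unsigned flags)
{
    const int tile = 8;
    Crop c;
    c.x = (flags & HCROP_LEFT) ? tile : 0;
    c.width = width - c.x - ((flags & HCROP_RIGHT) ? tile : 0);
    return c;
}
```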

For Order of the Griffon, I'm unsure whether players want the current 320 auto-crop look or the standard 341 view. Or toggle between them somehow.

Since it works well enough so far and passes 240p test suite, I'll look into providing lower res scaling options.

Tatsuya79 commented 5 years ago

There is this issue on compilation:

libretro.cpp: In function 'void update_input()':
libretro.cpp:877:27: error: 'roundf' was not declared in this scope
    mousedata[j][0] = (int)roundf(_x * mouse_sensitivity);
                           ^~~~~~
libretro.cpp:877:27: note: suggested alternative: 'rand'
    mousedata[j][0] = (int)roundf(_x * mouse_sensitivity);
                           ^~~~~~
                           rand

Also it's missing the crop overscan core option. Ah ok it's an additional step in the h_overscan one.

ghost commented 5 years ago

#include <math.h>
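
`roundf` is declared in `<math.h>`, so adding the include is the usual fix for the error above. A minimal sketch of the rounding step; `scale_mouse_delta` is a hypothetical wrapper around the expression from the log:

```cpp
#include <math.h>  // roundf

// Scale a raw mouse delta and round to the nearest integer
// (halfway cases round away from zero).
static int scale_mouse_delta(float delta, float sensitivity)
{
    return (int)roundf(delta * sensitivity);
}
```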

Also realized that Addams Family and Asuka 120% might be rendering past the active draw area. If so, I can auto-crop that 8px garbage out.

Tatsuya79 commented 5 years ago

352 wide games have a slight rounding problem that removes 1 pixel on the left side, 1 black on the right side. Saigo no Nindou - Ninja Spirit (Japan)-190421-113820 R-Type Part-2 (Japan)-190421-181043

Addams family too is cut slightly on the left and shows a bit of garbage on the right: The Addams Family (U)-190421-184008

Asuka 120 looks fine except on this screen when starting the story mode (this problem also happens in pce fast, we didn't bother with it): Asuka 120% Maxima Burning Fest (J)-190421-181513 Asuka 120% Maxima Burning Fest (J)-190421-181512

Tv Sport Basketball 512 wide picture is cropped too much: TV Sports Basketball (Japan)-190421-183456 TV Sports Basketball (Japan)-190421-183510

Same for Flappy Bird: (that's a special case) flappy_bird_SGX-190421-183602 flappy_bird_SGX-190421-183614

Super Darius is fine. Order of the Griffon too, apart from that slight rounding issue again in that menu: Order of the Griffon (USA)-190421-185306

I've compared 1056 vs 352 width mode on an older core, 352 is 20% faster.

ghost commented 5 years ago

512 off-centered was due to overdraw crop detection. That should be fixed. Auto-res scaler is now in, although Asuka multi-res dialog is still (256, 512 => 1024) hires atm.

Dot clock renderer must be doing that rounding. I'll +1 it somewhere.

Thanks. Please check git again.

edit: Flappy Bird will need some work. Wonder what that's doing.

Tatsuya79 commented 5 years ago

Tv Sport and Flappy bird are fine now. :+1:

I tried low res mode but that's what I feared, 400fps vs 500fps before when we didn't have the big framebuffer + downscaling operations. :confused:

ghost commented 5 years ago

Asuka 120% dialogue box looks okay on my end now.

Mednafen core wants us to pre-select renderer at compile time. I'll try adding a run-time switch and doing the multi-res myself.

ghost commented 5 years ago

Lores scaler is now in. Multi-res (auto, lores) is broken.

edit: auto multi-res

lores multi-res

That's the current schedule. After that, I hope we're nearly over with the technical hardware stuff.

Tatsuya79 commented 5 years ago

Sorry to tell you, but it's slower now, around 350fps on my same test on Ninja Spirit (auto/auto and lores/h overscan off giving the same speed). :confused: Too much overhead with conditions in intensive spots?

For lores 256+352 it's mainly Order of the Griffon, where some character stats will be cut on each side if you crop to 256. Better to use 352.

ghost commented 5 years ago

It should be better now but maybe some bit slower still. Might look for some tuneups.

For Order of the Griffon (lores mode), would everyone prefer the 256 -> 320 dungeon view or the centered 256 view? The game only draws 320 pixels for the status bar, which is what we auto-crop to.

Tatsuya79 commented 5 years ago

Centred 256 I'd say. Uneven pixels are too noticeable if stretched to such a low resolution.

No difference speed-wise, still 350fps at max speed. I tried to remove a lot of conditions in vce.cpp but couldn't get it any faster. I wonder what it is...

Tatsuya79 commented 5 years ago

I went back to make sure at which point we lost 20% perf and it's in that commit. (with that fix too)

ghost commented 5 years ago

Test1 is: https://github.com/trinemark/beetle-pce-fast-libretro/blob/pce/mednafen/pce/vce.cpp#L608 https://github.com/trinemark/beetle-pce-fast-libretro/blob/pce/mednafen/pce/vce.cpp#L610

SyncSub<true, false>(clocks)
SyncSub<false, false>(clocks)

Test 2 is: https://github.com/trinemark/beetle-pce-fast-libretro/blob/pce/libretro.cpp#L35 https://github.com/trinemark/beetle-pce-fast-libretro/blob/pce/libretro.cpp#L36

#define MEDNAFEN_CORE_GEOMETRY_MAX_W 1024 + 96
#define MEDNAFEN_CORE_GEOMETRY_MAX_H 243

I don't see any changes with either. You might want to turn on the debug logger and see how the resolution changes are being reported. They should be nice and small at non-hires.

edit: Also try https://github.com/trinemark/beetle-pce-fast-libretro/blob/pce/mednafen/pce/vce.cpp#L355

void __attribute__((always_inline)) VCE::SyncSub(int32 clocks)

to see if gcc is not taking the hint seriously.
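
For context, the `SyncSub<true, false>` style of call turns flags into compile-time constants so the inner loop carries no runtime branch on them. A minimal sketch of that technique under assumed names; `sync_sub`, `sync`, `HiRes` and `SkipDraw` are hypothetical and do not reflect what the real template parameters mean:

```cpp
#include <cstdint>

// The bool parameters are compile-time constants, so each instantiation
// is compiled without a runtime test on them inside the loop.
template <bool HiRes, bool SkipDraw>
static inline int32_t sync_sub(int32_t clocks)
{
    int32_t pixels = 0;
    for (int32_t i = 0; i < clocks; ++i)
    {
        if (!SkipDraw)
            pixels += HiRes ? 4 : 1;
    }
    return pixels;
}

// Runtime flags are tested once per call, outside the hot loop.
static int32_t sync(int32_t clocks, bool hires, bool skip_draw)
{
    if (skip_draw)
        return hires ? sync_sub<true, true>(clocks)
                     : sync_sub<false, true>(clocks);
    return hires ? sync_sub<true, false>(clocks)
                 : sync_sub<false, false>(clocks);
}
```

Whether this dispatch or the larger framebuffer is behind the 500 to 350 fps drop is not established in the thread; the sketch only shows the pattern.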

Tatsuya79 commented 5 years ago

Still around 350 fps for all 3 tests.

I tried removing, replacing some parts but I can't go back to 500fps like it was prior to https://github.com/trinemark/beetle-pce-fast-libretro/commit/75f28514b99b2ae25b4bbaa7ff75093423fb1868 .

If you've got the current code compiled with msvc please share and I can see how it behaves.