dirkwhoffmann / vAmiga

vAmiga is a user-friendly Amiga 500, 1000, 2000 emulator for macOS
https://dirkwhoffmann.github.io/vAmiga
Other
306 stars 25 forks source link

Oxyron - Planet Rocklobster Landscape looks incorrect #384

Closed emoon closed 4 years ago

emoon commented 4 years ago

http://www.oxyron.de/storage/planet-rocklobster.zip

vAmiga 0.9.9

000

Expected result

Screenshot 2020-07-21 at 16 25 07

Details on how the code works


Voxel:

The part you all love so much.
Basically its just what it looks like. A voxel in HAM6.
The screenmode is pretty much what everyone was doing back in the 90´s on AGA-machines. Only with 4 instead of 6 colorbits per channel.
The algorithm is based on a raytable with 64 non-linear depths.
It renders front to back setting visible border-pixels eor-encoded relative to the last set pixel.
So a top-down blitter eor-fill-pass generates the final image.
After that 2 passes of C2P merge are done also with the blitter to distribute the pixel data into the 4 color bitplanes.
The texture is stored in 128 colors and gets extended to 4096 colors due to fogging via a shadetable.
The most tricky part is, that displaying HAM6 usually nearly stops your CPU & blitter.
So I use 2 tricks to reduce the DMA-stress.
First the HAM6 runs in an illegal screenmode. Turning on 7 bitplanes on OCS results in displaying 6 bitplanes, which is fine for HAM6.
But only 4 of them are fetched by the DMA. The other 2 planes (The HAM logic planes) can be set as constants via the bitplane DMA registers.
This is also fine for this "hi-color" chunky-mode. Because the logic planes are completely static here (RGBB RGBB).
The other trick to reduce DMA-stress is to turn off the screen every second rasterline.
Both together give a speedup of about 40%.
Last thing to mention is the hiding of the HAM6 color-bleed with sprites.

Code of the demo can be found here https://github.com/AxisOxy/Planet-Rocklobster/

dirkwhoffmann commented 4 years ago

The other 2 planes (The HAM logic planes) can be set as constants via the bitplane DMA registers.

This is the yet missing part, I think.

mithrendal commented 4 years ago

The most tricky part is, that displaying HAM6 usually nearly stops your CPU & blitter.

😂... the prove : that Amiga Graphics were ahead of its time back then ....

emoon commented 4 years ago

I actually use this trick in Eon also (for the rubbercube) but I don't use HAM6 and only set constant values to one of the bitplanes (4 if counting from 0)

emoon commented 4 years ago

Yeah 6 bitplanes really eats lots of DMA time. For example if you use dual playfield and 6 bitplanes a lot of time is spent on just fetching data for the bitplanes. Here is how it looks in a DMA view (dark blue is bitplane DMA) http://tbl.nu/assets/space_intro_dma.png I had to shrink the screen a bit extra to still be able to run the effect in full framerate.

mithrendal commented 4 years ago

Oh yes I see you have to do nearly all computation outside the 456 box if possible then ...

emoon commented 4 years ago

Pretty much. There is some spare time, but not a lot.

dirkwhoffmann commented 4 years ago

The other 2 planes (The HAM logic planes) can be set as constants via the bitplane DMA registers.

This is the yet missing part, I think.

Hmm, not so easy. I've written a simple test case which looks pretty much OK. vAmiga does take care of the values in BPL5DAT and BPL6DAT. It must be something else.

Bildschirmfoto 2020-07-21 um 17 16 52 Bildschirmfoto 2020-07-21 um 17 18 12
emoon commented 4 years ago

Maybe it's some HAM6 related thing that? HAM isn't used that often

dirkwhoffmann commented 4 years ago

Maybe it's some HAM6 related thing that? HAM isn't used that often

Yes, I didn't enable HAM mode in my test case. 🙈

dirkwhoffmann commented 4 years ago

The most tricky part is, that displaying HAM6 usually nearly stops your CPU & blitter.

Indeed, the Bitter runs like crazy here: 😳

Bildschirmfoto 2020-07-21 um 17 51 50

Here is a part of the Copper list in the faulty region. Every second line is black. The others have 7 bitplanes enabled (invalid value) and the HAM bit set.

Bildschirmfoto 2020-07-21 um 17 53 33
emoon commented 4 years ago

That copperlist makes sense for what it's doing. So I guess it's something with the HAM "combine" that doesn't work correctly? As it's only red being shown I would guess it's something like that.

dirkwhoffmann commented 4 years ago

Now, I have a real HAM mode test case (Denise/BPLCON0/ham1), but everything looks all right:

ham1_uae Bildschirmfoto 2020-07-21 um 21 23 36

The error is likely to be somewhere else, maybe caused by the code filling BPL5DAT and BPL6DAT.

emoon commented 4 years ago

Interesting. So BPL(5/6)DAT isn't set by the copper?

emoon commented 4 years ago

Is the BPL(7/8)PTH write here causing any issues?

Screenshot 2020-07-21 at 21 49 18
dirkwhoffmann commented 4 years ago

Copper list is really simple. In most lines, it just toggles between 0 bitplanes and 7 bitplanes HAM mode:

Bildschirmfoto 2020-07-21 um 21 51 27

Is the BPL(7/8)PTH write here causing any issues?

No, vAmiga ignores the writes (as all OCS / ECS Amigas do).

    case 0x118 >> 1: // BPL5DAT
        denise.pokeBPLxDAT<4>(value); return;
    case 0x11A >> 1: // BPL6DAT
        denise.pokeBPLxDAT<5>(value); return;
    case 0x11C >> 1: // Unused
    case 0x11E >> 1: // Unused
        break;
    case 0x120 >> 1: // SPR0PTH
        agnus.pokeSPRxPTH<0>(value); return;
mithrendal commented 4 years ago

I can see the graphics is there ... only in a reddish color tone ... is the issue maybe only in vAmigas HAM color computation which got tricked here ?

emoon commented 4 years ago

Might be a blitter issue also as the blitter is used to convert the graphics to planar mode.

dirkwhoffmann commented 4 years ago

is the issue maybe only in vAmigas HAM color computation which got tricked here ?

No, this is due to HAM mode. If you look at the test case picture, you see similar colored stripes (blue and green)

Might be a blitter issue also as the blitter is used to convert the graphics to planar mode.

In the next step, I am going to write a test case that runs the Blitter for all minterm combinations. I would be surprised if the Blitter still has (functional) bugs, but something fundamental must be wrong... Fortunately, this test isn't difficult. vAmiga can compute hash values for all Blitter operations and I can compare those values against a patched version of SAE where I inserted the same hash computation code.

dirkwhoffmann commented 4 years ago

I know a little bit more now. The demo writes constant values to BPL5DAT and BPL6DAT in each (non-black) row:

[3916] (  1, 20) C19DFE  0 BCBSDA 6010 17F2 [03C5C4] Denise: pokeBPL5DAT(7777)
[3916] (  1, 24) C19DFE  3 BCBSDA 6010 17F2 [03C5C8] Denise: pokeBPL6DAT(CCCC)

This results in the following repeating pattern in these bitplanes:

BPL 5: -xxx-xxx-xxx-xxx
BPL 6: xx--xx--xx--xx--

In HAM mode, this pattern means:

In vAmiga, the colorizer does exactly that (each color modification is repeated twice, because we are drawing in lores mode):

[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify red 2
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify red 2
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify green 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify green 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify blue 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify blue 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify blue 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify blue 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify red 5
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify red 5
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify green 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify green 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify blue 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify blue 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify blue 0
[3879] (224,  0) C19ED4  0 BCBSDA 6010 17E2 PixelEngine: Modify blue 0

Bottom line: The colorizer works as expected. The problem seems to be due to wrong bitmap data coming in. Whenever green or blue are modified, we see zeroes all over the place.

I've also checked the Blitter in the meantime. I've run the turbo Blitter with all possible minterm and channel combinations and compared checksums agains SAE. Everything looked good, so I think it's not a problem of the inner working of the Blitter.

I am also able to compute Copper checksums, and comparing those seems to be the next natural step. A checksum is computed for each frame and depends on the the executed program instructions. If the Copper stops early, for example, checksums would no longer match.

emoon commented 4 years ago

Not sure how helpful it is but here is the code for the part of the demo https://github.com/AxisOxy/Planet-Rocklobster/blob/master/demo/voxel/voxel.asm

dirkwhoffmann commented 4 years ago

My (not so) new debug option

static const int XFILES = 1; // Report paranormal activity

reported this:

Blitter: BLTPRI changed while Blitter is running

Needs investigation 👽

emoon commented 4 years ago

Oh, what happens when that is changed? Will blitter priority change of the already running blit?

dirkwhoffmann commented 4 years ago

Will blitter priority change of the already running blit?

I don't know yet, but my guess is "yes". I'll write a test case for that...

dirkwhoffmann commented 4 years ago

As expected, changing BLTPRI affects a running blit immediately. Unfortunately, vAmiga does it right 😕.

emoon commented 4 years ago

hum... tricky

emoon commented 4 years ago

Not sure how hard it would be but if you could dump out the bitplane data from another emulator that work (just as binary data) and then hack them into vAmiga when the part of the demo is running and see if it looks correct. If it looks correct then there is something wrong with the blits (likely) if it looks incorrect there is likely something odd with the HAM setup still

dirkwhoffmann commented 4 years ago

I post this here for reference:

Bildschirmfoto 2020-10-11 um 18 22 04

This file diff compares a debug log of FSUAE with a debug log of vAmiga. The Blitter checksums are different, because different values were read from memory (internal Blitter operations are OK). I had the impression that the difference in memory is due to race condition between the CPU and one of the previous Blitter operations (both the CPU and the Blitter sometimes access the same memory area in this Demo). However, I wasn't able to further track it down. At some point, I was giving up.

emoon commented 4 years ago

I will try to take a look at this one

emoon commented 4 years ago

I have investigated this one a bit today: What I have done:

  1. Compiled the effect used above (as the demo is open source) so I can change the code.
  2. Got effect running (runs as expected in FS-UAE (looks correct) and looks as above in vAmiga)
  3. Now in FS-UAE I dumped the bitplane data, and then removed all the rendering, blitter uses etc from the code.
  4. Instead of the render screenbuffer I now included the binary data of the the captured screen data. So the only thing the effect now does is to display the buffers (which is known to be correct as it looks correct in FS-UAE)

000

001

So it still looks incorrect with no rendering going on. So this all points to that vAmiga doesn't display this graphics mode correct

dirkwhoffmann commented 4 years ago

Excellent work!!!

Does your test case exist as an ADF or EXE? In this case, I could work with it instead of the original demo.

emoon commented 4 years ago

Thanks :) It's an exe but as vAmiga runs exes now it's pretty easy to work with https://www.dropbox.com/s/12z8pd71t2rk25c/voxel.zip?dl=1

dirkwhoffmann commented 4 years ago

https://www.dropbox.com/s/12z8pd71t2rk25c/voxel.zip?dl=1

Thanks, downloaded it 😎. I think it was a good decision to support EXE files. I'm still not done with the OFS/FFS file system class though. I'd like to add the possibility to create file system out of an existing ADF (right now, only the other direction is possible). Once this is done, it will be rather easy to share files between the Mac and the Amiga (no ADF tools will be needed any more).

emoon commented 4 years ago

So with this change

void
PixelEngine::colorizeHAM(u32 *dst, int from, int to, u16& ham)
{
    u8 *bbuf = denise.bBuffer;
    u8 *ibuf = denise.iBuffer;
    u8 *mbuf = denise.mBuffer;

    for (int i = from; i < to; i++) {
        // u8 index = ibuf[i];   <-- old code
        u8 index = bbuf[i];    // <-- new code
        assert(isRgbaIndex(index));

        // switch ((index >> 4) & 0b11) {
        switch ((bbuf[i] >> 4) & 0b11) {

            case 0b00: // Get color from register

                ham = colreg[index];
                break;
        ...

In now looks like this :)

000

emoon commented 4 years ago

This change also fixes #438

dirkwhoffmann commented 4 years ago

Awesome!!

Here is the relation between the bBuffer and the iBuffer (from Denise.h):

   /* Four important buffers are involved in the generation of pixel data:
     *
     * bBuffer: The bitplane data buffer
     *
     * While emulating the DMA cycles of a single rasterline, Denise writes
     * the fetched bitplane data into this buffer. It contains the raw
     * bitplane bits coming out the 6 serial shift registers.
     *
     * iBuffer: The color index buffer
     *
     * At the end of each rasterline, Denise translates the fetched bitplane
     * data to color register indices. In single-playfield mode, this is a
     * one-to-one-mapping. In dual-playfield mode, the bitplane data has to
     * be split into two color indices. Only one of them is kept depending on
     * the playfield priority bit.

The HAM drawing code is supposed to work on the iBuffer which means the error is likely in

void Denise::translate()

I didn't look at the details yet, but the translate function must mess up the data if the invalid HAM mode is selected. Your fix proves that the translate function simply has to bypass the data in the case.

It's a great day for vAmiga!!! 🥳

emoon commented 4 years ago

Cool :) yeah, I read the above comment in the code (thanks for this kind of code documentation, it's very helpful when coming new to the codebase) Hopefully it should be easy to do a proper fix now :)

emoon commented 4 years ago

Likely caused by this https://github.com/dirkwhoffmann/vAmiga/blob/master/Emulator/Denise/Denise.cpp#L522 as you suggested.

dirkwhoffmann commented 4 years ago

Likely caused by this https://github.com/dirkwhoffmann/vAmiga/blob/master/Emulator/Denise/Denise.cpp#L522 as you suggested.

Yes, this is likely it! Invalid values of prio2 might interfere with the HAM bit. I'm pretty sure this case isn't covered by my test cases. I'll check the test cases and add a missing test for it. After that, we'll know for sure.

mithrendal commented 4 years ago
// u8 index = ibuf[i];   <-- old code
u8 index = bbuf[i];    // <-- new code

hey @emoon so great that you did find this !!! 🤗

ohh see ... some paparazzi took a picture of you ... while you tracked down the issue ....

grafik

emoon commented 4 years ago

haha! thanks :)

dirkwhoffmann commented 4 years ago

After @emoon has successfully isolated the erroneous code, it’s time to think about the correct fix. To recall: The error is caused by this code:

void
Denise::translateSPF(int from, int to)
{
       …
    // The unusual case: prio2 is invalid
    } else {
        for (int i = from; i < to; i++) {

             u8 s = bBuffer[i];

             assert(PixelEngine::isRgbaIndex(s));
             iBuffer[i] = mBuffer[i] = (s & 16) ? 16 : s;
             zBuffer[i] = 0;
         }
    }
}

To break things down, I’ve written a few more test cases. Here is test Denise/BPLCON0/modes0e:

modes0e_A500_8A

This is how it looks in vAmiga:

modes0e_1

It’s important to note that this test neither enables HAM nor dual-playfield mode which shows that there is another bug, independent of the one we're trying to fix. To make sure that we are not completely on the wrong path with the „unusual case“, this image shows the result if the „unusual case“ is treated the same way as the standard case.

Bildschirmfoto 2020-11-14 um 19 19 28

This pretty much proves that the original code is not that bad.

Actually, the test picture can be replicated by slightly modifying the existing code:

void
Denise::translateSPF(int from, int to)
{
    ...
    // The unusual case: prio2 is invalid
    } else {
        for (int i = from; i < to; i++) {

             u8 s = bBuffer[i];

             assert(PixelEngine::isRgbaIndex(s));
             // iBuffer[i] = mBuffer[i] = (s & 16) ? 16 : s;
             iBuffer[i] = mBuffer[i] = (s & 0x10) ? (s & 0x30) : s;
             zBuffer[i] = 0;
         }
    }
}
Bildschirmfoto 2020-11-14 um 19 20 23

What’s the difference? If a bit in bitplane 5 was set, the old code elimiated all bitplane data except the one in plane 5. The new code eliminates 1 to 4, only.

The Rocklobster demo shows that this kind of bitplane elimination does not happen in HAM mode. Now, here is the thing: HAM mode is taken care of very late in the graphics pipeline which means that it is plain wrong to eliminate bitplanes in Denise::translateSPF. I guess the correct approach is to move the „unusual case“ code to a later stage, into function PixelEngine::colorize(). However, this requires a couple of additional code changes. I.e., it is necessary to record all changes of BPLCON2 and replay them in colorize()(BPLCON2 decided whether the "unusual case" is present or not). Right now, only the BPLCON0 changes are replayed.

I'll try to apply this kind of fix in next couple of days...

emoon commented 4 years ago

Yeah just for the record: What the Rocklobster demo sets here is 7 bitplane (illegal) mode and HAM. This triggers a bug on OCS so the hardware will only fetch data for the 4 first bitplanes, but it will still use the data in BPL5DAT and BPL6DAT values when calculating the pixel color. In this case the demo only sets these registers once to a pattern being RGBB .. RGBB

dirkwhoffmann commented 4 years ago

Brief update:

I kept the checking code in translate(), because colorize()works on the mbuffer. It wouldn't be correct to eliminate bitplane data there, because the mbuffer already contains sprite data. I've did some decent cleanup in translate() though:

void
Denise::translateSPF(int from, int to, PFState &state)
{
    /* Check for invalid bitplane modes.
     * If the priority of the second bitplane is set to an invalid value (> 4),
     * Denise ignores the data from the first four bitplanes whereever the fifth
     * bitplane is set to 1. Some demos such as "Planet Rocklobster" (Oxyron)
     * show that this kind of bitplane elimination does not happen in HAM mode.
     *
     * Relevant tests in the vAmigaTS test suite:
     * Denise/BPLCON0/invprio0 to Denise/BPLCON0/invprio3
     */

    if (unlikely(!state.prio2 && !state.ham)) {

        for (int i = from; i < to; i++) {

             u8 s = bBuffer[i];

             assert(PixelEngine::isRgbaIndex(s));
             iBuffer[i] = mBuffer[i] = (s & 0x10) ? (s & 0x30) : s;
             zBuffer[i] = 0;
         }
        return;
    }

    // Translate the normal way
    for (int i = from; i < to; i++) {

        u8 s = bBuffer[i];

        assert(PixelEngine::isRgbaIndex(s));
        iBuffer[i] = mBuffer[i] = s;
        zBuffer[i] = s ? state.prio2 : 0;
    }
}

Withinvprio0to invprio3 there are four new tests which use invalid prio2 values in combination with various display modes. They cover the Rocklobster case as well as some others. They replace the test cases which I added yesterday (i.e., modes0e which was mentioned above).

emoon commented 4 years ago

Cool. I think it should be possible to close #438 is after this change, I did verify it with my local change, but likely good to double check.