dirkwhoffmann / vAmiga

vAmiga is a user-friendly Amiga 500, 1000, 2000 emulator for macOS
https://dirkwhoffmann.github.io/vAmiga

ShowConfig utility crashes with fastram (KS/WB 3.2.1) #705

Closed nicolasbauw closed 2 years ago

nicolasbauw commented 2 years ago

At first, I thought the showconfig utility itself had an issue, so I made some additional tests:

When I say "crash": the Amiga display becomes corrupted after a certain amount of time (as if chip RAM were being overwritten), followed by a software failure. I concluded that the crash could come from vAmiga, or from some odd interaction between showconfig checking boards / amount of RAM and the emulator, since I don't encounter it in another emulator.

mras0 commented 2 years ago

Could you test it with no HD controllers involved, but still having fast ram enabled? I.e. just a bootable WB 3.2.1 floppy with showconfig + whatever program(s) you use for testing? Maybe also same config but earlier KS version with and without HD if it's not too much trouble (i.e. WB3.2.1 + KS3.1 for example).

At least with 3.1 (latest I have available) I don't see how showconfig could cause issues, but they probably updated it quite a bit for 3.2.

nicolasbauw commented 2 years ago

Quick update (I have not yet had time to create a boot floppy with the necessary libs).

Booting from the "install" floppy and running showconfig from the "extras" floppy, I don't get exactly the same UI for showconfig (probably because not all the libs are present on the boot disk): instead of the nice UI, I get the classic text output. That said, this text version does not crash, with or without the HD controller.

Here is a capture from fs-uae and 3.2.1:

image

Back to vAmiga, booting from HD (3.2.1) and using a slightly older showconfig from 3.2 extras disk, the window at least opens and lets me make two observations:

image

The same setup in fs-uae (boot from 3.2.1, showconfig from 3.2 extras disk) shows this instantly:

image

mras0 commented 2 years ago

And the text-based version of showconfig didn't show anything out of the ordinary (e.g. unexpected expansion boards)?

I don't know if the new GUI version of showconfig included in 3.2 does some extra probing, but 30 seconds could indicate that it does (or maybe the board list has become corrupted). The reason I asked whether it also happens with no HD controller involved is that the HD controller is the most likely culprit; the fast RAM expansion board is dead simple and handled entirely by KS.

@dirkwhoffmann unless you have 3.2.x available, it might make sense to make a debug build of vAmiga with ZOR_DEBUG/HDR_DEBUG/XFILES (and maybe other options) enabled for @nicolasbauw to test.

nicolasbauw commented 2 years ago

As in UAE, the text version (in fact, it's the same binary as the UI version, but it probably opens in text mode because of missing libs on the boot floppy?) shows these two additional 64KB boards, which I suppose are totally legit:

image

With HD disabled, these 64KB boards disappear. Interestingly, in the crashing capture (with the 3.2 version of showconfig, which at least shows a window), we see the RAM but not these two 64KB boards (the crash occurs right after the RAM board appears).

mras0 commented 2 years ago

Yep, the last two are HD controller boards that look like they're correctly configured. Do you have more than one HDF attached? If yes, could you try with only one?

nicolasbauw commented 2 years ago

I have the same behavior with a single drive attached:

I also noticed that after the guru, the Amiga shows the Kickstart screen instead of booting from the hard drive. I have to reset for the Amiga to boot from the disk again.

dirkwhoffmann commented 2 years ago

> @dirkwhoffmann unless you have 3.2.x available, it might make sense to make a debug build of vAmiga with ZOR_DEBUG/HDR_DEBUG/XFILES (and maybe other options) enabled for @nicolasbauw to test.

@nicolasbauw: Are you able to install Xcode on your Mac? It would vastly simplify debugging. Compiling vAmiga is easy. It's basically clicking a single button in Xcode.

dirkwhoffmann commented 2 years ago

> unless you have 3.2.x available

Unfortunately, I don't have 3.2 or 3.2.1 at hand.

nicolasbauw commented 2 years ago

@dirkwhoffmann Sure! I already have Xcode installed.

dirkwhoffmann commented 2 years ago

> @dirkwhoffmann sure ! I already have xcode installed.

OK, cool. Could you do the following?

static const int XFILES          = 1; // Report paranormal activity

static const int ACF_DEBUG       = 1; // Autoconfig
static const int FAS_DEBUG       = 1; // FastRam
static const int HDR_DEBUG       = 1; // HardDrive
static const int DBD_DEBUG       = 1; // DebugBoard

You'll see some debug output in the console window. With some luck, it will give us a hint about what's going wrong.

nicolasbauw commented 2 years ago

I can see this line repeating for several seconds:

[8919] (277, 48) 2E3D74 0 BCBSD- 602C 0040 Memory:1130 XFILES (CIA): Reading a WORD from a00000

Then the address increases word by word: a00002, a00004, up to bffffc (so, including the CIA registers). After that come weird operations on chipset registers (that's probably the moment things get out of control), immediately followed by a couple of "out of range" operations on chipset registers.

The last operations are a loop over these two:

[11053] ( 25, 35) F83FE4  0 BCBSD- 2000 77FF Agnus:452 XFILES: pokeSPRxCTL: Extended VSTOP bit set
[11074] ( 25, 35) F804EC  0 BCBSD- 2000 77FF Agnus:448 XFILES: pokeSPRxCTL: Extended VSTRT bit set

Then execution stops on this line (before I could see the Amiga guru'ing; this probably has nothing to do with the problem):

Assertion failed: (clock - DMA_CYCLES(pos.v * HPOS_CNT_PAL + pos.h) == frame.start), function startOfFrame, file Agnus.cpp, line 202.

Here is the complete trace: vAmiga-debug.txt.gz

mras0 commented 2 years ago

Unfortunately nothing conclusive from the logs. Looks like you're right that it goes off the rails at the point where it reads from a00000.

Could you try with ZOR_DEBUG=1 as well? It may give a hint as to where the invalid address comes from. You can cut off the log after a few reads from a00000 to keep the size down.

nicolasbauw commented 2 years ago

Here's the log with ZOR_DEBUG. I don't see anything obvious.

The "Assertion failed: (clock - DMA_CYCLES(pos.v * HPOS_CNT_PAL + pos.h) == frame.start), function startOfFrame, file Agnus.cpp, line 202." error happens at the exact moment of the guru meditation.

According to this website, a00000 is the start of a 512 KB space for PCMCIA (600/1200) or a Z2 expansion. Maybe this has something to do with that.

vAmiga-debug-zor.txt.gz

mras0 commented 2 years ago

Thanks, at least that shows that it's not because it's reading something unexpected from the HD controller ROM just before. I think vAmiga correctly handles $a00000 read like a very weird CIA access, but the problem code probably happens way before that as it shouldn't try to access that address.

I think the only short-term way forward is for Dirk to reproduce it locally (WB 3.2 should work with KS 3.1 as far as I understand), so a minimal HDF with showconfig+libraries might be kosher to share privately, but I won't comment further on that.

nicolasbauw commented 2 years ago

With slowram, here's what happens:

In this scenario (no fast RAM, slow RAM), there is no crash, yet the same memory addresses are scanned, so it seems that scanning the CIA address space isn't what causes the problem.

Interestingly, in this scenario I don't see the read of a00000 repeating for several seconds at the start of the memory scan. So I tried with fastmem AND slowram (so, with an expansion still present at a00000), and in this case the repeated read of a00000 happens and ends in a crash, as in the previous logs.

vAmiga-debug-slowram.txt.gz

dirkwhoffmann commented 2 years ago

I've ordered a copy of AmigaOS 3.2 today. Once it arrives, I can do some local experiments.

dirkwhoffmann commented 2 years ago

There is an easy solution (all credit goes to @mras0). ShowConfig freaks out if a device identifies itself with an unknown serial number that doesn't fit into 16 bits. Hence, reducing the serial number of the RAM expansion board from 2718281 to 27182 and that of the hard drive controller board from 3141592 to 31415 fixes the issue:

Screenshot 2022-06-06 at 17 14 47

mras0 commented 2 years ago

For future reference.

I can't claim full credit. With some extra debug info from Dirk it turned out to be a known issue with boards.library (available from aminet).

The bug happens when the combination of manufacturer/productid is not matched AND the serial number is > $ffff.
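As a rough illustration of that trigger condition, here is a minimal C++ sketch. The helper names `serialExceeds16Bits` and `mayTriggerBug` are made up for this example and exist neither in vAmiga nor in boards.library; the serial numbers are the ones quoted in this thread.

```cpp
#include <cstdint>

// Hypothetical helper (not part of vAmiga or boards.library):
// true if any of the upper 16 bits of the serial number are set.
constexpr bool serialExceeds16Bits(std::uint32_t serial) {
    return (serial >> 16) != 0;
}

// Hypothetical predicate for the condition described above:
// manufacturer/product pair not found in the database AND serial > $ffff.
constexpr bool mayTriggerBug(bool knownToDatabase, std::uint32_t serial) {
    return !knownToDatabase && serialExceeds16Bits(serial);
}

// Serial numbers mentioned in this thread.
static_assert(mayTriggerBug(false, 2718281));  // old RAM board serial ($297A49)
static_assert(mayTriggerBug(false, 3141592));  // old HD controller serial ($2FEFD8)
static_assert(!mayTriggerBug(false, 27182));   // new RAM board serial ($6A2E)
static_assert(!mayTriggerBug(false, 31415));   // new HD controller serial ($7AB7)
static_assert(!mayTriggerBug(true, 2718281));  // known board: the lookup succeeds
```

Both old serial numbers have bits set above bit 15, while the reduced ones do not.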

The library reads manufacturer/product ids and searches for a friendly name in a local database. If found, everything works which is why WinUAE doesn't show the bug (its combinations are known). The relevant part of the code looks like this:

    ; A2 = ConfigDev
    MOVE.W  cd_Rom_er_Manufacturer(A2), D2
    MOVE.B  cd_Rom_er_Product(A2), D3
    MOVE.L  cd_Rom_er_SerialNumber(A2), D4
    MOVE.L  D4, D7
    BSR.W   SeeIfSerialNumberIsSpecial ; Does not change D4
        ; ...
    MOVEA.L A4, A2      ; A4 contains manufacturer table
    MOVE.W  (A2)+, D4   ; Get table length as word in D4
.Search
    CMP.W   (A2)+, D2   ; Match?
    BNE.B   .NoMatch
        ; Handle match (with similar problematic loop looking for product id)
.NoMatch
    ADDQ.L  #$04, A2    ; Move to entry size
    SUBQ.L  #$01, D4    ; Decrement count (bug part1 - longword)
    TST.L   D4          ; Bug part 2 - longword
    BEQ.B   .Done
    MOVE.W  (A2)+, D5   ; How much to skip (product IDs etc..)
    LSL.W   #$03, D5
    ADDA.W  D5, A2
    BRA.B   .Search
.Done

The workaround (that I will take credit for 😄) is to ensure the upper 16 bits of the serial number are zero => things work as planned.
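To make the counter problem concrete, here is a small C++ model of the loop in the listing. It is only a sketch under two assumptions: that D4 still holds the serial number when the table length is loaded (as the workaround suggests), and that no manufacturer entry matches. The function name `entriesVisited` and the table length of 3 used below are invented for illustration.

```cpp
#include <cstdint>

// Hypothetical model of the .Search/.NoMatch loop, assuming no entry matches.
// Returns how many table entries the loop visits before reaching .Done.
// Assumes tableLength > 0.
std::uint32_t entriesVisited(std::uint32_t serial, std::uint16_t tableLength) {
    std::uint32_t d4 = serial;               // MOVE.L cd_Rom_er_SerialNumber(A2), D4
    d4 = (d4 & 0xFFFF0000u) | tableLength;   // MOVE.W (A2)+, D4 only replaces the low word
    std::uint32_t visited = 0;
    for (;;) {
        ++visited;                           // CMP.W fails, branch to .NoMatch
        d4 -= 1;                             // SUBQ.L #$01, D4 (full longword)
        if (d4 == 0) break;                  // TST.L D4 / BEQ.B .Done
    }
    return visited;
}
```

With a three-entry table, `entriesVisited(27182, 3)` returns 3, whereas `entriesVisited(2718281, 3)` returns 2686979 because the stale upper word $0029 is counted as part of the loop counter. A word-sized decrement and test would ignore those upper bits.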

dirkwhoffmann commented 2 years ago

Fixed in v2.1b2