Wack0 closed 2 months ago
Here is a test build that may fix the issue. If not, I'll make a build with all the PCI bus enumeration testing code in... nt_arcfw_grackle_fwonly_test20240714_1248.zip
Same problem with the most recent build (1536)
Hmm, not even displaying an additional message for detecting Yosemite?
I'll make another build with more debug logs as soon as possible.
Right, just one dot then the freeze, nothing else displayed.
OK, here's a build with the PCI debug output present in the HAL:
This is still all I get. Just to confirm, I repartitioned the drive, ran setupldr, added the mass storage devices (2), selected the Mac type and video type, then it hangs here.
Looking further, I think the Cuda driver in the HAL may be incorrect. (well, it does things differently than I thought)
I made some changes, does this work?
Unfortunately it does the exact same as before.
Made another change, and also added a bunch of debug output to the cuda init:
well it’s showing the stop error now, progress!
also I believe it made it further than it had been before the halt
yeah, now it is a pci enumeration issue.
this one has more PCI debug output, should hopefully say what's causing the machine check exception:
(also removed the cuda debug output because it's no longer needed)
Just noticed something in the linux sources, where on Yosemite it turns off master abort mode in the PCI-PCI bridge, so I added code to do the same thing in the HAL when finding a PCI-PCI bridge - maybe that'll help.
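A minimal sketch of what turning off master abort mode on a PCI-PCI bridge could look like. It is not the HAL's actual code: the function and helper names are made up for illustration, and the config space is modelled as a plain 256-byte array rather than real config cycles. The register offsets (header type at 0x0E, bridge control at 0x3E) and the bit-5 mask are the standard PCI-PCI bridge ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI config space offsets and bits for a type 1 (bridge) header. */
#define PCI_HEADER_TYPE             0x0E
#define PCI_HEADER_TYPE_MASK        0x7F   /* bit 7 is the multi-function flag */
#define PCI_HEADER_TYPE_BRIDGE      0x01
#define PCI_BRIDGE_CONTROL          0x3E
#define PCI_BRIDGE_CTL_MASTER_ABORT 0x0020 /* bit 5: Master-Abort Mode */

/* Hypothetical helpers: config space is little-endian, modelled as bytes. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, unsigned off, uint16_t value)
{
    cfg[off] = (uint8_t)(value & 0xFF);
    cfg[off + 1] = (uint8_t)(value >> 8);
}

/*
 * If the function described by cfg is a PCI-PCI bridge, clear Master-Abort
 * Mode in its bridge control register and return true. Anything that is not
 * a bridge is left untouched and false is returned.
 */
bool bridge_disable_master_abort(uint8_t cfg[256])
{
    uint16_t ctl;

    if ((cfg[PCI_HEADER_TYPE] & PCI_HEADER_TYPE_MASK) != PCI_HEADER_TYPE_BRIDGE)
        return false;

    ctl = cfg_read16(cfg, PCI_BRIDGE_CONTROL);
    cfg_write16(cfg, PCI_BRIDGE_CONTROL,
                (uint16_t)(ctl & ~PCI_BRIDGE_CTL_MASTER_ABORT));
    return true;
}
```

With a bridge control value of 0x0326, this leaves 0x0306 in the register: only bit 5 changes.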
Also trying on a B&W G3 - version nt_arcfw_grackle_fwonly_test20240714_2237.zip has the same result.
Yep I get the same as Sean
huh, maybe a timing issue on the cuda init?
I’d say your guess is as good as mine, but your guess is actually 100x better than mine.
When porting pciutils to Mac OS X for PowerPC Macs, I found that probing non-existent devices behind the built-in PCI-PCI bridge of a B&W G3 causes a machine check exception, so now it has a special case where it only probes PCI devices that exist in the device-tree. https://github.com/joevt/directhw/blob/a1d987fdea92ddf19cc877c4acef9b1ca2e06072/DirectHW/DirectHW.cpp#L1222-L1278
Would be nice if there was a flag to disable the machine check exception for this case. Doesn't Grackle have a flag for that? I have to check the MPC106 manual...
Reading from a non-existent device should just return -1 for every byte in the config space (not true for hidden Thunderbolt devices which return -1 only for vendor and device ID).
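A rough sketch of the two ideas above: only touch slots that are known to exist, and treat an all-ones vendor ID as "nothing there". All names here are hypothetical, and the bus is modelled as an array of vendor/device ID dwords plus a bitmask of slots that the device tree lists. This is not the DirectHW code, just an illustration of the shape of the special case.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_SLOTS_PER_BUS 32u

/* A config read of the first dword returns the vendor ID in the low 16 bits.
 * An absent device normally reads back as all ones, so vendor 0xFFFF means
 * "nothing answered". */
bool pci_id_present(uint32_t id_dword)
{
    return (id_dword & 0xFFFFu) != 0xFFFFu;
}

/* known_mask has bit N set when slot N appears in the device tree. */
bool pci_should_probe(uint32_t known_mask, unsigned dev)
{
    return dev < PCI_SLOTS_PER_BUS && ((known_mask >> dev) & 1u) != 0;
}

/*
 * Count devices on one bus. Slots missing from known_mask are skipped
 * entirely, which is the point of the special case: no config cycle is ever
 * issued for them, so they cannot trigger a machine check.
 * ids[] stands in for "read dword 0 of slot N" in this model.
 */
unsigned pci_count_devices(uint32_t known_mask,
                           const uint32_t ids[PCI_SLOTS_PER_BUS])
{
    unsigned dev, count = 0;

    for (dev = 0; dev < PCI_SLOTS_PER_BUS; dev++) {
        if (!pci_should_probe(known_mask, dev))
            continue;
        if (pci_id_present(ids[dev]))
            count++;
    }
    return count;
}
```

In real code the `ids[dev]` lookup would be the actual config read, and it would only run after the mask check has passed.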
Added some more short delays during cuda init:
@joevt According to the 21154 datasheet, the master abort mode (which linux specifically disables) is responsible for the behaviour you describe:
"Controls the 21154’s behavior when a master abort termination occurs in response to a transaction initiated by the 21154 on either the primary or secondary PCI interface. When 0: The 21154 asserts TRDY# on the initiator bus for delayed transactions, and FFFF FFFFh for read transactions. For posted write transactions, p_serr_l is not asserted. When 1: The 21154 returns a target abort on the initiator bus for delayed transactions. For posted write transactions, the 21154 asserts p_serr_l if the SERR# enable bit is set in the command register. Reset value: 0."
It appears that probe-slots in yosemite's OF does enable this bit.
Yes. It's a standard PCI-PCI bridge register but this only happens for the built-in DEC21154 PCI bridge - not any other bridges (including other DEC21154 bridges that you may add).
The bridge control register is set to 0x0326 (hard coded) after the slots are probed. Bits set:
1: SERR# Enable
2: ISA Enable
5: Master-Abort Mode
8: Primary Discard Timer
9: Secondary Discard Timer
The bridge control register is set to 0x03a6 before Open Firmware. Bits set:
7: Fast Back-to-Back Enable
... + all the other bits mentioned above.
Any other PCI bridges that are not the built-in DEC21154 PCI bridge get their bridge control registers set to 0 before probing and 4 (ISA Enable) after probing.
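To double check values like these, a small decoder helps. The sketch below is only an illustration: the function names are made up, and it names just the bits mentioned in this thread, printing anything else as unnamed.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Names for the bridge control bits mentioned in this thread only. */
static const char *bridge_ctl_bit_name(unsigned bit)
{
    switch (bit) {
    case 1: return "SERR# Enable";
    case 2: return "ISA Enable";
    case 5: return "Master-Abort Mode";
    case 7: return "Fast Back-to-Back Enable";
    case 8: return "Primary Discard Timer";
    case 9: return "Secondary Discard Timer";
    default: return NULL;
    }
}

/*
 * Write a comma-separated list of the set bits in value into out.
 * Returns the number of characters written (excluding the terminator).
 * Entries that would not fit in outlen are dropped whole.
 */
size_t bridge_ctl_describe(uint16_t value, char *out, size_t outlen)
{
    size_t used = 0;
    unsigned bit;

    if (outlen == 0)
        return 0;
    out[0] = '\0';

    for (bit = 0; bit < 16; bit++) {
        char piece[64];
        const char *name;
        size_t len;

        if (!(value & (1u << bit)))
            continue;
        name = bridge_ctl_bit_name(bit);
        snprintf(piece, sizeof piece, "%s%u: %s", used ? ", " : "", bit,
                 name ? name : "(not named in the thread)");
        len = strlen(piece);
        if (used + len >= outlen)
            break;                        /* no room for piece + terminator */
        memcpy(out + used, piece, len + 1);
        used += len;
    }
    return used;
}
```

Feeding it 0x0004 gives `2: ISA Enable`, and 0x0326 gives bits 1, 2, 5, 8 and 9, matching the values listed above.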
latest build results ^^^
ok, is it definitely freezing there? boot should continue after that, either with INACCESSIBLE_BOOT_DEVICE or getting into text setup.
can you try not loading the general HID and storage driver? if it boots to a keyboard error, this would imply more issues in the Cuda driver...
Fairly certain, the optical drive turned off as it has done in the past when the system freezes and I let it sit there for a couple of minutes with no progress - but I will recheck again after I return home from work this afternoon.
Same result here.
@ActionRetro is that with not loading the general HID and storage driver?
Oh no, I'll try that
if that boots to a keyboard error, then I know it's a HAL Cuda driver problem, otherwise something else (and probably related to the PCI IDE controller in some way, atapi.sys does load so it will try to use it)
Ok, without HID and storage driver, I got:
OK, I meant with the mac i/o ide driver and not the "generic HID and storage" driver; but that actually helps! it shows me that atapi.sys initialised fine, so it definitely is a cuda driver problem.
In that case, I noticed the linux driver reads back IER after writing to it to disable all MCU interrupts, so I tried that (instead of a delay after setting it on init). Might not work, but I guess the linux driver specifically does that for a reason, so:
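A minimal sketch of the write-then-read-back pattern, assuming the IER is reachable through an already-mapped pointer. The function name is hypothetical and this is neither the HAL's nor Linux's actual code. The 0x7F value follows the usual 6522-style VIA convention: with bit 7 clear, every enable bit whose position is set in bits 0-6 gets cleared.

```c
#include <stdint.h>

/* Bit 7 clear selects "disable"; bits 0-6 pick which sources to disable. */
#define VIA_IER_DISABLE_ALL 0x7Fu

/*
 * Disable all VIA interrupt sources, then read the register back.
 * The read forces the write to reach the device before the caller carries
 * on, which is the job a fixed delay was doing before.
 * Returns whatever the read-back produced. On real hardware that reflects
 * the VIA's state, not necessarily the value written. A real PowerPC
 * implementation would also need the appropriate I/O ordering barriers,
 * which this sketch leaves out.
 */
uint8_t via_disable_all_interrupts(volatile uint8_t *ier)
{
    *ier = VIA_IER_DISABLE_ALL;
    return *ier;
}
```

The `volatile` qualifier is what stops the compiler from dropping the read as dead code.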
Given the previous build did not work (on a trayloading imac), I added some debug output for Cuda, hopefully the crash isn't what I think it is...
(I removed the PCI bus enumeration debug code, as it's not needed anymore)
Debug output on the Trayloader:
huh, wasn't expecting that output... so the problem is ADB commands, specifically.
this one will give slightly more debug output.
...I think I figured out what the problem was, how did this ever work, even under emulation??? (forgot to check if cuda was finished sending data, something I implemented in the ARC firmware even!)
I removed all the debug output, I can add it back if this still freezes.
I am now in the installer, will continue the install and report back!
I assume the keyboard works fine in text setup?
Yes, I am currently at the formatting stage of the install.
In that case, this issue is fixed, I'll get a release together.
As reported by @JonObst here: https://github.com/Wack0/maciNTosh/issues/3#issuecomment-2227171714
Most likely an issue with PCI bus enumeration in the HAL.