Closed pawosm-arm closed 2 years ago
Just some diagnosis notes. The non recovery until power cycle apears to be a bug in the +3e firmware as it doesn't know how to get out of a disk that is waiting for bytes for an I/O rather than idle or waiting to be put into SPI mode.
That in turn means that whatever happened left the device thinking there was more data due to appear on the write.
I did some investigation. First, I obtained reference Z80 assembly code for init, sector read and sector write from the ZXMMC manufacturer. Having this, I wrote somehow oversimplified ZXMMC driver for CP/M Plus (see here: https://sourceforge.net/projects/zxmmcfid ). It proved reference code was good for read/write operations and CP/M never stuck. Then I prepared a list of differences and examined them one by one. Some of the things were done better in Fuzix (e.g. I don't wait for 0xff before sector write, I guess it may cause problem on an overclocked machine, but, hey, I prepared this for the real thing!), yet some things were 'less paranoid' on Fuzix (e.g. I'm doing longer waits and sending more clocks). Eventually I came up with this very last thing that finally made Fuzix work on the real Speccy with the real ZXMMC! Turned out, things that are happening after sector read/write should not be ignored. Here's what I changed to make it work (feel free to use it to make a proper fix):
diff --git a/Kernel/dev/devsd.c b/Kernel/dev/devsd.c
index 4581295b..5a9b17e3 100644
--- a/Kernel/dev/devsd.c
+++ b/Kernel/dev/devsd.c
@@ -50,6 +50,8 @@ uint_fast8_t devsd_transfer_sector(void)
sd_spi_transmit_byte(0xFF); /* dummy CRC */
sd_spi_transmit_byte(0xFF);
success = ((sd_spi_wait(false) & 0x1F) == 0x05);
+ if (success)
+ while (sd_spi_wait(false) == 0x00) { }
}
}
}else
diff --git a/Kernel/dev/zx/zxmmc.c b/Kernel/dev/zx/zxmmc.c
index d9d3756c..219793ce 100644
--- a/Kernel/dev/zx/zxmmc.c
+++ b/Kernel/dev/zx/zxmmc.c
@@ -66,6 +66,10 @@ doread:
inir
ld a,(_vtborder)
out (0xfe),a
+ in a, (0x3f)
+ nop
+ nop
+ in a, (0x3f)
pop af
or a
jp nz,map_kernel
Also note that the two bytes input'ed after sector read are zeros in the FUSE emulator and they are CRC16 on the real thing (despite CRC checking being turned off on ZXMMC), so my CP/M driver when encounters non-zero value here, computes and compares CRC16 for the obtained sector.
One annoying thing I've found. I closed the system with the shutdown
command. Apparently, it does not unmount rootfs. As a result, on subsequent boot fsck
is performed. It's time consuming on Speccy and it asks way too many questions. Also, it warns about sd timeout several times, which is particularly worrying.
I've made my changes on git SHA 3168f915ca851e6331d41f87adbc26d6ecfb780c (merge commit from Dec 26 2019). Yet as I've built the most recent master, init panicked right after start with the rootfs I had on my MMC card. Is it expected or is it an regression?
Nah, those extra clocks are extremely needed. Without them Fuzix hangs after a while (not that spectacularly as before and plain reset helps, yet still). Additional necessary change needed is following:
diff --git a/Kernel/dev/devsd.c b/Kernel/dev/devsd.c
index 4581295b..a41f5717 100644
--- a/Kernel/dev/devsd.c
+++ b/Kernel/dev/devsd.c
@@ -71,6 +73,9 @@ void sd_spi_release(void)
{
sd_spi_raise_cs();
sd_spi_receive_byte();
+ sd_spi_receive_byte();
+ sd_spi_receive_byte();
+ sd_spi_receive_byte();
}
uint8_t sd_spi_wait(bool want_ff)
Sadly, it still hangs randomly, even with the change above :(
(but... if it doesn't block the device, maybe it's not the I/O issue and the last change above isn't really needed? In the end, it does not hang on time consuming fsck, neither with or without the change above; from the other side, when it's hung, the diode of the ZXMMC device is lit, I suspect it's hung on the success
branch while
loop from my first patch)
So, I turned this potentially infinite loop into the 32767 iterations. The problem with blocked interface returned. I assume, when "sd: failed, retrying.\n"
is printed and the SD command is re-sent, the device gets stuck.
Just to shout from the sidelines here; some random thoughts on SD cards, and initialisation issues.
Time for an SD Card is standing still unless it is seeing CLKs. So you want to keep sending preparatory 0xFF
until you get the right preparatory response, which is also usually a 0xFF
.
There's no reason to send more receive bytes after raising CS to deselect. Opinion is divided as to whether the further 8 CLK are needed before or after raising the CS.
Some of the initialisation timeouts are really long. 1000ms long. And some SD Cards need this time, although most don't.
And, between the two commands of an ACMD there really needs to be a wait ready looking for an 0xFF
response. Putting 8 CLKs in between works in almost all Cards, but perhaps not for your Card.
Now I got something that works for a long while. I guess it helped for good. I've removed use of the timer and did waiting the same way it's done in my CP/M driver. Complete change looks like this:
diff --git a/Kernel/dev/devsd.c b/Kernel/dev/devsd.c
index 4581295b..216baf63 100644
--- a/Kernel/dev/devsd.c
+++ b/Kernel/dev/devsd.c
@@ -50,6 +50,8 @@ uint_fast8_t devsd_transfer_sector(void)
sd_spi_transmit_byte(0xFF); /* dummy CRC */
sd_spi_transmit_byte(0xFF);
success = ((sd_spi_wait(false) & 0x1F) == 0x05);
+ if (success)
+ while (sd_spi_wait(false) == 0x00) { }
}
}
}else
@@ -75,12 +77,10 @@ void sd_spi_release(void)
uint8_t sd_spi_wait(bool want_ff)
{
- unsigned int timer;
+ int repeats = 12800;
uint_fast8_t b;
- timer = set_timer_ms(500);
-
- while(true){
+ while(--repeats){
b = sd_spi_receive_byte();
if(want_ff){
if(b == 0xFF)
@@ -89,10 +89,6 @@ uint8_t sd_spi_wait(bool want_ff)
if(b != 0xFF)
break;
}
- if(timer_expired(timer)){
- kputs("sd: timeout\n");
- break;
- }
}
return b;
diff --git a/Kernel/dev/zx/zxmmc.c b/Kernel/dev/zx/zxmmc.c
index d9d3756c..219793ce 100644
--- a/Kernel/dev/zx/zxmmc.c
+++ b/Kernel/dev/zx/zxmmc.c
@@ -66,6 +66,10 @@ doread:
inir
ld a,(_vtborder)
out (0xfe),a
+ in a, (0x3f)
+ nop
+ nop
+ in a, (0x3f)
pop af
or a
jp nz,map_kernel
The first change looks right
The second one is interesting. I need to dig into that because it actually implies either your card takes > 0.5 seconds to complete (which I doubt), or something is messed up in the timer/timeout handling. Definitely need to work out what is up.
The in nop nop in implies the hardware needs 16 clocks between accesses ? (since inir works and presumably in nop in wasn't good enough)
shutdown does btw mark the file system as clean so if that is still failing something is still not getting correctly to the media, or you built a current kernel and not matching user space. Right now that's a mess because I am in the middle of changing the binary format and it will make some stuff explode like signal handling. Fairly soon I'll push the rest of the new format and then old binaries won't work at all so all will be sane again 8). Something I need to do because the original design has some bad mistakes in it and I need to fix them to get things like 8080 binaries working on 8080 or Z80 kernels and to fix up other braindead API issues like relying on RST 30 for a syscall which means you can't run Fuzix when you've got a fixed ROM in the low 16K like on the Pentagon.
The two nop
s between in
s I've taken from the reference code, it's full of pair of nop
s in places like that (e.g. between two inir
s or two otir
s). I assume, changing border colour between inir
s or otir
s that Fuzix does, should also do.
The shutdown
command indeed does what it should now (after applying the most recent patch). Yet I didn't check if halt
or reboot
are still problematic as facing fsck
on an old machine is painfully slow (same issue I'm seeing with FreeDOS and chkdsk
started on Amstrad PC 2086 to which I've installed XT-IDE 8-bit ISA card and a 128MB Compact Flash storage, checking filesystem reports time elapsed around 900sec!).
Speaking of the rootfs
. Is there a reason the TERM
variable is not set? Some of the programs refuse to run because of that. Also, I felt disappointed that fforth
failed to start (for a different reason), I like the language.
You can set TERM in your .profile or for a fixed console in /etc/inittab.
TERM=vt52; export TERM
(the modern export TERM=vt52 isn't supported by the classic Bourne shell)
If you change it from
getty /dev/tty1
to
getty /dev/tty1 115200 vt52
it will then set the terminal type before running your shell.
halt and reboot don't tidy up - they are implemented the traditional unix way as root only tools that do directly do what they say.
fforth isn't built for z80 or 8080 atm - @davidgiven does some C and preprocessor tricks that sdcc and ackcc can't cope with. Even if you preprocess it via gcc and the sdcc headers you end up with stuff sdcc can't compile.
You can set TERM in your .profile or for a fixed console in /etc/inittab.
Indeed, doing sed
on the inittab
file helped, TERM
is set now. Could be a default. Not sure if 115200 should be a default, my full blown Linux systems have 38400 or even 9600 set...
Seems like this thread bloats into off-topics, hence my extended question. Is there a general space to note some of my 'user experiecne' questions? For example:
clear
or reset
(or both), chroot
possibly with /dev
rebinding (if I finally manage to mount anything external), is there a TODO list for such stuff?cpm-loader
can load and execute CP/M's .com
files).reboot
should unmount rootfs before causing reset...Not really - ok there's a fuzix-devel google group but it's gone a bit quiet
You don't. The cpm tools use raw disk access like cpmtools on Linux.
...but things like cpmls
or cpmcp
don't seem to exist on the rootfs...
chroot
isn't there as a tool
I've found extern int chroot(const char *path);
declared in the syscalls.h
file, so I guess it's a part of the userspace runtime library.
I don't think the default file system creation includes them
chroot is both a system call and a command. The system call is supported, the chroot command isn't written.
I've built Fuzix with today's master and booted it using http://www.fuzix.org/downloads/0.4pre/rootfs-z80-32.gz root filesystem. As I can see, most of the changes mentioned here were commited into the code, yet one change is still needed to make it boot on the real Spectrum with ZXMMC storage:
--- a/Kernel/dev/devsd.c
+++ b/Kernel/dev/devsd.c
@@ -85,12 +85,10 @@ void sd_spi_release(void)
uint8_t sd_spi_wait(bool want_ff)
{
- unsigned int timer;
+ int repeats = 12800;
uint_fast8_t b;
- timer = set_timer_ms(500);
-
- while(true){
+ while(--repeats){
b = sd_spi_receive_byte();
if(want_ff){
if(b == 0xFF)
@@ -99,10 +97,6 @@ uint8_t sd_spi_wait(bool want_ff)
if(b != 0xFF)
break;
}
- if(timer_expired(timer)){
- kputs("sd: timeout\n");
- break;
- }
}
return b;
The two new ports merged this week change some of the SD behaviour due to behaviour that seems very close to this so hopefully it's now resolved.
The good news is that now it is usable on the real thing without any patching. The bad news is that when login:
prompt is displayed, the ZXMMC diode is on all of the time (the border isn't flashing at the same time) until I log in (as root). It seems worrying (what this device is doing with my MMC card at that time?!), even if everything is back to normal right after I log in and the system seems fully useable.
That could be because CS is still asserted --- with my changed logic it will remain asserted after a write until the next command. It should be harmless but um. It's still not desirable. At least it seems to be verifying that busy-waiting on the write completion is what was making the card unhappy.
Does the kernel have any asynchronous timer mechanism which will allow a driver callback to fired while user code is running?
Most of the boards I have wire the light to /CS so I would guess the ZXMMC did the same. It's good that it works so we now know what is up.
It's possible to add but I've tried to avoid putting more and more goo in the interrupt paths. Every platform with a timer interrupt will pass through process.c:timer_interrupt regularly when interrupts are on and through the core part of it (doing alarms, etc) 10 times a second.
As in the topic. I could not complete init sequence due to the failure as in the screenshot: After this failure,
ZXMMC
device remains stuck even after reset: power cycle is always required to make it work again. It is suspected that the failure occurs on the first write attempt.