sorgelig / ZX_Spectrum-128K_MIST

ZX Spectrum 128K for MIST Board
GNU General Public License v2.0
31 stars 18 forks source link

Timing improvements #19

Closed gyurco closed 6 years ago

gyurco commented 6 years ago

After the last patch, I discovered some instabilities at turbo clocks, so I decided to finally close the timings. Couldn't fully achive the goal, but it is near: about 103 MHz for the slowest model (it was about 60MHz before, and the SDRAM pins were not constrained). So one still cannot be sure if the core is absolutely stable after a synthesis, but if it is not stable, usually the first thing to spot the display is distorted in various ways. For me, the compiled code is rock stable now on my MiST (after some days of testing).

sorgelig commented 6 years ago

Timing optimization was always pure magic to me :) So, basically you've set setup/hold = 2 for all violated timings in Time Quest? And split CE of course.

sorgelig commented 6 years ago

There are many interesting cores on MiSTer. And ZX has HDMI output with more flexibility for output (frame locked or frame converted), so any TV can display it.

gyurco commented 6 years ago

I set setup/hold times to 2 where the actual register write is in a CE block, or in case of the T80, every other clock tick is used, so the signal can safely propagate 2 clock cycles from the start to end. Actually lot of combinational paths are too long (10-12ns, and even longer) to fit in a 112MHz clock cycle (8,9ns), and seems letting Quartus know about it can help achieve a better fit. Unfortunately couldn't be used everywhere (e.g. wd1793:fdd| couldn't work), because not all part is protected by CE. However specifying the most failing paths seems to help optimizing the others. Splitting CE helps because the register can be put physically near to the places where it is really used -> less delay. Unfortunately timing seems to be a very important part, since FPGA is not like software, when a CPU is slow, your program will be slow, but here the clock don't wait for who falls behind. And I'm still not a pro in this area, but struggled for about a week until I could achieve this with the ZX core.

Yepp, I'm following MiSTer, just I have my MiST, and I'm still happy with it :) There's something catchy with the analogue output of that. But if I have to decide now what to buy, probably I would go with MiSTer.

sorgelig commented 6 years ago

MiSTer also provides analog output through daughter board. But Cyclone V is much more sensitive to timings.

gyurco commented 6 years ago

Probably because it is much larger, and more delays can occur between nodes far away. I think fixing the timings can worth the time, because it is very frustrating to wait for 15 minutes for the synthesis just to realize the core doesn't even start because of some signals don't arrive at time. Or change one non-related line in some random place, and the whole thing goes unstable.

But for the ZX core, I can be thankful that you already did the hard job - converting to a single clock. I'm trying to resurrect the Atari ST core, but is is a mess clock-wise. And if a clock is not supplied by the PLL, the timings cannot be optimized automatically by Quartus. Defining custom clocks in the sdc for some signals generated by code is not the way to go.

gyurco commented 6 years ago

Btw, what I've learned, defining timing constraints are not that hard, here's a checklist of mine:

And these should be enough for a basic design. If the design fails the timing checks, then you must introduce tricks in the code and probably in the sdc. (e.g. reduce the clock, or just pseudo-reduce it via a CE signal in every other tick, but then tell Quartus to allow two clock cycles with set_multicycle_path, duplicate registers to allow them to put near to its uses, reduce combinatonial logic, and so on).

sorgelig commented 6 years ago

My own cores like ZX, Apogee, Sam Coupe, BK0011M, Vector06C, etc.. - they are all use a single master clock. This alone gives a boost in stability even without using Time Quest. Other cores depending on complexity and original structure I try to minimize the clocks amount and use PLL for all master clocks. generated clock are true evil :) Some devs like to use them a lot. So, sometimes it takes a lot of time to eliminate them. Or if core isn't big i just leave them as-is if core more or less working :) Cyclone V is more sensitive to timings not because of size, there are larger Cyclone IV FPGAs without such issue. It's because design flaw of Cyclone V in clock networks.

There are 2 large but very popular cores in MiSTer - ao486 and Minimig - both of them severely suffer from timing issues. So, devs like you are really needed :)

gyurco commented 6 years ago

I would happily help, just need to get some knowledge of the actual hardware, too - where I'm a bit limited (the ZX Spectrum is the only real machine which I had personally). But learning! Just started with the ST core, which seems doesn't exists in MiSTer yet (another one I miss is the Amstrad CPC one, but I know, its code is a bit strange). ao486 could be extremely frustrating, as I read on the Atari forums that it takes more than an hour to compile. Maybe a 8086 PC variant could come first :)

sorgelig commented 6 years ago

I'm working on Amstrad. It's badly written, but i'm trying :) It has most awkward way to access SD card and will take some time to completely rewrite SD code.

there is no 8086 PC. Do you mean Next186? It's not so interesting, honestly.. No flat memory support -so many late 80x and early 90x games won't work as they rely on dos4gw. Also it's fun to run Win95 which is not available for 8086. But if you really want then it's OK. It uses direct SD card access and tricky bootstrap which you will need to ovecome somehow. ao486 is more like modern PC with most basic HW implemented like HDD, RTC, even COM port. The only major thing is missing - fpu.

sorgelig commented 6 years ago

Hmm.. It looks like Amstrad 6128 uses the same uPD765 FDC. I will try to adapt your module. Does your module recognize DSK format automatically like amount of tracks, sides, sector size/amount?

sorgelig commented 6 years ago

in spectrum.sv:

    .nRD(plus3_fdd & ~nIORQ & nM1 & nRD),
    .nWR(plus3_fdd & ~nIORQ & nM1 & nWR),

doesn't look right. If plus3_fdd == 0, then both nRD and nWR will be active. I see you check in u765 that only nRD or only nWR should be active, but still this logic is not good.

sorgelig commented 6 years ago

it should look like this:

    .nRD(~plus3_fdd | nIORQ | ~nM1 | nRD),
    .nWR(~plus3_fdd | nIORQ | ~nM1 | nWR),
sorgelig commented 6 years ago

Amstrad works with your u765 module!

gyurco commented 6 years ago

Yepp, seems the bug was mitigated by u765, but still not good.

Thats great news, maybe some copy protected titles will work with Amstrad, too (I think some CPC images use the multiple copy of one sector feature of the EDSK format, which I didn't implement, since Speccy images are not using it).

sorgelig commented 6 years ago

So, you can improve u765 when i will release Amstrad. I found strange behavior in Amstrad: if i change the disk, then CAT command gives messed result. Reset (warm reset not bound to u765) fixes the problem. So, at first, the problem looks like is in Amstrad ROM handling the disk access. But i think it can be caused by u765 incorrectly or not timely reporting about new disk.

gyurco commented 6 years ago

Of course, if it can be backported to MiST :) Hopefully your changes will greatly clean up its code. I'm not aware of anything in u765 which reports a changed disk, the +3 DOS detects changes by reading a sector and calculates a checksum from it. And it cannot detect when a double-sided disk is changed to a single-sided one (verified it in an emulator, too), only reset helps.

gyurco commented 6 years ago

If you just boot the core, and not insert a disk, a CAT will properly write "Drive A: disc missing?"

gyurco commented 6 years ago

I've looked at the CPC6128 schematic, and seems the difference between the +3 and the CPC is that the upd765 TC input signal is connected in the Amstrad machine to the reset. But reset terminates every command anyway, so don't think it is the problem.

sorgelig commented 6 years ago

yes, without disk it reports "Drive A: disc missing." if i will mount the disk, i will get garbled catalog and even disc missing message. But if i mount the image first and then use cat - everything is OK.

gyurco commented 6 years ago

Would be good to issue some commands to the FDC and see what it returns in this state...unfortunately I don't even know what in/out ports the CPC uses for it.

sorgelig commented 6 years ago

http://www.cpcwiki.eu/index.php/765_FDC

gyurco commented 6 years ago

Ok, probably I found something which was not mandatory for the +3, I'll post a patch and you can see if it helps. No, false alarm, sorry.

sorgelig commented 6 years ago

Here is refactored Amstrad with your u765 module: https://github.com/sorgelig/Amstrad_MiST So you can try the issue yourself now :) I've checked with original CoreAmstrad - there is no such issue. The code basically the same. What i think, is may be something lays in motor control part. Like giving error for first command after disk change letting the Amstrad know that something changed in FDC and need to re-calibrate and re-read the catalog. Currently I ignore the motor commands.

gyurco commented 6 years ago

That's great, now something which I can play countless hours again :) I don't know if it is the motor control, on the +3 I totally ignored the motor on/off bit, and it is working.

I've tried some direct IO in an emulator (also in the old core), and they accepts commands without errors even after a disc change (they monitor the motor, so first one have to turn on, but I don't think it can be an issue - the machine doesn't know if a motor ON is successful, it's just one bit.).

gyurco commented 6 years ago

Seems issuing even a simple RECALIBRATE command after the disc change makes it work: OUT &7BF7,7 OUT &7BF7,0

And there are cases, when it just works. I suspect something in the internal state of the controller not resetting correctly. Maybe timing issues? I see the core has even higher clock than the ZX (128 MHz), and there are 2 high frequency asynchronous clocks (128 and 112 MHz).

sorgelig commented 6 years ago

system clock is the same as ZX - 112MHz. 128MHz is only for scandoubler with HQ2x because it requires 8 times higher clock than pixel clock which is 16MHz. I've tried to execute these OUT commands after disk change - it doesn't help. OS still read wrong data in catalog. There are bits "Not Ready" and D7,D6 bits in Status 0 - they provide the way to detect the disk change. But i didn't figure out how they reset the status - uPD765 docs aren't presize enough.

It seems need to disassemble AMSDOS.ROM to understand the logic, but for me it's too hard since i'm not familiar with Amstrad and need to study its BIOS/DOS.. May be you are more familiar with this.

sorgelig commented 6 years ago

ah, you've mistyped the port number. Yeah, it works! So, need to see how it affects the u765.

sorgelig commented 6 years ago

i believe the current track buffer not marked as dirty or something like this. So, module still holds the old data.

gyurco commented 6 years ago

Oh my...sorry for the typo. But well, recalibrate did the trick not only by luck :) Sent a pull request to the new core.

gyurco commented 6 years ago

...which needs a little bit of improvement.

sorgelig commented 6 years ago

I've added auto-seek to 0 after image scan, so it works now. But i think it's not quite correct. What i've noticed (from serial console log) is that read LBA=0 is not issued upon first time catalog is read. Thus re-calibrating procedure reads LBA 0. Check u765 if there is some false condition happened to prevent read LBA0, or there is some other issues. Another potential issue i've observed - there is no wait for seek. For example re-calibrate command finishes before actual seek finish, so CPU potentially can issue another command. Main state machine should wait all processes before go to IDLE state.

gyurco commented 6 years ago

Well, seek is asynchronous in the upd765, so it is possible to issue new commands during it (however since the seek time is not emulated, it is very fast). What can be added is the commands probably return an error if they're executed during a seek in progress (don't know what happens with the real thing).

sorgelig commented 6 years ago

i've pushed the fix to repository

sorgelig commented 6 years ago

i've removed "ready" flag check from seek process. It's better to keep seek process as internal service which can be executed even when commands are prohibited externally.

sorgelig commented 6 years ago

I think, if it's correct to add motor flag to ready flag to prohibit the commands when motor is off. Currently it works without motor flag, but in real HW it shouldn't.

gyurco commented 6 years ago

On the real world, the ready comes from the FDD, so I think it is ok to error out even if a seek in progress (this internal service could emulate the head stepping, just I didn't find any practical reason to do it). And for the same reason, it is very OK to add the motor bit on it.

I saw your fix, it's similar to my first version, just I wanted to rework it to not have the int_state set to 1 after it (like if somebody did a 'real' seek).

gyurco commented 6 years ago

Well, further thinking...if the motor is off and you change the image, then it'll be a problem if the ready flag means error.

sorgelig commented 6 years ago

that's why i've removed ready from seek process.

gyurco commented 6 years ago

Yepp, I understand now. So let's leave it removed. Meanwhile I discovered another incompatibility with some CPC disks, so there's else something to do :) Btw, this cleaned up core rocks, no more problem with the ROM loading. I guess nobody can thank you enough the time you spent with it :)

sorgelig commented 6 years ago

I never had or played Amstrad, so i'm not sure how it's important to have different ROM sets. I will see feedback and will decide what to do. On MiSTer i can add special support on ARM side and have ROMs splited to 16KB parts like in original core.

sorgelig commented 6 years ago

I'm glad that you are willing to tweak u765 for better Amstrad compatibility!

gyurco commented 6 years ago

I definitely more like stable ROM loading, not just "will it load or not?".

It is nothing compared to your amount of work :) And compared to others, who did whole machines. I wonder what are you using for debugging? USBBlaster + SignalTap II?

sorgelig commented 6 years ago

USB Blaster + SignalTap. Also simulating the blocks inside my brain :)

gyurco commented 6 years ago

My In-memory-debug device started to reach its limits :) So just ordered an UsbBlaster.