c0pperdragon / OS816

An experimental single-board computer using the 65C816

Feature queries: VIA and strobes #1

Open gadyke opened 2 years ago

gadyke commented 2 years ago

Really impressed by this board and by how elegantly it handles the limitations/quirks of the chip. I had a couple of thoughts on apparently missing pieces that I'd welcome your views on:

Very happy to be kept honest on all of this!

Thanks,

Greg

c0pperdragon commented 2 years ago

It seems that you are extremely lucky in this respect. The OS816 memory space is divided into 4 parts that are addressed by the address bits A23 and A22:

  - 00 = RAM
  - 01 = I/O
  - 10 = unused
  - 11 = FLASH
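As a rough illustration of this map, here is a minimal Python sketch that classifies a 24-bit address by A23 and A22. The function name and the printed layout are made up for illustration; only the four region assignments come from the description above.

```python
# Region per (A23, A22) pair, as listed in the comment above.
REGIONS = {0b00: "RAM", 0b01: "I/O", 0b10: "unused", 0b11: "FLASH"}


def region(addr: int) -> str:
    """Return the region selected by address bits A23 and A22 (hypothetical helper)."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("65C816 addresses are 24 bits wide")
    return REGIONS[(addr >> 22) & 0b11]


# Each region covers 4 MB (0x400000 bytes).
for start in (0x000000, 0x400000, 0x800000, 0xC00000):
    print(f"{start:06X}-{start + 0x3FFFFF:06X}: {region(start)}")
```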

With the 3rd block unused (addresses 800000 - BFFFFF), you can easily map the VIA chip into this range. Even better, the VIA has two chip select lines which you can directly connect to A23 and A22 to properly select the chip. As the VIA is specifically designed to work with the signals from the 65c02 and 65c816, you only need to wire up the clock and the RWB signal directly.
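A small sketch of that chip-select wiring, assuming the VIA's active-high select (CS1) goes to A23 and its active-low select (CS2B) goes to A22. The register-select wiring (RS0-RS3 on A0-A3) is also an assumption made only for this example.

```python
def via_selected(addr: int) -> bool:
    """True when the VIA would be selected, assuming CS1=A23 and CS2B=A22."""
    cs1 = (addr >> 23) & 1   # active-high chip select
    cs2b = (addr >> 22) & 1  # active-low chip select
    return cs1 == 1 and cs2b == 0


def via_register(addr: int) -> int:
    """Register index 0-15, assuming RS0-RS3 are wired to A0-A3."""
    return addr & 0xF


# The 16 registers repeat throughout the whole 800000-BFFFFF window.
print(via_selected(0x800000), via_register(0x80000D))
print(via_selected(0xBFFFFD), via_register(0xBFFFFD))
print(via_selected(0x000000), via_selected(0xC00000))
```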

When you want to use an existing OS816 board, you will need to tap into the address and data bus and the control signals as well. Unfortunately, there is no single chip that has all signals (even the CPU does not provide A23 and A22, as these are outputs of the latch). Your best approach would be to solder wires to the underside of the relevant sockets.

When you design your own board, or use a breadboard, you are of course completely free.

HeathenUK commented 2 years ago

(Apologies for the confusion, switching out of the other account which is otherwise inactive)

Thanks for getting back to me, hugely appreciated!

I'm looking to adapt your design for my own PCB so as you say wiring to the A22/23 decoded signals is no trouble.

I'm pondering the fact that I don't need to use the '138 in order to add the VIA to the design (since, as you say, it understands the RWB signal just fine) and then adding on my thought above about adding at least one other built-in peripheral to the design (which will need a write strobe), do you see any flaw in:

  1. Rather than wiring the VIA up to the A22/A23 lines, instead wiring its CE pins to two of the unused A19/A20/A21 pins. As before, relying on the RWB signal direct from the 816; and
  2. As well as (1), using the now-spare A22/A23(=0/1) decode on the '138 to add a new read/write strobe combo hooked up to the peripheral IC (which would then be addressable at 0x800000 as above).

I don't see how there could be a problem with (2), but I'm trying to work through in my mind whether under (1) the VIA would clash with the '138 decode logic for the RAM since the RAM rd/wr strobes would (I think) fire during any read/write to the lower address bits as this would leave A23/A22 at 0/0.

EDIT: I suppose I could sacrifice the reverse combination on the '138 (A23/A22 at 0/1) and lose the input/output port to put the extra peripheral on 0x400000 and keep the VIA on A23/A22 at 1/0 as originally planned? I don't think that risks clashing with RAM...

Thanks again,

Greg

c0pperdragon commented 2 years ago

You are quite right. Approach 1 would definitely cause a bus conflict with the RAM on any read access.
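To make the conflict concrete, here is a sketch under one assumed wiring of approach 1: the VIA's active-high select on A21 and its active-low select on A20. The exact pin choice is hypothetical; any pair taken from A19/A20/A21 gives the same kind of overlap.

```python
def ram_selected(addr: int) -> bool:
    """RAM strobes are decoded for A23/A22 = 0/0."""
    return (addr >> 22) & 0b11 == 0b00


def via_selected_approach1(addr: int) -> bool:
    """Assumed wiring: CS1=A21 (active high), CS2B=A20 (active low)."""
    return (addr >> 21) & 1 == 1 and (addr >> 20) & 1 == 0


def drivers_on_read(addr: int) -> list:
    """Devices that would drive the data bus during a read of addr."""
    drivers = []
    if ram_selected(addr):
        drivers.append("RAM")
    if via_selected_approach1(addr):
        drivers.append("VIA")
    return drivers


print(drivers_on_read(0x200000))  # both devices answer: bus conflict
print(drivers_on_read(0x000000))  # RAM only
print(drivers_on_read(0xA00000))  # VIA only
```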

When you already use a VIA, you will probably not need the simple I/O circuit anyway, so the address space at 400000 is then free for your additional peripheral.
With this proper (read: expensive) solution you could even start to utilize interrupts.

HeathenUK commented 2 years ago

Thanks for running the thought experiment with me!

I have a couple of follow-up thoughts if you don't mind:

  1. I assume there's no issue with tying the same clock to the VIA as the 816? I'm still getting my head around the A16-23/Data multiplexing and exactly when different states will be valid at different points in the instruction cycles. I think it's probably fine as the '573 will have latched at the end of the previous cycle (via #CLK) to the one clocking the VIA (next CLK). The only thing that gives me pause is that the '138 decodes on CLKFAST.

  2. Am I correct in thinking that the full range of lower address bits are available to any device strobing on A23/A22 at 1/0 (i.e. replacing the single I/O port address)? That is to say, if my peripheral exposes a few register address pins (rather than just a data bus and a write pin) there's no reason this would clash at all with RAM in the way option 1 above would have done?

And way ahead of you on interrupts! I fully intend to use the timer on the VIA so will indeed hook up IRQB (I'll leave NMI pinned high).

Regards,

Greg

c0pperdragon commented 2 years ago

The bus timing is indeed a bit tricky with this processor. Every clock cycle is divided into two halves. The cycle starts when the clock line goes low. There the CPU will put the high address bits on the data bus and the low address bits on the address bus (and also sets RWB according to the access type). In the second half of the cycle (clock high), the CPU will still keep the low address bits on the address bus, but will either release the data bus (for read accesses) or put the outgoing data there (for write accesses).

A peripheral chip that needs to work with this protocol also has to tri-state the data bus in the first half of the cycle, even if otherwise selected for read. The VIA (at least the variant from Western Design Center) already does that. With the 138 wired up as in my design, the RD and WR strobes are also idle in the first half of the cycle, so this should work with all parts that follow the standard RAM access protocol.
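The two half-cycles can be summarised in a small sketch. This is only a behavioural model of what is described above (who drives the data bus, and when the strobes may be active); the function and key names are invented for illustration.

```python
def bus_state(clock_high: bool, is_read: bool) -> dict:
    """Who drives the data bus, and which strobe is active, in one half-cycle."""
    if not clock_high:
        # First half: the CPU multiplexes the high address bits onto the data
        # bus, so every peripheral must stay tri-stated and both strobes idle.
        return {"data_bus": "CPU: high address bits (A16-A23)", "rd": False, "wr": False}
    if is_read:
        # Second half of a read: the CPU releases the bus for the device.
        return {"data_bus": "selected device: read data", "rd": True, "wr": False}
    # Second half of a write: the CPU drives the outgoing data.
    return {"data_bus": "CPU: write data", "rd": False, "wr": True}


for clock_high in (False, True):
    for is_read in (True, False):
        print(clock_high, is_read, bus_state(clock_high, is_read))
```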

My clocking circuit generates this CLKFAST signal to have something that is approx. 10 ns early with respect to the CPU clock. This should about match the delay introduced by the CPU itself and tweaks the performance a tiny bit. With this arrangement I can run the machine at about 14 MHz quite reliably. Otherwise there is no real benefit in that.
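For a sense of scale, a quick calculation with the two figures quoted above (about 14 MHz, about 10 ns of lead). Both numbers are approximate, so the result is only a ballpark.

```python
CPU_CLOCK_HZ = 14_000_000  # "about 14 MHz", approximate figure from the comment
CLKFAST_LEAD_NS = 10.0     # "approx. 10 ns early", approximate figure from the comment

period_ns = 1e9 / CPU_CLOCK_HZ        # one full clock cycle
half_cycle_ns = period_ns / 2         # each bus phase lasts half a cycle
lead_fraction = CLKFAST_LEAD_NS / half_cycle_ns

print(f"period     : {period_ns:.2f} ns")
print(f"half cycle : {half_cycle_ns:.2f} ns")
print(f"CLKFAST lead is {lead_fraction:.0%} of a half cycle")
```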

c0pperdragon commented 2 years ago

Maybe there is one thing to mention about latching the high address bits: The latch used here is a transparent latch. That means it will change its output signals immediately upon change of the input signals, as long as the latch enable is active. Once the latch enable goes to inactive, the output stays as it was (maybe I have mixed up the active/inactive nomenclature here). So contrary to an edge-triggered flip-flop the data does not have to be prepared in the previous cycle but can be delivered relatively late during the cycle.
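The difference can be seen in a tiny behavioural model. The classes below are simplified illustrations, not models of any specific part, and they treat the enable and the clock as active-high for simplicity.

```python
class TransparentLatch:
    """Output follows the input while enabled, then holds the last value."""

    def __init__(self):
        self.q = 0

    def update(self, d: int, enable: int) -> int:
        if enable:
            self.q = d
        return self.q


class EdgeFlipFlop:
    """Output changes only on a rising clock edge."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def update(self, d: int, clk: int) -> int:
        if clk and not self._prev_clk:
            self.q = d
        self._prev_clk = clk
        return self.q


# (data, enable/clock) samples: the data changes late while enable is still active.
samples = [(0x11, 1), (0xFF, 1), (0xFF, 0), (0x12, 0)]
latch, flop = TransparentLatch(), EdgeFlipFlop()
latch_out = [latch.update(d, en) for d, en in samples]
flop_out = [flop.update(d, en) for d, en in samples]
print([hex(v) for v in latch_out])  # the late value 0xFF is captured
print([hex(v) for v in flop_out])   # only the value present at the edge is kept
```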

HeathenUK commented 2 years ago

Thanks for all the advice so far, it's been really helpful!

I've been radio silent on this for a few days as I've pondered how to balance the constraints here with the sorts of peripherals I think I'd like to work with (to be clear, I really enjoy that!)

Would you be willing to sense check where I think I've landed? Here's the schematic for the below:

Any insights really gratefully received as always.

Regards,

Greg

c0pperdragon commented 2 years ago

I really would not recommend having overlapping address ranges. A programming error could cause a bus conflict and potential damage.

But with two MMUs, you can actually do much better:

First MMU: Use CLKFAST/A23/A22 as enable signals and A21/A20/RWB as output selectors. This gives you access to 4 R/W pairs mapped at 000000-0FFFFF, 100000-1FFFFF, 200000-2FFFFF, 300000-3FFFFF.

Second MMU: Use CLKFAST#/A23 as enable signals (the A23 connects to the active-high enable, and you need to take CLKFAST# from U5A). Use A22/A21/RWB as output selectors. This gives 4 R/W pairs mapped at 800000-9FFFFF, A00000-BFFFFF, C00000-DFFFFF, E00000-FFFFFF.

This leaves the area 400000-7FFFFF free so you can map in the VIA.
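A sketch of this two-decoder proposal, evaluated for the second half of the clock cycle (when the strobes are enabled). The pair numbering 0-3 and the return format are hypothetical; the enables and selectors follow the description above.

```python
def decode(addr: int, is_read: bool):
    """Return (decoder, pair index, strobe) for addr, or None if neither decoder fires."""
    a23 = (addr >> 23) & 1
    a22 = (addr >> 22) & 1
    a21 = (addr >> 21) & 1
    a20 = (addr >> 20) & 1
    strobe = "RD" if is_read else "WR"
    if a23 == 0 and a22 == 0:
        # First decoder: enabled by A23=0, A22=0; A21/A20 pick a 1 MB range.
        return ("MMU1", (a21 << 1) | a20, strobe)
    if a23 == 1:
        # Second decoder: enabled by A23=1; A22/A21 pick a 2 MB range.
        return ("MMU2", (a22 << 1) | a21, strobe)
    return None  # 400000-7FFFFF stays free, e.g. for the VIA


for addr in (0x000000, 0x300000, 0x400000, 0x800000, 0xE00000):
    print(f"{addr:06X}: {decode(addr, True)}")
```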

HeathenUK commented 2 years ago

Hmm, thanks. Can I ask a couple of questions?

c0pperdragon commented 2 years ago

To clarify my proposal, let me explain what the E1 - E3 inputs of the MMU are actually doing: In order for any output to be active (low), all of the enable inputs need to be in their active state. That means E1, E2 low and E3 high. In my original OS816 I just use the active-high input to enable the RD/WR signals only in the second half of the clock cycle. I could also have used one of the other enable inputs, by feeding in the inverted CLKFAST signal. But my measurements showed that by using just the CLKFAST all signal timings matched up better.
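The enable behaviour of a '138-type decoder can be written down as a short model. This is a generic sketch of a 3-to-8 decoder with two active-low enables and one active-high enable; the function name and argument order are chosen only for this example.

```python
def hc138(e1_n: int, e2_n: int, e3: int, a2: int, a1: int, a0: int) -> list:
    """Return outputs Y0..Y7 (active low) of a 3-to-8 decoder."""
    outputs = [1] * 8  # all outputs idle (high)
    if e1_n == 0 and e2_n == 0 and e3 == 1:
        outputs[(a2 << 2) | (a1 << 1) | a0] = 0  # exactly one output goes low
    return outputs


# With the clock on the active-high enable, nothing is selected while it is low.
print(hc138(0, 0, 0, 0, 1, 1))  # first half of the cycle: all idle
print(hc138(0, 0, 1, 0, 1, 1))  # second half: Y3 is active (low)
print(hc138(1, 0, 1, 0, 1, 1))  # an active-low enable held high: all idle
```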

In my proposal I would feed A23, A22 to E1, E2 and CLKFAST to E3 (as original) of the first MMU. The second MMU needs to be active on a high A23. As there is only one active-high enable input available I just swapped the polarity of the CLK as described above so I can use a low-active enable for this.

The address overlap in your schematic is for the range 100000-3FFFFF. Performing a read on these addresses would select both RAMRD#, as well as one of IORD#, AUDIORD, SPARERD.

c0pperdragon commented 2 years ago

Maybe I could propose another solution that uses only one MMU but a bit of additional decoding logic. This solution would give you 4 RD/WR pairs and also the possibility to map in some VIA-style chips. Additionally the memory addresses as used in the OS816 software would stay the same, so there is no need to modify it.

Starting from the original OS816 schematics, you can use some logic gates (preferably from a single chip) to create a single signal from A23, A22, A21. This signal should be high when the address does not match one of the ranges needed by the OS816 system. Use this signal as the active-high CS input to the VIA-type chips (more decode logic may be necessary, depending on the IC). Also use this signal as one of the active-low enable inputs for the MMU.

For example, using a single 7400 IC, you can combine this to NOT ((A23 NAND A22) NAND A21). If I am correct, this would create a high signal for the ranges 200000-3FFFFF, 600000-7FFFFF, A00000-BFFFFF.
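The expression can be checked by enumerating all eight combinations of A23/A22/A21. In this sketch the final NOT is built from a NAND with both inputs tied together, which is how a quad-NAND package would typically provide it; the helper names are made up.

```python
def nand(a: int, b: int) -> int:
    return 1 - (a & b)


def aux_select(a23: int, a22: int, a21: int) -> int:
    """NOT ((A23 NAND A22) NAND A21), using only NAND gates."""
    inner = nand(nand(a23, a22), a21)
    return nand(inner, inner)  # NAND with tied inputs acts as an inverter


# Each combination of A23/A22/A21 covers one 2 MB block (0x200000 bytes).
high_ranges = []
for top in range(8):
    a23, a22, a21 = (top >> 2) & 1, (top >> 1) & 1, top & 1
    if aux_select(a23, a22, a21):
        start = top << 21
        high_ranges.append((start, start + 0x1FFFFF))

for start, end in high_ranges:
    print(f"{start:06X}-{end:06X}")
```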

HeathenUK commented 2 years ago

On your first response - I see! I thought there was some timing reason for using CLKFAST#, it didn't even occur to me that it was because you'd used up the active-high enables.

I think I more or less got to your reasoning in the end as I tabled out the implications for the two '138s...

image

Clearly there's room for flash etc. but the range will be completely different as you say.

Greg

HeathenUK commented 2 years ago

> Maybe I could propose another solution that uses only one MMU but a bit of additional decoding logic. This solution would give you 4 RD/WR pairs and also the possibility to map in some VIA-style chips. Additionally the memory addresses as used in the OS816 software would stay the same, so there is no need to modify it.
>
> Starting from the original OS816 schematics, you can use some logic gates (preferably from a single chip) to create a single signal from A23, A22, A21. This signal should be high when the address does not match one of the ranges needed by the OS816 system. Use this signal as the active-high CS input to the VIA-type chips (more decode logic may be necessary, depending on the IC). Also use this signal as one of the active-low enable inputs for the MMU.
>
> For example, using a single 7400 IC, you can combine this to NOT ((A23 NAND A22) NAND A21). If I am correct, this would create a high signal for the ranges 200000-3FFFFF, 600000-7FFFFF, A00000-BFFFFF.

On this second response, wouldn't this mean that each of the VIA-type devices mapped in directly would have their CE go high when any of the others was accessed? Or is the suggestion that (since most VIA-type chips have two CE pins) you would also tie a unique address pin to the other CE pin?

HeathenUK commented 2 years ago

I'm actually really excited by your original proposal (dual "MMU")! I think I can work around the one limit it creates.

As an aside I've just been skimming the monitor/bios code to see how involved it's going to be if I end up needing to change ranges around.

I assume I need to leave RAM at 00:0000, and also assume it's necessary to leave FLASH at the other end, which in this new arrangement becomes E0:0000 to FF:FFFF (for relocated vector space and to ensure nothing else is active at that end of the address range). For the latter I assume I need to adjust the x_topaddress routines but probably not the linker parameters in build.bat. Have I got that right? I'm finding the assembler/linker/librarian docs a bit obtuse.

I guess the relocations -ABOOT=FFF000,7F000, -ARESET=FFFFF0,7FFF0 and -CFFF400,7F400 are fine since 512k is 512k wherever it's mapped. The main thing I think I need to change is for the monitor/bios itself to know where to write to or jump to I suppose, so hardcoded references to $C00000 or $C7F000 (and similar) need to change accordingly to $E00000 etc.
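The "512k is 512k wherever it's mapped" reasoning can be sanity-checked with a few lines of arithmetic. This sketch assumes the flash chip only sees the low 19 address bits (A0-A18), so its contents simply repeat within whatever window selects it. That is an assumption about the wiring, not something confirmed by the linker documentation.

```python
FLASH_SIZE = 0x80000  # 512 KB


def flash_offset(cpu_addr: int) -> int:
    """Offset inside the flash image, assuming only A0-A18 reach the chip."""
    return cpu_addr & (FLASH_SIZE - 1)


# (CPU address, file offset) pairs taken from the linker options quoted above.
pairs = [(0xFFF000, 0x7F000), (0xFFFFF0, 0x7FFF0), (0xFFF400, 0x7F400)]
for cpu_addr, offset in pairs:
    print(f"{cpu_addr:06X} -> {flash_offset(cpu_addr):05X} (expected {offset:05X})")

# The same flash byte seen through the old (C0) and new (E0) base addresses.
print(f"{flash_offset(0xC7F000):05X} {flash_offset(0xE7F000):05X}")
```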

Regards,

Greg

c0pperdragon commented 2 years ago

Yes, it is necessary that RAM is visible at 00:0000. FLASH indeed needs to be at the high end because my simple circuit forces the bank address to FF during emulation mode after startup. You can change this by weakly pulling some address pins low instead of high, but you would need additional resistors instead of my simple solution with the single resistor pack. The rest is open to rearrangement, but you may have to make changes to the code.

If you are re-arranging things anyway and use a second MMU, you could as well expand the addressing capabilities like this:

First MMU like in my proposal, for the address ranges 0x:xxxx, 1x:xxxx, 2x:xxxx, 3x:xxxx.

Second MMU similar, but for the ranges Cx:xxxx, Dx:xxxx, Ex:xxxx, Fx:xxxx.

This leaves two big ranges completely free (400000-7FFFFF and 800000-BFFFFF) that can be used for two VIA-style components. The only drawback is that you need to invert at least one of the address lines A23, A22 to use as enable inputs for the second MMU. If you need a 74HC04 anyway, you could as well invert both and feed the CLKFAST into E3 as usual.
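Laid out per 1 MB block, this arrangement looks as follows. The labels ("pair 0", "VIA-style window A/B") are placeholders invented for this sketch; only the ranges come from the proposal above.

```python
def owner(addr: int) -> str:
    """Which decoder or free window a 24-bit address falls into."""
    top2 = (addr >> 22) & 0b11  # A23, A22
    pair = (addr >> 20) & 0b11  # A21, A20 select one 1 MB range
    if top2 == 0b00:
        return f"MMU1 pair {pair}"
    if top2 == 0b11:
        return f"MMU2 pair {pair}"
    return "VIA-style window A" if top2 == 0b01 else "VIA-style window B"


# Walk the 16 MB address space in 1 MB steps.
for block in range(16):
    start = block << 20
    print(f"{start:06X}-{start + 0xFFFFF:06X}: {owner(start)}")
```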

I don't have the details on the linker commands in my head right now. You may need to experiment to get things right.

gadyke commented 2 years ago

I like that, the obsessive in me prefers the evenness of having 8 1Mbyte address ranges rather than 4 1Mbyte and 4 2Mbyte!

c0pperdragon commented 2 years ago

I just looked at the clock circuit in the OS816 schematic. Maybe you can just re-arrange the inverters a bit to free one up to invert A23. Together with the trick to use CLKFAST# instead of CLKFAST you could get away without the need for an additional 74HC04.

gadyke commented 2 years ago

Which do you suggest cutting out? U5E? I know U5F is needed to clean the oscillator. I know technically that would mean inverting the clock but it makes no real difference surely?

c0pperdragon commented 2 years ago

Yes, U5E should not be all too important for the clock signal's shape.

gadyke commented 2 years ago

Great, this is looking more and more elegant a solution every moment!

gadyke commented 2 years ago

Almost a philosophical question from me on the consequence of cutting out one of the existing post-crystal inverters...

Strictly speaking that would mean CLK would now be CLK#, CLKFAST will be CLKFAST# and vice versa. I suppose in practice it makes absolutely no difference - a square wave is a square wave even when it's inverted.

Basically I was wondering if I need to do any rewiring of the various uses of those signals and came to the conclusion that it's irrelevant!

Any correction on that?

Greg

c0pperdragon commented 2 years ago

Yes, inserting/removing an inverter there makes no difference. I actually do not consider the signal from the crystal as having any specific phase until it is used for some useful logic. From this point on, of course, you need to name the things properly.

Of course you can wire up the various gates in the logic ICs as is most convenient for you.

I myself normally adjust the use of gates during the board layout to keep things as disentangled as possible.

gadyke commented 2 years ago

Great, I'll re-do the schema tonight before I get started on laying out a board.

G

HeathenUK commented 2 years ago

Right, this all looks good to me - would you be willing to skim and see if anything looks obviously wrong on the MMU side?

mainboard_MMU2.pdf

That schematic is based around this mapping out:

image

I think I'll source an 8mbit SRAM just because it can fit, 8mbit parallel NOR sadly rarer 🤣

G

c0pperdragon commented 2 years ago

Looks OK to me. With this setup you can use a RAM IC with up to 1MB, as you said. What kind of parts do you intend to use? As far as I know, there are no through-hole parts with this size available.

HeathenUK commented 2 years ago

Thanks and you're absolutely right - I'm happy enough soldering SMD components where I won't have cause to swap them on and off so I'm using a TSOP44 part.

Leaving the flash as PLCC though!

G

c0pperdragon commented 2 years ago

Maybe it is a little late now, but I actually found the perfect solution to add auxiliary ICs to the OS816 architecture without modifying the original memory map, so there is no need to modify the BIOS:

The MMU2 has its E2/E3 connected to A23/A22 so it is active only for the free range 800000-BFFFFF.

The A0-A2 inputs of the MMU2 are connected to A19-A21. This gives you 8 chip-select (CS) outputs that will be active (low) for the 8 individual ranges 800000-87FFFF, 880000-8FFFFF, ...

Each CS can be used to either drive a VIA-style IC (together with system clock and RWB), or to drive a memory-style IC (together with the previously unused output pins 10 and 11 of MMU1 as its RD and WR lines)

HeathenUK commented 2 years ago

So in this schema MMU1 gives you (as before) 8080-style ranges with a RD# and WR# for each range, while MMU2 gives you 8 "traditional" CS# pins. Is that right?

c0pperdragon commented 2 years ago

Yes, that is right. You get 3 RD#/WR# ranges (4 MB each) and 8 ranges for the CS# pins (512KB each). And the 4th R/W pair generates the access signals if they are needed for any IC that sits on one of the CS# ranges. Your sound chip for example would be connected to one CS line and to these R/W as well.

gadyke commented 2 years ago

Ok great, can I check that you actually meant E2/E3 as A22/A23 rather than A23/A22 for MMU2 if you're targeting the unused 80:0000-BF:FFFF range (since E2 is the active low)?

Just to make sure I haven't missed something. I'll probably table it out again anyway later just to get the ranges noted down.

G

c0pperdragon commented 2 years ago

Yes, I mixed that up. E3 (active high) gets A23 and one of E2/E1 (active low) gets A22. The remaining enable is simply tied to GND.
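With that corrected wiring, the chip-select ranges can be tabulated with a short sketch. It assumes E3=A23, one active-low enable on A22, the other enable at GND, and the decoder's A0-A2 inputs on A19-A21; the CS numbering 0-7 is simply the decoder's output index.

```python
def cs_index(addr: int):
    """Return the active CS output (0-7) of MMU2, or None if it is not enabled."""
    a23 = (addr >> 23) & 1
    a22 = (addr >> 22) & 1
    if a23 == 1 and a22 == 0:    # enabled only inside 800000-BFFFFF
        return (addr >> 19) & 0b111  # A21, A20, A19 pick one 512 KB range
    return None


for cs in range(8):
    start = 0x800000 + (cs << 19)
    print(f"CS{cs}: {start:06X}-{start + 0x7FFFF:06X}")
```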

gadyke commented 2 years ago

Ah good.

This seems perfect, and the most elegant solution so far - as you say it doesn't need any changes to the existing map (though agree with your thinking in Issue #3 that opportunities to make the ranges dynamic are sensible!)

G

HeathenUK commented 2 years ago

Just as a coda to this discussion and in case you're curious, this is the schematic I'm currently routing: mainboard_MMU2.pdf

"Dev flash" is just a socketed PLCC I'll use until I'm happy the software is stable, then I'll solder in the (much denser) TSOP.