aap / pdp6

PDP-6 Emulator
MIT License

Caching for network shared memory #30

Open larsbrinkhoff opened 2 months ago

larsbrinkhoff commented 2 months ago

I jotted down a diagram for states I imagine a cache for network shared memory could have. This could be per-region or even on a single word basis.

[Screenshot from 2024-04-12: state diagram for the shared-memory cache states]

larsbrinkhoff commented 2 months ago

Going from an "inaccessible" state, a memory location passes through an acquire state which waits for confirmation from the other side(s). If both sides try to access the same location at the same time, there would have to be some kind of retry.

Once confirmed, a read-only location would be in a shared state between all processors. A write (and read) location is exclusive to a single processor that "owns" it.

A read-only location could be changed to a write location, either locally (going through the acquire state) or remotely, in which case it becomes inaccessible locally. A locally exclusive read/write location is changed to the shared read state if another processor wants to read it.
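The transitions described above can be sketched as a small state machine. This is only an illustration of the scheme in the comments, not code from the emulator; all names (`next_state`, the enum values) are hypothetical:

```c
/* Per-word (or per-region) cache state, as in the diagram.
   Names are hypothetical, not from the emulator source. */
enum cache_state {
    INACCESSIBLE,   /* data unknown; owned elsewhere or never fetched */
    ACQUIRING,      /* waiting for confirmation from the other side(s) */
    SHARED,         /* read-only copy held by all processors */
    EXCLUSIVE       /* read/write, owned by this processor only */
};

/* Events that drive transitions. */
enum cache_event {
    LOCAL_READ, LOCAL_WRITE,   /* this processor touches the word */
    ACK,                       /* remote side confirms our acquire */
    REMOTE_READ, REMOTE_WRITE  /* another processor touches the word */
};

/* Pure transition function: returns the next state.
   *want_write records whether the pending acquire is for writing. */
enum cache_state
next_state(enum cache_state s, enum cache_event e, int *want_write)
{
    switch (s) {
    case INACCESSIBLE:
        if (e == LOCAL_READ || e == LOCAL_WRITE) {
            *want_write = (e == LOCAL_WRITE);
            return ACQUIRING;
        }
        return INACCESSIBLE;
    case ACQUIRING:
        if (e == ACK)
            return *want_write ? EXCLUSIVE : SHARED;
        return ACQUIRING;           /* collision: real code would retry */
    case SHARED:
        if (e == LOCAL_WRITE) {
            *want_write = 1;
            return ACQUIRING;       /* upgrade locally via acquire */
        }
        if (e == REMOTE_WRITE)
            return INACCESSIBLE;    /* remote takes exclusive ownership */
        return SHARED;
    case EXCLUSIVE:
        if (e == REMOTE_READ)
            return SHARED;          /* downgrade so the reader can share */
        if (e == REMOTE_WRITE)
            return INACCESSIBLE;
        return EXCLUSIVE;
    }
    return s;
}
```

This is essentially a MESI-style protocol with the Modified and Exclusive roles collapsed into one owned read/write state.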

larsbrinkhoff commented 2 months ago

The basic idea is that a memory location (word or region) is either shared read-only between all processors, or exclusive read/write for one processor and inaccessible to the rest. "Inaccessible" means memory data is unknown. When you get an ACK out of the acquire state the remote side sends its data. The only slow thing is if both sides are writing the same place, but there's not much to do about that.
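One way to picture the handshake: the acquire request goes out, and the owning side answers with an ACK that carries the current contents, so the requester never sees unknown data; a busy owner refuses and forces a retry. A hypothetical wire format (nothing here is an existing protocol of the emulator):

```c
#include <stdint.h>

/* Hypothetical wire messages for the acquire handshake.
   A PDP-6 word is 36 bits, carried here in a uint64_t. */
enum msg_type { MSG_ACQ_READ, MSG_ACQ_WRITE, MSG_ACK, MSG_NAK };

struct msg {
    uint8_t  type;     /* one of msg_type */
    uint32_t addr;     /* word address in the shared region */
    uint64_t data;     /* valid only for MSG_ACK: current contents */
};

/* Owner side: answer an acquire with the data (ACK), or refuse (NAK)
   if we are mid-acquire ourselves, forcing the requester to retry. */
struct msg
handle_acquire(const struct msg *req, const uint64_t *mem, int busy)
{
    struct msg reply = { MSG_NAK, req->addr, 0 };
    if (!busy) {
        reply.type = MSG_ACK;
        reply.data = mem[req->addr];   /* ship current contents */
    }
    return reply;
}
```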

philbudne commented 2 months ago

What is the application? The MIT-AI KA/6 pairing? Just from reading the issue, here are thoughts that sprang to mind...

Since neither processor had cache for software to be aware of, I'd be tempted to have all processors reference the shared memory (mappings of a file) using C "volatile", pay the penalty of repeated accesses rather than the compiler optimizing memory access, and let the host hardware deal with cache coherency, at least for the one moby of memory the '6 could possibly access. My recall is that PDP-10 memory bus access cycles could be read, write, or atomic read-modify-write, so there would need to be a memory cycle lock...
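The suggestion above, for the single-host case, might look roughly like this: a file mapped `MAP_SHARED` by both emulator processes, `volatile` word accesses so the compiler re-reads memory every time, and a lock for atomic read-modify-write cycles. This is a sketch under those assumptions (layout, sizes, and names are illustrative, not the emulator's):

```c
#include <fcntl.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define MOBY_WORDS 262144                   /* one moby = 256K words */

/* Shared memory image: one moby of 36-bit words in 64-bit slots,
   plus a lock word implementing the memory cycle lock. */
struct shmem {
    atomic_flag cycle_lock;                 /* memory cycle lock */
    volatile uint64_t word[MOBY_WORDS];     /* defeat compiler caching */
};

static struct shmem *
map_shared(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd < 0)
        return MAP_FAILED;
    if (ftruncate(fd, sizeof(struct shmem)) < 0) {
        close(fd);
        return MAP_FAILED;
    }
    void *p = mmap(NULL, sizeof(struct shmem),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return p;
}

/* An atomic read-modify-write cycle (exchange), as a bus would do
   for instructions like EXCH: read old, write new, under the lock. */
static uint64_t
exch_cycle(struct shmem *m, uint32_t addr, uint64_t val)
{
    while (atomic_flag_test_and_set(&m->cycle_lock))
        ;                                   /* spin: cycles are short */
    uint64_t old = m->word[addr];
    m->word[addr] = val & 0777777777777;    /* keep 36 bits */
    atomic_flag_clear(&m->cycle_lock);
    return old;
}
```

Plain reads and writes go straight to `m->word[addr]` and the host hardware keeps the processes coherent; only read-modify-write cycles need the lock.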

Since the word "network" is in the title: Are you thinking of having CPUs on different nodes of a network? In that case, I wonder if you could access the shared memory using a "memory cable" protocol? Each simulated CPU could have a per-moby, or per fixed size "memory box" mapping table (mapping high order physical address bits to a memory access object, which might be trivial, local memory, or a connection to a shared memory server)
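The per-moby mapping table might be sketched like this, with the high physical-address bits selecting a memory access object; the names and sizes are made up for illustration:

```c
#include <stdint.h>
#include <stddef.h>

/* A memory access object: may be trivial local memory, or a proxy
   that talks to a shared memory server over a connection. */
struct memobj {
    uint64_t (*read)(struct memobj *, uint32_t offset);
    void     (*write)(struct memobj *, uint32_t offset, uint64_t w);
    void *priv;                     /* local array, socket, ... */
};

#define MOBY_SHIFT 18               /* 256K words per moby */
#define MOBY_MASK  0777777          /* offset within a moby */

struct cpu_mem {
    struct memobj *moby[16];        /* high address bits -> object */
};

static uint64_t
mem_read(struct cpu_mem *m, uint32_t phys)
{
    struct memobj *o = m->moby[phys >> MOBY_SHIFT];
    return o ? o->read(o, phys & MOBY_MASK) : 0;  /* nonexistent memory */
}

/* Trivial local-memory object backed by a flat array. */
static uint64_t
local_read(struct memobj *o, uint32_t off)
{
    return ((uint64_t *)o->priv)[off];
}
```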

larsbrinkhoff commented 2 months ago

Yes, the application would mainly replicate MIT's KA10 + PDP-6. But there's a similar situation between the KA10 and the PDP-11 front ends. We already have a TCP-based protocol for this (a "memory cable"), but it's somewhat slow. I have experimented with using shared memory on the host to implement the 10-11 interface, and it would be applicable to the PDP-6 as well.

This issue is specifically about shared memory using a network transport, one that is smarter about caching memory.

larsbrinkhoff commented 2 months ago

@aap raised the valid point that this scheme is general and a bit complex. For the PDP-6, the common use case is for the KA10 to store a program, and then the PDP-6 will take over and run it.

To that I would like to counter that some PDP-6 programs do communicate back and forth with the KA10. We still haven't explored much in this area, so we don't quite know which programs do what, how much they talk between the processors, and what the communication style is.

And also, I have in mind the 10-11 interface, where we do know that there are shared memory areas used for communication. The style is usually to have an area that is written by one processor and read-only to the other. (For this case, maybe it would be better to send the data over the network rather than make it inaccessible to the read-only side.)
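For such one-writer areas, the parenthetical suggestion could be a per-region policy: instead of bouncing words through the inaccessible state, the writer pushes updates so the reader's copy stays valid. A sketch of that decision, with all names hypothetical:

```c
/* Hypothetical per-region policy for the 10-11 style layout:
   one processor writes, the other only reads. */
enum region_policy {
    POLICY_COHERENT,   /* full acquire/invalidate state machine */
    POLICY_PUSH        /* writer broadcasts updates; reader stays shared */
};

struct region {
    enum region_policy policy;
    int we_are_writer;
};

/* What a local write puts on the wire: an invalidate (forcing the
   other side to re-acquire) or the updated data itself. */
enum wire_op { OP_INVALIDATE, OP_UPDATE, OP_NONE };

static enum wire_op
on_local_write(const struct region *r)
{
    if (r->policy == POLICY_PUSH)
        return r->we_are_writer ? OP_UPDATE : OP_NONE;
    return OP_INVALIDATE;
}
```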