A Constructive Look At Pilkington's TS1 FPGA
Did you know that Pilkington - a glass manufacturer - once designed an FPGA? As you might have imagined, it wasn't very successful. One source suggests the fabs couldn't get the FPGA to work (for some unspecified definition of "work"), or perhaps that the Pilkington toolchain was not as mature as it could have been for a commercial offering.
But instead of saying "stick to your knitting", I tracked down a white paper and the architecture patent. These two conflict on some pieces of information, so I'll reference the patent whenever that happens. Let's take a look, shall we?
A Brief Description Of The Pilkington Architectures
Pilkington has two (known) architectures, both based on a sea-of-gates design, where a logic function is built up by linking gates together. This comes from the mid-80s, when different manufacturers had very different logic cell structures - only Xilinx had the lookup tables we're now familiar with.
The first architecture (I don't know of any particular name for it) was designed around a logic cell with a NAND gate and a latch, with each logic cell feeding into its adjacent neighbours through local interconnect. The white paper suggests that this was inefficient because common functions like OR and XOR need a lot of NAND gates, and because building a D flip-flop out of latches imposes placement restrictions. I'm not going to go into much more detail about it though, as it's not the focus of this article; if you want to read more, I think its architecture patent or the Toshiba paper are the places to start.
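To make the "lot of NAND gates" complaint concrete, here is a small sketch of the textbook NAND-only constructions of OR and XOR. The function names are mine, and this says nothing about how Pilkington's tools actually mapped logic; it only counts gates.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)


def or_from_nands(a: int, b: int) -> int:
    # OR(a, b) = NAND(NOT a, NOT b); each NOT is a NAND with its inputs tied.
    return nand(nand(a, a), nand(b, b))  # 3 NAND gates


def xor_from_nands(a: int, b: int) -> int:
    # The classic four-gate XOR: one shared NAND, then one per input, then a final NAND.
    shared = nand(a, b)
    return nand(nand(a, shared), nand(b, shared))  # 4 NAND gates


# Exhaustively check both constructions against Python's own operators.
for a in (0, 1):
    for b in (0, 1):
        assert or_from_nands(a, b) == (a | b)
        assert xor_from_nands(a, b) == (a ^ b)
```

Three cells for an OR and four for an XOR, before you've routed anything between them.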
The second architecture (the white paper calls it TS1, so TS1 it is) made me laugh when I looked at the diagrams in the patent. Logic optimisation programs like ABC represent logic as a structure of AND gates, XOR gates (which are common, but difficult to represent as ANDs), multiplexers (which are special cases that want to be preserved), and D flip-flops (as synchronous elements); all with per-input programmable inversion. Pilkington built an implementation of that in hardware in the mid-90s, when logic optimisers of the time were still using unwieldy sum-of-products form. It's quite ahead of its time, honestly.
TS1 Logic Cell
The logic cell really is simplicity itself: the combinational logic element just sources the inputs from input selector muxes, optionally inverts them, feeds them to the inputs of a NAND, XOR and MUX, selects an output from them, and then inverts it to amplify the signal.
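As a rough behavioural sketch of that description (not taken from the patent), the combinational element can be modelled in a few lines of Python. The three inputs `a`, `b` and `s`, the use of `s` as the MUX select, and the `CombCellConfig` encoding are all my assumptions for illustration; the input selector muxes are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CombCellConfig:
    """Hypothetical configuration bits for one combinational element."""
    invert_a: bool = False
    invert_b: bool = False
    invert_s: bool = False
    function: str = "nand"  # which gate drives the output: "nand", "xor" or "mux"


def comb_cell(cfg: CombCellConfig, a: int, b: int, s: int = 0) -> int:
    # Optional per-input inversion.
    a ^= int(cfg.invert_a)
    b ^= int(cfg.invert_b)
    s ^= int(cfg.invert_s)
    # All three gates see the same inputs; the configuration picks one of them.
    gates = {
        "nand": 1 - (a & b),
        "xor": a ^ b,
        "mux": b if s else a,  # assumed: s selects between a and b
    }
    # The selected value goes through the final inverting output stage.
    return 1 - gates[cfg.function]


# With no input inversion, NAND plus the output inverter behaves as AND.
assert comb_cell(CombCellConfig(), 1, 1) == 1
# Inverting both inputs turns the same cell into a NOR.
assert comb_cell(CombCellConfig(invert_a=True, invert_b=True), 0, 0) == 1
# XOR plus the output inverter is XNOR; inverting one input gives XOR back.
assert comb_cell(CombCellConfig(invert_a=True, function="xor"), 1, 0) == 1
```

Under these assumptions a single cell covers AND, NAND-style, OR-style, XOR, XNOR and 2:1 mux functions just by flipping inversion bits, which is the point of the per-input inverters.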
The sequential logic element is along the same lines, except that the XOR is gone, and the MUX feeds into a DFF instead.
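A similarly hedged sketch of the sequential element: a NAND and a MUX, with the MUX output latched by a D flip-flop on each clock edge. The `SeqCell` class, its `tick` method and the choice between a "nand" and a "dff" output are hypothetical names and assumptions of mine; input and output inversion are ignored here to keep it short.

```python
class SeqCell:
    """Toy model of a sequential element: NAND, plus a MUX feeding a DFF."""

    def __init__(self, function: str = "dff") -> None:
        if function not in ("nand", "dff"):
            raise ValueError(f"unknown function: {function}")
        self.function = function
        self.q = 0  # flip-flop state, assumed to reset to 0

    def tick(self, a: int, b: int, s: int) -> None:
        """Clock edge: the DFF captures the MUX output (s picks a or b)."""
        self.q = b if s else a

    def output(self, a: int, b: int) -> int:
        """Value driven out of the cell for the current inputs."""
        if self.function == "nand":
            return 1 - (a & b)
        return self.q


cell = SeqCell()
cell.tick(a=1, b=0, s=0)   # MUX passes a, so the flop stores 1
assert cell.output(0, 0) == 1
cell.tick(a=1, b=0, s=1)   # MUX passes b, so the flop stores 0
assert cell.output(0, 0) == 0
```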
The output of a logic cell is directly connected to "local interconnect": fast links to the A and B input selectors of adjacent cells (patent Figure 2.4.B). It may also connect to "medium interconnect": slower horizontal and vertical links - 6 per row/column - spread through a "zone" of logic cells.
TS1 Routing
Logic cells are grouped into a square of 3 combinational logic elements and a sequential logic element (Figure 2.5.B). These tiles come in two variants, A (Figure 2.4.C) and B (Figure 2.4.D), differing only in how they connect to the medium interconnect, and A and B are tiled together to form a zone of 5x5 tiles (Figure 2.4.E).
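The arithmetic for one zone follows directly from those numbers. A quick sketch (the function and constant names are mine, and I make no assumption about how the A and B variants are arranged within the zone):

```python
COMB_PER_TILE = 3   # combinational logic elements per tile
SEQ_PER_TILE = 1    # sequential logic elements per tile
ZONE_SIDE = 5       # a zone is ZONE_SIDE x ZONE_SIDE tiles


def zone_cell_counts(zone_side: int = ZONE_SIDE) -> dict[str, int]:
    """Count tiles and logic elements in one square zone."""
    tiles = zone_side * zone_side
    return {
        "tiles": tiles,
        "combinational": tiles * COMB_PER_TILE,
        "sequential": tiles * SEQ_PER_TILE,
        "total_cells": tiles * (COMB_PER_TILE + SEQ_PER_TILE),
    }


counts = zone_cell_counts()
assert counts == {
    "tiles": 25,
    "combinational": 75,
    "sequential": 25,
    "total_cells": 100,
}
```

So each zone holds 100 logic cells, a quarter of which are sequential.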
Each zone is surrounded by port cells that interface with "global interconnect": 4 lines along each row and column of zones to connect them together. This is - by modern standards - not very much global interconnect, so the placement tool has to make the most of the significantly more generous local routing per zone. The designers have another trick up their sleeve: if you make the logic gates fast enough, you can route logic through the gates without too much performance loss; this is why there are no direct links between the rows and columns of medium interconnect - those links are the logic gates.
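To illustrate the route-through trick, here is a minimal sketch of a logic cell used purely as a wire. It assumes a cell can drive both NAND inputs from the same incoming line and that the output stage inverts, as described for the logic cell above; the function name is hypothetical.

```python
def feed_through(x: int) -> int:
    """Pass a signal through a cell unchanged, using only the NAND path."""
    nand_out = 1 - (x & x)  # NAND with both inputs tied together acts as NOT
    return 1 - nand_out     # the inverting output stage restores the polarity


# A cell configured this way behaves as a buffer between two interconnect lines.
assert feed_through(0) == 0
assert feed_through(1) == 1
```

The cost is one cell's worth of delay and one cell that can't be used for logic, which is why the trick only pays off if the gates are fast.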
Thoughts
This design feels really elegant to me, compared to some of the stranger logic families of the time (such as Actel's multiplexer architecture). The gates chosen are small and simple and easy for software to handle: synthesis isn't really a problem. Splitting the logic into zones was explicitly intended to make place-and-route easier by turning it into a divide-and-conquer problem, but perhaps modern algorithms would be fine treating it as a global problem (I can't see placement here being significantly trickier than on, say, an ECP5 85k).
So, perhaps, the implementation of the architecture had bugs that couldn't be fixed, or the 90s tooling was unable to exploit the hierarchy of the chip well enough. I think we should reconsider this architecture in light of the modern open tooling we have, both ASIC and FPGA. It's a lot more feasible to make the most of it nowadays than it was in the mid-90s, when you had to roll your own toolchain.
Perhaps the actual moral should be "good ideas are timeless". This one certainly feels that way.