This change makes it possible to parallelize data moves and matrix multiplies by moving all host IO to the secondary memory port of the local cache. We introduce a host router to handle multiplexing the dual DRAM ports into that memory port, and remove host IO ports from the existing router. This has the incidental benefit of adding support for the memory=>DRAM1 data move operation.
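As a rough illustration of the routing described above, here is a minimal behavioral sketch in Python. It is not the actual hardware implementation: the `HostRouter` class, its method names, and the dictionary-backed memories are all hypothetical, chosen only to show how one router selecting between two DRAM ports in both directions makes the memory=>DRAM1 move available alongside the others.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HostRouter:
    """Hypothetical behavioral model of a host router.

    Connects one of two DRAM ports to the secondary memory port of the
    local cache. Memories are modeled as sparse address -> value dicts.
    """

    dram: List[Dict[int, int]] = field(default_factory=lambda: [{}, {}])
    memory: Dict[int, int] = field(default_factory=dict)

    def _check_port(self, port: int) -> None:
        # Only two DRAM ports exist in this sketch: DRAM0 and DRAM1.
        if port not in (0, 1):
            raise ValueError(f"unknown DRAM port: {port}")

    def dram_to_memory(self, port: int, dram_addr: int, mem_addr: int, size: int) -> None:
        """Copy `size` words from the selected DRAM port into local memory."""
        self._check_port(port)
        for i in range(size):
            self.memory[mem_addr + i] = self.dram[port].get(dram_addr + i, 0)

    def memory_to_dram(self, port: int, mem_addr: int, dram_addr: int, size: int) -> None:
        """Copy `size` words from local memory out to the selected DRAM port."""
        self._check_port(port)
        for i in range(size):
            self.dram[port][dram_addr + i] = self.memory.get(mem_addr + i, 0)


# Usage: load two words from DRAM0, then write them back out through DRAM1.
router = HostRouter()
router.dram[0].update({0: 11, 1: 22})
router.dram_to_memory(port=0, dram_addr=0, mem_addr=100, size=2)
router.memory_to_dram(port=1, mem_addr=100, dram_addr=8, size=2)
```

Because the port selection is a parameter rather than a separate code path, both directions work for either DRAM port in this model, which mirrors why the memory=>DRAM1 move falls out of the new routing.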