Open pgkeller opened 2 months ago
Something to consider, code such as this: uint32_t CreateSemaphore( Program &program, const std::variant<CoreRange, CoreRangeSet> &core_spec, uint32_t initial_value, CoreType core_type) { with a CoreCoord containing an enum-type can mix and match types through the API. This may be ok for some APIs and not for others, we don't want to introduce a high burden of API validation
we currently denote dram channels as 0..n and then map to a single CoreCoord, but we may want to uplift this to expose multiple coordinates for a given dram channel
this might be a tricky thing to test out, but I think in some cases the virtual coord might not work for dram sharded matmul (or other dram read critical ops). If we use virtual coords, then we will always have the mapping in the image. If the harvested rows are row 5 and 7, then the mapped cores (W2, W3, W10, W11, W8, W7) will be shifted down (second image). Then we will have multiple cores in one column, while dram bank is not next to it, and causing congestions.
@yugaoTT ok, took me a bit to process that, but I see the problem. And so you are saying that to get optimal placement in this scenario the host needs to understand the physical coordinates?
I had a thought last night, here's a proposal: 1) Use strongly typed coordinates for the API, eg, EthCoreCoord, TensixCoreCoord, etc (names up for debate) 2) APIs are templated for the different coordinate types 3) Internally set an enum in the base CoreCoord w/ the type info (either at creation time or when the API is called) 4) Internally use generic coordinates w/ type info
@pgkeller yeah that's right.
@pgkeller yeah that's right.
I'm thinking we expose physical coords behind a "detail::" API. In other words, it isn't a first class API but provided as a semi-supported way of getting better perf, particularly since we don't think it'll be needed for BH and beyond
I had a thought last night, here's a proposal:
- Use strongly typed coordinates for the API, eg, EthCoreCoord, TensixCoreCoord, etc (names up for debate)
- APIs are templated for the different coordinate types
- Internally set an enum in the base CoreCoord w/ the type info (either at creation time or when the API is called)
- Internally use generic coordinates w/ type info
This makes sense to me, it doesn't seem like we would want a logical view of heterogeneous core types. I don't think we would support buffers being interleaved across different core types and wouldn't create kernels across different core types in one api
Change CoreCoord logical/physical as follows:
Pending: performance testing w/ physical/logical coords for kernels highly dependent on DRAM traffic
See also #5783
@abhullar-tt @aliuTT @tt-asaigal @yan-zaretskiy @yugaoTT @davorchap @pgkeller