We want a testing harness in Python; we want to be able to write:
import torch
from ttmlir.dialects import ttir

# What the test harness could look like
class Add:
    def golden(self, a, b):
        return a + b

    def build(self, a, b):
        return ttir.add(a, b)
class TTIRBuilder:
    def create_ttir_tensor(self, torch_tensor):
        ...

    def create_ttir_add(self, lhs, rhs):
        ...
# What a test definition could look like
def test_add(builder):
    torch.manual_seed(0)
    in0 = torch.randn(64, 128)
    torch.manual_seed(1)
    in1 = torch.randn(64, 128)
    golden = in0 + in1
    ttir_tensor0 = builder.create_ttir_tensor(in0)
    ttir_tensor1 = builder.create_ttir_tensor(in1)
    out = builder.create_ttir_add(ttir_tensor0, ttir_tensor1)
    builder.finish(input_seeds=[0, 1], golden_outputs=[golden])
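To make the intended flow concrete, here is a minimal, hypothetical sketch of the golden-bookkeeping half of the builder. It does not emit any MLIR; it only shows how a builder could record inputs and goldens so that finish() can later embed them in the flatbuffer. The class and attribute names here are assumptions, not the real TTIRBuilder API.

```python
import torch

class GoldenRecordingBuilder:
    """Sketch: tracks torch tensors and goldens; no MLIR is emitted."""

    def __init__(self):
        self.inputs = []          # torch tensors handed to the builder
        self.input_seeds = []     # seeds used to generate the inputs
        self.golden_outputs = []  # expected outputs to embed

    def create_ttir_tensor(self, torch_tensor):
        # The real builder would create an MLIR tensor value; here we
        # just track the torch tensor and return it as the "handle".
        self.inputs.append(torch_tensor)
        return torch_tensor

    def create_ttir_add(self, lhs, rhs):
        # The real builder would emit a ttir.add op; we model its
        # semantics with torch so the sketch is runnable.
        return lhs + rhs

    def finish(self, input_seeds, golden_outputs):
        # The real builder would serialize seeds/goldens into the
        # flatbuffer's debug info at this point.
        self.input_seeds = list(input_seeds)
        self.golden_outputs = list(golden_outputs)

# Drive it exactly the way test_add above would:
builder = GoldenRecordingBuilder()
torch.manual_seed(0)
in0 = torch.randn(64, 128)
torch.manual_seed(1)
in1 = torch.randn(64, 128)
golden = in0 + in1
t0 = builder.create_ttir_tensor(in0)
t1 = builder.create_ttir_tensor(in1)
out = builder.create_ttir_add(t0, t1)
builder.finish(input_seeds=[0, 1], golden_outputs=[golden])
assert torch.equal(out, golden)
```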
This would build an MLIR graph in TTIR, lower it to the ttmetal dialect, and then serialize it to a flatbuffer. It will also embed the golden information directly in the flatbuffer.
TTRT will then be able to pop open the embedded golden info, regenerate the same inputs using the same seeds, and compare the embedded golden output against the run from the device. Sync with Taps regarding this golden support in TTRT, which doesn't exist yet.
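The ttrt-side check described above can be sketched as follows. Everything here is an assumption about the eventual contract, not existing ttrt code: `regen_input`, `check_golden`, and the `allclose` tolerance are all hypothetical names.

```python
import torch

def regen_input(seed, shape):
    # Regenerate a deterministic input from its embedded seed, the same
    # way the test harness originally generated it.
    torch.manual_seed(seed)
    return torch.randn(*shape)

def check_golden(seeds, shapes, golden, run_on_device, atol=1e-5):
    # Hypothetical contract: ttrt regenerates the inputs from the
    # embedded seeds, runs the program on device, and compares the
    # result against the embedded golden output.
    inputs = [regen_input(s, shp) for s, shp in zip(seeds, shapes)]
    device_out = run_on_device(*inputs)
    return torch.allclose(device_out, golden, atol=atol)

# Simulate the device with the same eltwise add, for illustration only.
golden = regen_input(0, (64, 128)) + regen_input(1, (64, 128))
ok = check_golden([0, 1], [(64, 128), (64, 128)], golden,
                  run_on_device=lambda a, b: a + b)
assert ok
```

A real device run would replace the lambda with a call into the ttrt runtime; the comparison tolerance would likely need tuning per data type.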
Reference:
- test/python/tensor_layout.py: this test already demonstrates creating MLIR from Python.
- test/python/simple_kernel.py: a super prototype of writing a kernel in Python and translating it to MLIR. This test is fairly deprecated at this point and we'll almost certainly want to remove/change it, but it does serve as a good reference for building an MLIR graph in Python.
In the short term we should just bake all golden information directly into the flatbuffer in the debug_info.fbs area. We can have a contract with ttrt that knows to look for golden info in that area and, if it exists, automatically does golden comparison / populates input data from there.
debug_info.fbs:

table GoldenTensorDataBytes {
  data: [uint8];
};

table GoldenTensorDataSeed {
  seed: uint64;
};

table GoldenTensorDataURL {
  url: string; // local path or URL
};

union GoldenTensorData {
  GoldenTensorDataBytes,
  GoldenTensorDataSeed, // placeholder for the future: store just the seed for randomly generated inputs instead of the full tensor data inline
  GoldenTensorDataURL,  // placeholder for the future: point to real weights / inputs
};

table GoldenTensor {
  ref: TensorRef; // reference to the tensor in the program that this golden corresponds to; shape info can be inferred from here too
  data: GoldenTensorData;
};

table GoldenInfo {
  golden_tensors: [GoldenTensor];
};

table DebugInfo {
  ...
  golden_info: GoldenInfo;
}
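To illustrate the shape of the data this schema carries, here is a plain-Python model of the tables above. This is purely illustrative: real producer/consumer code would use the flatbuffers-generated classes for these tables, not hand-written dataclasses, and `ref` is a string stand-in for the real TensorRef.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class GoldenTensorDataBytes:
    data: bytes  # full tensor data stored inline

@dataclass
class GoldenTensorDataSeed:
    seed: int    # future: regenerate the input from this seed

@dataclass
class GoldenTensorDataURL:
    url: str     # future: local path or URL to real weights / inputs

# Mirrors the GoldenTensorData union.
GoldenTensorData = Union[GoldenTensorDataBytes,
                         GoldenTensorDataSeed,
                         GoldenTensorDataURL]

@dataclass
class GoldenTensor:
    ref: str     # stand-in for TensorRef; identifies the program tensor
    data: GoldenTensorData

@dataclass
class GoldenInfo:
    golden_tensors: List[GoldenTensor] = field(default_factory=list)

# Short-term plan: inline bytes; seed/URL variants are placeholders.
info = GoldenInfo([GoldenTensor(ref="out0",
                                data=GoldenTensorDataBytes(b"\x00" * 16))])
assert isinstance(info.golden_tensors[0].data, GoldenTensorDataBytes)
```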