Closed — keneoneth closed this issue 11 months ago
Hi, yes, you are correct that these precisions are assumed. Traditional accelerators use 8 bits for the operands and a higher precision for the partial output sums. Here, 16 bits is assumed, but 24 bits is also used frequently.
If your accelerator uses a larger precision, this will affect the cost estimation: data fetches to/from the memories become more expensive, and fewer operands fit in the lower memory levels. You can modify `operand_precision` accordingly.
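As a minimal sketch of what overriding the precisions could look like in a ZigZag-style manual workload definition (the key names here follow the convention in ZigZag's example workloads and may differ across versions, so treat them as illustrative, not authoritative):

```python
# Hypothetical manual workload entry overriding the default 8-bit precisions.
# "O" is the partial-sum precision, "O_final" the written-back output,
# "W" the weights, and "I" the inputs (names assumed from example workloads).
workload = {
    0: {
        "operator_type": "Conv",
        # Partial sums kept wider (24 bits) than the 8-bit operands,
        # matching the discussion above.
        "operand_precision": {"O": 24, "O_final": 8, "W": 8, "I": 8},
    }
}

def weight_footprint_bits(layer: dict, num_weights: int) -> int:
    """Illustrative helper: weight-memory footprint in bits for one layer.
    A wider "W" precision directly inflates this, which is how a larger
    precision propagates into the memory-cost estimation."""
    return layer["operand_precision"]["W"] * num_weights

print(weight_footprint_bits(workload[0], 1024))  # 8 bits/weight * 1024 weights
```

The point of the helper is only to show the mechanism: doubling `"W"` from 8 to 16 bits doubles the weight storage and, correspondingly, the per-fetch memory cost.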
I can see from this line of code https://github.com/KULeuven-MICAS/zigzag/blob/54cf7c252e477aad99f46211c6363826552c08ff/zigzag/classes/io/onnx/conv.py#L119 that the precision of the I, W, and O operands is hard-coded to 8 during conv parsing. Also, from the code segment https://github.com/KULeuven-MICAS/zigzag/blob/54cf7c252e477aad99f46211c6363826552c08ff/zigzag/classes/io/onnx/conv.py#L164-L167, it seems the data types of the layer's I/W/O operands are retrieved but never used afterwards. Is there an assumption behind this (e.g., that the model is assumed to have precision = 8)? Would that affect the cost estimation? Any help would be much appreciated!