asfiyab-nvidia opened 6 months ago
Thanks for reporting. I think https://github.com/microsoft/onnxscript/pull/1484 is related. Do you know if the onnxscript version you have is the latest?
Thanks for linking the related issue. I tried the latest onnxscript==0.1.0.dev20240515 and confirmed that the changes from the linked PR are included in that nightly. However, I'm still seeing the same error.
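For reference, a quick way to confirm which onnxscript build is actually being picked up (standard library only):

```python
# Sanity check that the nightly build is the one actually installed.
from importlib import metadata

print(metadata.version("onnxscript"))  # expect 0.1.0.dev20240515 or newer
```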
cc @gramalingam
Hi @asfiyab-nvidia : can you attach the (unoptimized) onnx model here? That would be helpful. I believe that if the optimizer fails, it will still produce an unoptimized onnx model. Thanks!
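A minimal sketch of running the optimizer as a separate step so the unoptimized model survives on disk, assuming the standalone onnxscript.optimizer API (the file names here are hypothetical):

```python
import onnx
from onnxscript import optimizer

# "falcon-rw-1b.onnx" is a hypothetical path to the unoptimized export.
model = onnx.load("falcon-rw-1b.onnx")
try:
    optimized = optimizer.optimize(model)
    onnx.save(optimized, "falcon-rw-1b.optimized.onnx")
except Exception as e:
    # If optimization fails, the unoptimized model on disk is still intact
    # and can be attached for debugging.
    print(f"optimization failed: {e}")
```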
@justinchuby : while waiting for the model to repro, I wonder where "dtype((numpy.uint16, [('bfloat16', '<u2')]))" comes from ... it seems like ml_dtypes is a possible source for this? Even so, it doesn't add up ... I think ml_dtypes is used in the IR, right? But the constant-folding optimizer doesn't yet use the new IR ... oh, well, I guess I should try it out with the actual model.
You are right that ml_dtypes doesn't kick in at this stage yet. It looks like a product of the reference evaluator (most likely due to a Cast node). I suggest we use ml_dtypes in the reference evaluator (and across ONNX) as well.
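A minimal sketch reproducing that hypothesis, assuming only onnx and numpy: running a single Cast node through the reference evaluator yields exactly the dtype from the error message.

```python
import numpy as np
from onnx import TensorProto, helper
from onnx.reference import ReferenceEvaluator

# One-node model: Cast float32 -> bfloat16.
node = helper.make_node("Cast", ["x"], ["y"], to=TensorProto.BFLOAT16)
graph = helper.make_graph(
    [node],
    "cast_repro",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [2])],
    [helper.make_tensor_value_info("y", TensorProto.BFLOAT16, [2])],
)
model = helper.make_model(graph)

(y,) = ReferenceEvaluator(model).run(None, {"x": np.array([1.0, 2.0], dtype=np.float32)})
print(y.dtype)  # dtype((numpy.uint16, [('bfloat16', '<u2')]))
```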
You are right. The reference implementation does introduce this. That raises another question (which, I guess, is what motivates the second part of your answer): what bfloat16 encoding does the reference evaluator use? Is that a custom one that is conceptually a duplicate of the ml_dtypes one? I agree that it would be good to use a uniform encoding across all onnx tools/implementations.
The custom types for the ref evaluator are defined here: https://github.com/onnx/onnx/blob/88f8ef15cfaa3138d336f3502aed5018d802bf43/onnx/reference/custom_element_types.py#L8. They are simply byte representations that do not support any arithmetic operations.
With ml_dtypes, computation will be supported in addition to having the correct byte representation.
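To illustrate the difference, a small sketch (assuming ml_dtypes is installed):

```python
import numpy as np
import ml_dtypes

# ONNX reference-evaluator style: raw uint16 storage tagged with a field name.
# The bits hold bfloat16 values, but numpy has no float semantics for them.
onnx_style = np.dtype((np.uint16, [("bfloat16", np.uint16)]))
print(repr(onnx_style))  # dtype((numpy.uint16, [('bfloat16', '<u2')]))

# ml_dtypes style: a real scalar type, so arithmetic works as bfloat16.
x = np.array([1.5, 2.25], dtype=ml_dtypes.bfloat16)
print(x + x)  # [3 4.5], computed with bfloat16 semantics
```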
This should be addressed by https://github.com/onnx/onnx/pull/6170.
Hi, I'm attempting to export the tiiuae/falcon-rw-1b model using Dynamo. I'm using the script below but run into an issue related to the bfloat16 type. Is this a known issue?
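(The original script isn't included in this thread; the following is a minimal sketch of what such an export might look like, assuming the torch.onnx.dynamo_export API and the Hugging Face transformers package. The actual script may differ.)

```python
# Minimal sketch, not the original script: export tiiuae/falcon-rw-1b with the
# Dynamo-based exporter (assumes PyTorch >= 2.1 and transformers installed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-rw-1b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

input_ids = tokenizer("Hello, world!", return_tensors="pt")["input_ids"]

# The failure described below occurs during the optimization step that runs
# as part of (or after) this export.
onnx_program = torch.onnx.dynamo_export(model, input_ids)
onnx_program.save("falcon-rw-1b.onnx")
```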
The ONNX export itself seems to succeed; the failure occurs during the optimization step. Below is the error stack: