onnx / onnx-mlir

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
Apache License 2.0
748 stars 316 forks source link

Broken test for latest master #408

Open doru1004 opened 3 years ago

doru1004 commented 3 years ago

Latest master breaks the following automatic test:

https://github.com/onnx/onnx-mlir/runs/1418810339

The rest of the tests pass. Any thoughts @tungld @AlexandreEichenberger ?

doru1004 commented 3 years ago

More failures in s390 test:

FAIL: Open Neural Network Frontend :: onnx/onnx_lowering.mlir (32 of 33)
FAIL: Open Neural Network Frontend :: onnx/onnx_shape_inference.mlir (33 of 33)
********************
Failed Tests (2):
  Open Neural Network Frontend :: onnx/onnx_lowering.mlir
  Open Neural Network Frontend :: onnx/onnx_shape_inference.mlir

Testing Time: 0.57s
  Unsupported:  1
  Passed     : 30
  Failed     :  2
AlexandreEichenberger commented 3 years ago

I don't get these failures on my Mac. How can I try to reproduce this? @gongsu832 @caoimhinuibrian ?

tungld commented 3 years ago

https://github.com/onnx/onnx-mlir/runs/1418810339

I saw this in the raw logs: 2020-11-18T15:05:16.1700780Z -- Failed to find LLVM FileCheck.

It failed to find pybind11 also:


2020-11-18T15:21:56.9445540Z   (requested version 2.2) with any of the following names:
2020-11-18T15:21:56.9445910Z 
2020-11-18T15:21:56.9449710Z     pybind11Config.cmake
2020-11-18T15:21:56.9450990Z     pybind11-config.cmake
2020-11-18T15:21:56.9451340Z 
2020-11-18T15:21:56.9451840Z   Add the installation prefix of "pybind11" to CMAKE_PREFIX_PATH or set
2020-11-18T15:21:56.9452550Z   "pybind11_DIR" to a directory containing one of the above files.  If
2020-11-18T15:21:56.9453290Z   "pybind11" provides a separate development package or SDK, be sure it has
2020-11-18T15:21:56.9454390Z   been installed.```
AlexandreEichenberger commented 3 years ago

I did find out that s390 has a different order for operations that have no dependence, e.g.

 %c0 = constant 0
 %c1 = constant 1

So I will add CHECK-DAG directives to better handle this. Not sure why z behave differently than x86.

AlexandreEichenberger commented 3 years ago

I added the CHECK-DAG to the latest addition to the lit tests, hoping its enough. If not, I will need to add the DAG to all of them

tungld commented 3 years ago

@tjingrant and I have seen the problem of different order since we enabled lit tests for the first time. It randomly happens when we does not explicitly create a variable for a constant, like

Value cst = emitConstantFor(...) Value add = rewriter.create(loc, X, cst1)

But Value add = rewriter.create(loc, X, emitConstantFor(...))

Not sure it is the case here or not.