microsoft / nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
MIT License

[BUG] Antares IR translate functions affect building graph with generic_operators when kernel_tuning disabled #295

Open xysmlx opened 3 years ago

xysmlx commented 3 years ago

🐛 Bug

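The Antares IR translate functions run while the graph is being built for generic operators, even when the current compilation does not enable kernel tuning. Any check that fails inside a translate function therefore aborts graph construction.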

For example, a condition check in the translate_v2 function of the GatherV2 op blocks graph construction when compiling the control-flow benchmark model:

[INFO] 2021-07-20T06:28:29z src/nnfusion/frontend/onnx_import/util/graph_convert.cpp 413        convert node: Gather:inputs_0-@tmp_54=>inp_2
[ERROR] 2021-07-20T06:28:29z src/nnfusion/util/errors.hpp 169   Check failed: 'ng_op->is_constant()' at src/nnfusion/core/operators/generic_op/generic_op_define/GatherV2.cpp:89:
The GatherV2 scalar mode only support "indices" as Constant
terminate called after throwing an instance of 'nnfusion::errors::CheckError'
  what():  Check failed: 'ng_op->is_constant()' at src/nnfusion/core/operators/generic_op/generic_op_define/GatherV2.cpp:89:
The GatherV2 scalar mode only support "indices" as Constant
Aborted (core dumped)

The GNN model hits the same problem: the translate_v2 function of the Convolution op does not support Conv1D, which leads to a check failure while building the graph.

The cause is that m_expression is constructed in the GenericOp::validate_and_infer_types function, regardless of whether kernel tuning is enabled:

if (localOpConfig.f_translate_v2 != nullptr && !m_expression.size())
{
    m_expression = localOpConfig.f_translate_v2(gnode);
}

if (localOpConfig.f_translate != nullptr && !m_expression.size())
{
    m_expression = localOpConfig.f_translate(gnode);
}

Expected behavior

The Antares IR translate functions should be executed only when kernel tuning is enabled.
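A minimal sketch of the gating I have in mind, not the actual NNFusion code. The types `SketchOpConfig` and `SketchGenericOp`, the function `sketch_validate_and_infer_types`, and the `kernel_tuning_enabled` parameter are all hypothetical stand-ins. How the real compiler exposes the kernel tuning option is an assumption here, and the real translate functions take a gnode argument that is omitted for brevity.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the real NNFusion types; names are illustrative only.
struct SketchOpConfig
{
    std::function<std::string()> f_translate_v2; // may throw on unsupported cases
    std::function<std::string()> f_translate;
};

struct SketchGenericOp
{
    SketchOpConfig config;
    std::string m_expression;
};

// Assumption: `kernel_tuning_enabled` stands for however the compiler
// exposes the kernel tuning option; it is not a real NNFusion flag name.
inline void sketch_validate_and_infer_types(SketchGenericOp& op,
                                            bool kernel_tuning_enabled)
{
    // Shape and type inference would happen here in both modes.

    if (!kernel_tuning_enabled)
    {
        return; // skip Antares IR translation, so its checks cannot fire
    }

    if (op.config.f_translate_v2 && op.m_expression.empty())
    {
        op.m_expression = op.config.f_translate_v2();
    }

    if (op.config.f_translate && op.m_expression.empty())
    {
        op.m_expression = op.config.f_translate();
    }
}

// Mimics a translate function whose internal check fails,
// e.g. GatherV2 scalar mode with non-constant indices.
inline std::string sketch_failing_translate()
{
    throw std::runtime_error("translate check failed");
}
```

With tuning disabled, an op whose translate function would fail still passes validation and m_expression stays empty. With tuning enabled, the failure surfaces as before.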

Additional context

I think #190 could be a solution.

nnfbot commented 3 years ago

Thanks for the report @xysmlx! I will look into it ASAP! (I'm a bot).

mzmssg commented 3 years ago

It might be the same bug as #290