tum-ei-eda / mlonmcu

Tool for the deployment and analysis of TinyML applications on TFLM and MicroTVM backends
Apache License 2.0
29 stars 12 forks source link

[TVM] Get tir.round support upstreamed #2

Closed PhilippvK closed 2 years ago

PhilippvK commented 2 years ago

Currently we need to use our own fork or patch the TVM codebase to get some of our models running:

diff --git a/src/target/source/codegen_c.cc b/src/target/source/codegen_c.cc
index a31111153..bf43aabd3 100644
--- a/src/target/source/codegen_c.cc
+++ b/src/target/source/codegen_c.cc
@@ -675,6 +675,10 @@ void CodeGenC::VisitExpr_(const CallNode* op, std::ostream& os) {  // NOLINT(*)
       const StringImmNode* str = op->args[0].as<StringImmNode>();
       ICHECK(str != nullptr);
       os << "__tvm_param__" << str->value;
+    } else if (ptr_op->name == "tir.round") {
+      os << "(";
+      this->PrintExpr(op->args[0], os);
+      os << " + 0.5f)";
     } else {
       LOG(FATAL) << "Unresolved call " << op->op;
     }

I would like to use the upstream TVM repository in the default environment and use our fork in a new environment called tumeda.

@rafzi Is there an issue why this small patch did not yet make it upstream?

PhilippvK commented 2 years ago

Error message during resnet inference on --target host_x86:

/usr/bin/ld: default_lib1.c:(.text.tvmgen_default_fused_nn_softmax_divide_add_clip_round_cast+0xeb): undefined reference to `roundf'
PhilippvK commented 2 years ago

I realized that the above error also happens with a patched TVM and also for the expf but only on non-ETISS targets. I was able to fix them by linking with -lm, however I was wondering if this leads to some unwanted overhead and why this did not happen before?

rafzi commented 2 years ago

I didn't think that this would be the intended fix by the TVM community.

Appearently on other targets, this is translated to a proper CRT API call before reaching the patched code? The implementation is also kind of hacky as it is not exactly equivalent to a roundf.

So I'd suggest to test this again, since it has been a long time since I tried without this patch. Then we can open a topic on the mailing list to figure out where the fix must be implemented.

A rough sketch for the post:


When using quantized TFLM models with the C backend, we noticed an assertion Unresolved call to tir.round.

A quick fix we implemented is the following:

...

We are unsure whether this is the right way to fix this issue, because the code is not exactly equivalent to roundf and on some architectures with the same model, the error is not encountered at all - presumably because the tir.round call was resolved by an earlier pass.

Do you have any idea where something could be missing to cause this error? We'd be happy to provide a PR with the proper fix.

PhilippvK commented 2 years ago

I ran a quick session to investigate where the problem:

Model Backend Target Failed?
sine_model tvmaot etiss_pulpino no
sine_model tvmaot spike no
sine_model tvmaot host_x86 no
cifar10 tvmaot etiss_pulpino no
cifar10 tvmaot spike no
cifar10 tvmaot host_x86 no
simple_mnist tvmaot etiss_pulpino no
simple_mnist tvmaot spike yes
simple_mnist tvmaot host_x86 yes
micro_speech tvmaot etiss_pulpino no
micro_speech tvmaot spike yes
micro_speech tvmaot host_x86 yes
magic_wand tvmaot etiss_pulpino no
magic_wand tvmaot spike no
magic_wand tvmaot host_x86 no
aww tvmaot etiss_pulpino no
aww tvmaot spike no
aww tvmaot host_x86 no
vww tvmaot etiss_pulpino no
vww tvmaot spike no
vww tvmaot host_x86 no
resnet tvmaot etiss_pulpino no
resnet tvmaot spike no
resnet tvmaot host_x86 no
toycar tvmaot etiss_pulpino no
toycar tvmaot spike no
toycar tvmaot host_x86 no

Obviously only floating point models are affected and very target except ETISS/Pulpino. The error message is the following:

[100%] Linking C executable ../bin/generic_mlif
/tmp/mlonmcu_env/deps/install/riscv_gcc/bin/../lib/gcc/riscv32-unknown-elf/11.1.0/../../../../riscv32-unknown-elf/bin/ld: ../lib/ml_interface/libtvm_extension.a(default_lib1.c.obj): in function `.L15':
default_lib1.c:(.text.tvmgen_default_fused_nn_softmax_divide_add_clip_round_cast+0x46): undefined reference to `expf'
/tmp/mlonmcu_env/deps/install/riscv_gcc/bin/../lib/gcc/riscv32-unknown-elf/11.1.0/../../../../riscv32-unknown-elf/bin/ld: ../lib/ml_interface/libtvm_extension.a(default_lib1.c.obj): in function `.L19':
default_lib1.c:(.text.tvmgen_default_fused_nn_softmax_divide_add_clip_round_cast+0xb2): undefined reference to `roundf'
collect2: error: ld returned 1 exit status
PhilippvK commented 2 years ago

Easy fix, as already applied in https://github.com/tum-ei-eda/mlonmcu-sw/commit/99e54ab33b525c89e8ece059ebbe288e188ffdcc is to link the math library. Wow.

@rafzi I think we can live with that, right? However I am still wondering why ETISS/Pulpino still works without explicitly linking it..

PhilippvK commented 2 years ago

So the patch should not be required anymore. There is another other rather old one here:

https://github.com/tum-ei-eda/utvm_staticrt_codegen/blob/master/tvm_patches/0001-rtfix.patch

Can you tell if we still need this?

rafzi commented 2 years ago

I assume this was fixed upstream because it was such a prominent error. So it seems like we are safe to drop the patch. Thanks for testing!

This seems to be not required by ETISS because the target software always contains C++ code through syscalls.cpp and libstdc++ depends on libm: https://stackoverflow.com/questions/1033898/why-do-you-have-to-link-the-math-library-in-c