Samsung / ONE

On-device Neural Engine

Compiler FE: Support ubuntu 22.04 (jammy) #9432

Open ragmani opened 2 years ago

ragmani commented 2 years ago

What

Let's make the ONE compiler support Ubuntu 22.04.

Why

Ubuntu 22.04 has been released, and the number of users on it will gradually increase. So let's prepare to support it in advance! It may be a little early, but there is nothing wrong with preparing ahead of time.

Environment of ubuntu 22.04

default version
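
The default tool versions in a fresh `ubuntu:22.04` environment can be listed with a small script. This is only a sketch: `tool_version` is a hypothetical helper, and the tool list is an assumption based on the packages installed in the build steps of this issue.

```shell
#!/bin/sh
# tool_version NAME: print the first line of `NAME --version`,
# or a note when NAME is not on PATH.
tool_version() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: %s\n' "$1" "$("$1" --version 2>&1 | head -n 1)"
  else
    printf '%s: not installed\n' "$1"
  fi
}

# Assumed list of tools relevant to the ONE compiler build.
for tool in cmake g++ python3 pip3; do
  tool_version "$tool"
done
```

Running it inside the container after `apt install` should print one line per tool.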

To do

Build Target Architectures

Build for x86_64

$ cd {one dir}
$ docker run -it --rm -v `pwd`:`pwd` -w `pwd` ubuntu:22.04 /bin/bash
apt update
apt install cmake libboost-all-dev g++ patch python3-pip python3-venv
python3 -m pip install --upgrade pip

./nncc configure
./nncc build
./nncc test

Build for arm32

$ sudo apt-get install qemu qemu-user-static binfmt-support debootstrap
$ cd {one dir}
$ ROOTFS_DIR=`pwd`/tools/cross/rootfs/arm-jammy sudo -E ./tools/cross/install_rootfs.sh arm jammy --skipunmount
$ cd {one dir}
$ docker run -it --rm -v `pwd`:`pwd` -w `pwd` ubuntu:22.04 /bin/bash
apt update
apt install cmake libboost-all-dev g++ patch python3-pip python3-venv
python3 -m pip install --upgrade pip

apt install g++-arm-linux-gnueabihf make
ROOTFS_ARM=`pwd`/tools/cross/rootfs/arm-jammy make -f infra/nncc/Makefile.arm32 cfg
ROOTFS_ARM=`pwd`/tools/cross/rootfs/arm-jammy make -f infra/nncc/Makefile.arm32 debug
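
Before running the two `make` commands above, it may help to confirm that the rootfs and the cross compiler are in place. A minimal sketch: `check_rootfs` is a hypothetical helper, and testing for `usr/lib` is an assumption about what the rootfs script creates.

```shell
#!/bin/sh
# check_rootfs DIR: succeed when DIR looks like an unpacked root filesystem.
# ASSUMPTION: a usable rootfs contains a usr/lib directory.
check_rootfs() {
  [ -d "$1/usr/lib" ]
}

# Same default location as used in the commands above.
ROOTFS_ARM=${ROOTFS_ARM:-$(pwd)/tools/cross/rootfs/arm-jammy}

if ! check_rootfs "$ROOTFS_ARM"; then
  echo "rootfs not found at $ROOTFS_ARM; run install_rootfs.sh first" >&2
elif ! command -v arm-linux-gnueabihf-g++ >/dev/null 2>&1; then
  echo "arm-linux-gnueabihf-g++ not found; install g++-arm-linux-gnueabihf" >&2
else
  echo "cross-build prerequisites look present"
fi
```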

ragmani commented 2 years ago

Problem of not finding python 3.10

https://github.com/Samsung/ONE/pull/9429#issuecomment-1184193971 This is not an actual issue: the problem was limited to one individual environment.

ragmani commented 2 years ago

Problem caused by cmake policy change when using find_package(Boost ...)

.. versionadded:: 3.3

Support new if() IN_LIST operator.

CMake 3.3 adds support for the new IN_LIST operator.

The OLD behavior for this policy is to ignore the IN_LIST operator. The NEW behavior is to interpret the IN_LIST operator.

This policy was introduced in CMake version 3.3. CMake version 3.22.1 warns when the policy is not set and uses OLD behavior. Use the cmake_policy() command to set it to OLD or NEW explicitly.

.. note:: The OLD behavior of a policy is deprecated by definition and may be removed in a future version of CMake.


- Solution: add `cmake_policy(SET CMP0057 NEW)` in `macro(nnas_find_package PREFIX)`.
  However, this requires `cmake_minimum_required(VERSION 3.3)`.

hseok-oh commented 2 years ago

> But it requires cmake_minimum_required(VERSION 3.3)

IMO, it's better to raise the minimum required CMake version, because CMake 3.1 is an old release (Dec 2014: https://cmake.org/pipermail/cmake/2014-December/059418.html).
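
Whether an installed CMake satisfies a raised minimum can be checked before configuring. The sketch below relies on GNU `sort -V`; `version_ge` is a hypothetical helper, and 3.3 is simply the minimum mentioned in this thread.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B (needs GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=3.3   # minimum discussed in this thread
# Extract e.g. "3.22.1" from "cmake version 3.22.1"; empty if cmake is missing.
installed=$(cmake --version 2>/dev/null | head -n 1 | awk '{print $3}')

if [ -n "$installed" ] && version_ge "$installed" "$required"; then
  echo "cmake $installed satisfies >= $required"
else
  echo "cmake ${installed:-not found} does not satisfy >= $required"
fi
```

With the CMake 3.22.1 quoted in the policy warning above, the check passes. With CMake 3.1 it would not.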

hseok-oh commented 2 years ago

cmake version

ragmani commented 2 years ago

Problem caused by not finding version 2.6.0 of tensorflow-cpu (pip3.10)

seanshpark commented 2 years ago

The use of tensorflow-cpu 2.6.0 will be removed soon.

--> #9433 , #9435

ragmani commented 2 years ago

Problem caused by the gcc version upgrade to 11.

Internal sources

#9437

External sources

ragmani commented 2 years ago

I found an error when cross-building onecc on my machine: some onecc modules could not be found. The patch below solves this error.

@@ -20,38 +20,38 @@ ARM32_INSTALL_FOLDER=$(CURRENT_DIR)/$(BUILDFOLDER)/$(ARM32_FOLDER).$(TYPE_FOLDER
 ARM32_INSTALL_HOST=$(CURRENT_DIR)/$(BUILDFOLDER)/$(ARM32_FOLDER).$(TYPE_FOLDER).host.install

 # ARM32 build
-ARM32_BUILD_ITEMS:=angkor;cwrap;pepper-str;pepper-strcast;pp
-ARM32_BUILD_ITEMS+=;pepper-csv2vec;crew
-ARM32_BUILD_ITEMS+=;oops;pepper-assert
-ARM32_BUILD_ITEMS+=;hermes;hermes-std
-ARM32_BUILD_ITEMS+=;loco;locop;logo-core;logo
-ARM32_BUILD_ITEMS+=;safemain;mio-circle04;mio-tflite280
-ARM32_BUILD_ITEMS+=;dio-hdf5
-ARM32_BUILD_ITEMS+=;foder;circle-verify;souschef;arser;vconone
-ARM32_BUILD_ITEMS+=;luci
-ARM32_BUILD_ITEMS+=;luci-interpreter
-ARM32_BUILD_ITEMS+=;tflite2circle
-ARM32_BUILD_ITEMS+=;tflchef;circlechef
-ARM32_BUILD_ITEMS+=;circle2circle;record-minmax;circle-quantizer
-ARM32_BUILD_ITEMS+=;luci-eval-driver;luci-value-test
+ARM32_BUILD_ITEMS:=angkor;cwrap;pepper-str;pepper-strcast;pp;
+ARM32_BUILD_ITEMS+=;pepper-csv2vec;crew;
+ARM32_BUILD_ITEMS+=;oops;pepper-assert;
+ARM32_BUILD_ITEMS+=;hermes;hermes-std;
+ARM32_BUILD_ITEMS+=;loco;locop;logo-core;logo;
+ARM32_BUILD_ITEMS+=;safemain;mio-tflite280;mio-circle04;
+ARM32_BUILD_ITEMS+=;dio-hdf5;
+ARM32_BUILD_ITEMS+=;foder;circle-verify;souschef;arser;vconone;
+ARM32_BUILD_ITEMS+=;luci;
+ARM32_BUILD_ITEMS+=;luci-interpreter;
+ARM32_BUILD_ITEMS+=;tflite2circle;
+ARM32_BUILD_ITEMS+=;tflchef;circlechef;
+ARM32_BUILD_ITEMS+=;circle2circle;record-minmax;circle-quantizer;
+ARM32_BUILD_ITEMS+=;luci-eval-driver;luci-value-test;

 ARM32_TOOLCHAIN_FILE=cmake/buildtool/cross/toolchain_armv7l-linux.cmake

-ARM32_HOST_ITEMS:=angkor;cwrap;pepper-str;pepper-strcast;pp
-ARM32_HOST_ITEMS+=;pepper-csv2vec
-ARM32_HOST_ITEMS+=;oops
-ARM32_HOST_ITEMS+=;hermes;hermes-std
-ARM32_HOST_ITEMS+=;loco;locop;logo-core;logo
-ARM32_HOST_ITEMS+=;safemain;mio-circle04;mio-tflite280
-ARM32_HOST_ITEMS+=;foder;circle-verify;souschef;arser;vconone
-ARM32_HOST_ITEMS+=;luci
-ARM32_HOST_ITEMS+=;luci-interpreter
-ARM32_HOST_ITEMS+=;tflite2circle
-ARM32_HOST_ITEMS+=;tflchef;circlechef
-ARM32_HOST_ITEMS+=;circle-tensordump
-ARM32_HOST_ITEMS+=;circle2circle
-ARM32_HOST_ITEMS+=;common-artifacts
-ARM32_HOST_ITEMS+=;luci-eval-driver;luci-value-test
+ARM32_HOST_ITEMS:=angkor;cwrap;pepper-str;pepper-strcast;pp;
+ARM32_HOST_ITEMS+=;pepper-csv2vec;
+ARM32_HOST_ITEMS+=;oops;
+ARM32_HOST_ITEMS+=;hermes;hermes-std;
+ARM32_HOST_ITEMS+=;loco;locop;logo-core;logo;
+ARM32_HOST_ITEMS+=;safemain;mio-tflite280;mio-circle04;
+ARM32_HOST_ITEMS+=;foder;circle-verify;souschef;arser;vconone;
+ARM32_HOST_ITEMS+=;luci;
+ARM32_HOST_ITEMS+=;luci-interpreter;
+ARM32_HOST_ITEMS+=;tflite2circle;
+ARM32_HOST_ITEMS+=;tflchef;circlechef;
+ARM32_HOST_ITEMS+=;circle-tensordump;
+ARM32_HOST_ITEMS+=;circle2circle;
+ARM32_HOST_ITEMS+=;common-artifacts;
+ARM32_HOST_ITEMS+=;luci-eval-driver;luci-value-test;

 _SPACE_:=

But I'm not sure if this way is correct.

ragmani commented 2 years ago

I found an error when cross-building. It's hard for me to solve it.

// First, read/accumulate/write for src_ptr0 and src_ptr1.

#define RUY_LOAD_ONE_ROW1(I, R) \
  "cmp r2, #" #I "\n" \
  "beq 5f\n" \
  "vld1.8 { d0[" #R "]}, [%[src_ptr0]]!\n" \
  "vld1.8 { d2[" #R "]}, [%[src_ptr1]]!\n" \

      RUY_LOAD_ONE_ROW1(0, 0)
      RUY_LOAD_ONE_ROW1(1, 1)
      RUY_LOAD_ONE_ROW1(2, 2)
      RUY_LOAD_ONE_ROW1(3, 3)
      RUY_LOAD_ONE_ROW1(4, 4)
      RUY_LOAD_ONE_ROW1(5, 5)
      RUY_LOAD_ONE_ROW1(6, 6)
      RUY_LOAD_ONE_ROW1(7, 7)

#undef RUY_LOAD_ONE_ROW1

#define RUY_LOAD_ONE_ROW2(I, R) \
  "cmp r2, #" #I "\n" \
  "beq 5f\n" \
  "vld1.8 { d1[" #R "]}, [%[src_ptr0]]!\n" \
  "vld1.8 { d3[" #R "]}, [%[src_ptr1]]!\n" \

      RUY_LOAD_ONE_ROW2(8, 0)
      RUY_LOAD_ONE_ROW2(9, 1)
      RUY_LOAD_ONE_ROW2(10, 2)
      RUY_LOAD_ONE_ROW2(11, 3)
      RUY_LOAD_ONE_ROW2(12, 4)
      RUY_LOAD_ONE_ROW2(13, 5)
      RUY_LOAD_ONE_ROW2(14, 6)
      RUY_LOAD_ONE_ROW2(15, 7)

#undef RUY_LOAD_ONE_ROW2

      "5:\n"

      "veor.16 q4, q0, q11\n"
      "veor.16 q5, q1, q11\n"

      "vpaddl.s8 q8, q4\n"
      "vpaddl.s8 q9, q5\n"

      // Pairwise add accumulate to 4x32b accumulators.
      "vpadal.s16 q12, q8\n"
      "vpadal.s16 q13, q9\n"

      "vst1.32 {q4}, [%[packed_ptr]]!\n"
      "vst1.32 {q5}, [%[packed_ptr]]!\n"

      // Reset to src_zero for src_ptr2 and src_ptr3.
      "vdup.8 q0, r3\n"
      "vdup.8 q1, r3\n"

// Next, read/accumulate/write for src_ptr2 and src_ptr3.

#define RUY_LOAD_ONE_ROW1(I, R) \
  "cmp r2, #" #I "\n" \
  "beq 5f\n" \
  "vld1.8 { d0[" #R "]}, [%[src_ptr2]]!\n" \
  "vld1.8 { d2[" #R "]}, [%[src_ptr3]]!\n" \

      RUY_LOAD_ONE_ROW1(0, 0)
      RUY_LOAD_ONE_ROW1(1, 1)
      RUY_LOAD_ONE_ROW1(2, 2)
      RUY_LOAD_ONE_ROW1(3, 3)
      RUY_LOAD_ONE_ROW1(4, 4)
      RUY_LOAD_ONE_ROW1(5, 5)
      RUY_LOAD_ONE_ROW1(6, 6)
      RUY_LOAD_ONE_ROW1(7, 7)

#undef RUY_LOAD_ONE_ROW1

#define RUY_LOAD_ONE_ROW2(I, R) \
  "cmp r2, #" #I "\n" \
  "beq 5f\n" \
  "vld1.8 { d1[" #R "]}, [%[src_ptr2]]!\n" \
  "vld1.8 { d3[" #R "]}, [%[src_ptr3]]!\n" \

      RUY_LOAD_ONE_ROW2(8, 0)
      RUY_LOAD_ONE_ROW2(9, 1)
      RUY_LOAD_ONE_ROW2(10, 2)
      RUY_LOAD_ONE_ROW2(11, 3)
      RUY_LOAD_ONE_ROW2(12, 4)
      RUY_LOAD_ONE_ROW2(13, 5)
      RUY_LOAD_ONE_ROW2(14, 6)
      RUY_LOAD_ONE_ROW2(15, 7)

#undef RUY_LOAD_ONE_ROW2

      "5:\n"

      "veor.16 q4, q0, q11\n"
      "veor.16 q5, q1, q11\n"

      "vpaddl.s8 q8, q4\n"
      "vpaddl.s8 q9, q5\n"

      // Pairwise add accumulate to 4x32b accumulators.
      "vpadal.s16 q14, q8\n"
      "vpadal.s16 q15, q9\n"

      "vst1.32 {q4}, [%[packed_ptr]]!\n"
      "vst1.32 {q5}, [%[packed_ptr]]!\n"

      "4:\n"
      // Pairwise add 32-bit accumulators
      "vpadd.i32 d24, d24, d25\n"
      "vpadd.i32 d26, d26, d27\n"
      "vpadd.i32 d28, d28, d29\n"
      "vpadd.i32 d30, d30, d31\n"
      // Final 32-bit values per row
      "vpadd.i32 d25, d24, d26\n"
      "vpadd.i32 d27, d28, d30\n"

      "ldr r3, [%[params], #" RUY_STR(RUY_OFFSET_SUMS_PTR) "]\n"
      "cmp r3, #0\n"
      "beq 6f\n"
      "vst1.32 {d25}, [r3]!\n"
      "vst1.32 {d27}, [r3]!\n"
      "6:\n"
  // clang-format on

  : [ src_ptr0 ] "+r"(src_ptr0), [ src_ptr1 ] "+r"(src_ptr1),
    [ src_ptr2 ] "+r"(src_ptr2), [ src_ptr3 ] "+r"(src_ptr3)
  : [ src_inc0 ] "r"(src_inc0), [ src_inc1 ] "r"(src_inc1),
    [ src_inc2 ] "r"(src_inc2), [ src_inc3 ] "r"(src_inc3),
    [ packed_ptr ] "r"(packed_ptr), [ params ] "r"(&params)
  : "cc", "memory", "r1", "r2", "r3", "q0", "q1", "q2", "q3",
    "q4", "q5", "q6", "q7", "q8", "q9", "q10", "q11", "q12", "q13");

}

seanshpark commented 2 years ago

> But I'm not sure if this way is correct.

@ragmani, I can't tell from the diff: which module has changed?

ragmani commented 2 years ago

> @ragmani, I can't tell from the diff: which module has changed?

I haven't changed any modules. I just added `;` at the end of each line.
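
To make the effect of the trailing `;` concrete: GNU make's `+=` joins the existing value and the appended text with a single space. The sketch below mimics that in shell and then removes the spaces. That removal is only an assumption about what the Makefile does with `_SPACE_` after the lines shown in the diff. `join_items` and `strip_spaces` are hypothetical helpers, and only the first three list lines are used.

```shell
#!/bin/sh
# join_items FIRST REST...: mimic GNU make `+=`, which separates the old
# value and each appended piece with one space.
join_items() {
  result=$1
  shift
  for piece in "$@"; do
    result="$result $piece"
  done
  printf '%s\n' "$result"
}

# strip_spaces TEXT: ASSUMPTION, stands in for the Makefile's space removal.
strip_spaces() {
  printf '%s\n' "$1" | tr -d ' '
}

# First three ARM32_BUILD_ITEMS lines, before and after the patch.
before=$(join_items 'angkor;cwrap;pepper-str;pepper-strcast;pp' \
                    ';pepper-csv2vec;crew' ';oops;pepper-assert')
after=$(join_items 'angkor;cwrap;pepper-str;pepper-strcast;pp;' \
                   ';pepper-csv2vec;crew;' ';oops;pepper-assert;')

strip_spaces "$before"
strip_spaces "$after"
```

Under that assumption the patched list contains the same module names, plus empty elements (`;;`) between lines and a trailing `;`.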

chunseoklee commented 2 years ago

> > @ragmani, I can't tell from the diff: which module has changed?
>
> I haven't changed any modules. I just added `;` at the end of each line.

FYI, AFAIR, when we tried to enable the Tizen build, a target without ';' was not added to the build target.

ragmani commented 2 years ago

> FYI, AFAIR, when we tried to enable the Tizen build, a target without ';' was not added to the build target.

Sorry, I didn't understand what you said. What do the "target" and "build target" you mentioned refer to?

hseok-oh commented 2 years ago

@jinevening You can find the discussion about the cmake version at https://github.com/Samsung/ONE/issues/9432#issuecomment-1184412692

ragmani commented 1 year ago

#9916 is a way to fix the error reported in https://github.com/Samsung/ONE/issues/9432#issuecomment-1196362702