OsamaRyees commented 1 year ago

Hi, I can't believe I found your page. I'm trying to compile Tengine with aml-npu 6.4.8.7.1.1.1-1. I followed your guide, but I ran into a problem.

Could you give more details, specifically the git log of the versions you used for these two repositories?

cd ~/tengine-lite; git log | tail
cd ~/TIM-VX; git log | tail

TIM-VX is updated daily.

I used the following Khadas image: https://dl.khadas.com/products/vim3/firmware/ubuntu/generic/vim3-ubuntu-20.04-gnome-linux-4.9-fenix-1.4-221229.img.xz

$ cat /etc/fenix-release
BOARD=VIM3
VENDOR=Amlogic
VERSION=1.4
ARCH=arm64
INITRD_ARCH=arm64
INSTALL_TYPE=EMMC
IMAGE_VERSION=1.4-221229

$ dpkg -s aml-npu
Version: 6.4.8.7.1.1.1-1
$ sudo dmesg | grep "Galcore"
[ 19.790422] Galcore version 6.4.8.7.1.1.1

I downloaded and unpacked this file:
wget -c https://github.com/VeriSilicon/TIM-VX/releases/download/v1.1.34.fix/aarch64_A311D_6.4.8.tgz

But if you check the .so files that TIM-VX ships as 6.4.8, they are actually 6.4.6!

$ strings ~/aarch64_A311D_6.4.8/lib/libGAL.so | grep "6.4.6"
VERSION$6.4.6:345497

This causes the compiled examples to fail at startup.

cd ~/tengine-lite/build; ./examples/tm_yolov3_timvx -i ~/park2.jpg -m ~/ai_models/yolov3_uint8.tmfile
tengine-lite library version: 1.5-dev
[ 1] HAL user version: 6.4.6.345497
[ 2] HAL kernel version: 6.4.8.415784
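As an illustration, the mismatch between the user-space and kernel HAL versions can be detected automatically from that log. This is only a sketch that assumes the log format shown above; the function names `hal_versions` and `hal_mismatch` are hypothetical and not part of Tengine or TIM-VX.

```python
import re

# Matches log lines such as "HAL user version: 6.4.6.345497".
HAL_LINE = re.compile(r"HAL (user|kernel) version: (\d+(?:\.\d+)*)")


def hal_versions(log_text):
    """Return a dict like {'user': '...', 'kernel': '...'} for the HAL lines found."""
    return {kind: version for kind, version in HAL_LINE.findall(log_text)}


def hal_mismatch(log_text):
    """True when both versions are present in the log and they differ."""
    found = hal_versions(log_text)
    return "user" in found and "kernel" in found and found["user"] != found["kernel"]


log = """tengine-lite library version: 1.5-dev
[ 1] HAL user version: 6.4.6.345497
[ 2] HAL kernel version: 6.4.8.415784"""

print(hal_versions(log))
print(hal_mismatch(log))
```

Here the user-space library reports 6.4.6 while the kernel driver reports 6.4.8, so the check flags a mismatch.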

I'd really appreciate it if you could share what worked for you, as I've been struggling with this for several weeks. Thanks.

OptimusFaber commented 1 year ago

Hi, I don't think I had any failures when installing by following my tutorial... You have a VIM3 Pro, right? Also, I took my image from here: https://dl.khadas.com/products/vim3/firmware/ubuntu/emmc/ Try my variant, especially since it installs more conveniently and faster.

OsamaRyees commented 1 year ago

Yes, I have a VIM3 Pro. Maybe I'm using a different version of the Khadas firmware.

Could you show the output of:

cat /etc/fenix-release
cd ~/tengine-lite; git log | tail
cd ~/TIM-VX; git log | tail
strings /usr/lib/libGAL.so | grep "VERSION"
cd ~/tengine-lite/build; ls ./examples/*_timvx

OsamaRyees commented 1 year ago

Never mind, I started over with a fresh OS image, 1.1.2.

$ cat /etc/fenix-release
IMAGE_VERSION=1.1.2-220930

Then I installed everything as in your guide, but for TIM-VX I had to do:
git checkout v1.1.34.fix

For tengine-lite I just used the latest commit: cb3b6e6a62c699e596dc854f7ae9270465ed203c

khadas@Khadas:~/tengine-lite$ strings /usr/lib/libGAL.so | grep "VERSION"
VERSION$6.4.8:415784

It seems that the .so files provided by TIM-VX are still the old 6.4.6 version, even in the latest release.

The examples work now. Thanks.

OsamaRyees commented 1 year ago

One more thing: I couldn't run my own trained ONNX yolov5s 640 model.

I tried applying your change to yolov5.py and to the whole project, but got a "Segmentation fault" error. I'd appreciate more details on how you trained and converted the model.

For training I used:
cd yolov5_v6.0
python train.py --img 640 --batch 4 --epochs 300 --data data/custom.yaml --cfg ./models/yolov5s.yaml --weights yolov5s.pt --name yolov5s_v6.0_640_res --cache

For export I had to use:

python export.py --weights runs\train\yolov5s_v6.0_640_res\weights\best.pt --include onnx --simplify --img 640

Did you specify a particular --opset during export?

OptimusFaber commented 1 year ago

OK, I'm not home right now and can't answer all your questions. I'll write to you tomorrow at the latest.

OsamaRyees commented 1 year ago

I really appreciate it.

I ask because there is this additional ONNX step, and I'm not sure whether you used it: https://github-com.translate.goog/OAID/Tengine/issues/1220?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=en-US&_x_tr_pto=wapp Did you use it?

yolov5.v6.0.pt --> best.onnx --> best.opt.onnx --> best_f32.tmfile --> best_uint8.tmfile

I also managed to fix the segmentation fault by changing the number of classes in tm_yolov5s_timvx.cpp. I trained on 10 classes, so:

int cls_num = 80; --> int cls_num = 10;
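A rough sketch of why a wrong cls_num can end in a segmentation fault. It assumes 3 anchors per output head, a 20x20 grid for the coarsest head, and that the post-processing walks each head in steps of cls_num + 5 values; I have not verified these details against the .cpp file, and the helper names are hypothetical.

```python
def head_values(cls_num, grid, anchors=3):
    """Number of values in one head laid out as [anchors, grid, grid, cls_num + 5]."""
    return anchors * grid * grid * (cls_num + 5)


def values_read(cls_num, grid, anchors=3):
    """Number of values the post-processing touches when it assumes cls_num classes."""
    return head_values(cls_num, grid, anchors)


model_cls = 10   # classes the model was trained on
code_cls = 80    # value left in the example source
grid = 20        # assumed grid size of the coarsest head

available = head_values(model_cls, grid)
needed = values_read(code_cls, grid)

print(available, needed, needed > available)
```

With these assumptions the code expects far more values than the model actually produces, so it reads past the end of the output buffer.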

But now I get too many detections, even with my own model trained on the coco128 dataset. I trained with the yolov5.py version you provided, and after converting to _uint8.tmfile I still get too many boxes.

yolov5_timvx_letterbox_out

Screenshot 2023-04-21 135721

Is there anything else you changed, maybe the NMS step?

Also, in tm_yolov5s_timvx.cpp, did you set YOLOV5_VERSION6 = 1, as shown below? It seems that you used yolov5 v6.0.

Using your yolov5.py changes, here is a Netron visualization of the converted uint8 model:

top2

bottom2

The architecture shown is from my own model trained on 10 classes, although I suspect the problem of too many boxes also occurs in my model trained on the coco128 dataset with 80 classes. So it must be something in the conversion process or in the pip versions of onnx/onnxruntime (pip list).

OptimusFaber commented 1 year ago

I ran through my guide on a new VIM3 and everything works for me. Here is what your problem is:

1) There must be no sigmoids, and the model output must have the format [bs, ch, size1, size2, num_classes+5]. Here is the output of my model:

I published on GitHub a specially modified version of yolov5_6 so that you can get the weights you need right away; the forward_export method is responsible for this:

Also, when exporting the model, don't forget to pass --simplify so that the model is simplified right away.

In summary:
1) Clone my repository.
2) Train the model and get the .pt weights using my custom model.
3) Export them to ONNX.
4) Convert to tmfile.
5) Quantize.
6) Enjoy life :)

Here is what my program outputs:

And yes, in the cpp file you need to set YOLOV5_VERSION6 = 1, because otherwise it will be geared toward the old versions, which differ in architecture.

By the way, you can write in English if you like :)

OptimusFaber commented 1 year ago

I'll probably update the repository soon, add more instructions and a description of the model, and add an English version.

OsamaRyees commented 1 year ago

ok, I think I got something

This time I trained on your chess dataset and it worked. I cloned your repo.

conda create -n yolov5_khadas python=3.8

I uncommented the onnx entries in requirements.txt so that they get installed

pip install -r requirements.txt

just for the record, this resulted in:

numpy 1.24.2
oauthlib 3.2.2
onnx 1.12.0
onnx-simplifier 0.4.24
onnxruntime 1.14.1

now training for chess dataset

python train.py --img 640 --batch 16 --epochs 100 --cfg ./models/yolov5s.yaml --weights yolov5s.pt --data ../dataset/data.yaml --name yolov5s_chess13

export pt to onnx
cd yolov5_khadas\yolov5; python export.py --weights runs/train/yolov5s_chess13/weights/best.pt --include onnx --img 640 --simplify

it uses opset12 by default

on PC, convert onnx --> f32.tmfile

cd tengine-lite/build/tools/convert_tool
./convert_tool -f onnx -m /mnt/c/Users/os/Desktop/work/yolov5_khadas/yolov5/runs/train/yolov5s_chess13/weights/best.onnx -o yolov5s_khadas_chess13_640_f32.tmfile

on PC, quantize f32.tmfile --> uint8.tmfile

cd tengine-lite/build/tools/quantize
./quant_tool_uint8 -m ../convert_tool/yolov5s_khadas_chess13_640_f32.tmfile -i /mnt/c/Users/os/Desktop/work/yolov5_khadas/dataset/train/images/ -o yolov5s_khadas_chess13_640_uint8.tmfile -g 3,640,640 -w 104.007,116.669,122.679 -s 0.017,0.017,0.017
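One thing worth double-checking here, sketched in Python: I am assuming that -w is the per-channel mean and -s the per-channel scale, so that calibration images are normalized as (pixel - mean) * scale. If that is right, these values should match the mean and scale constants used by the example .cpp at inference time. The runtime values below are hypothetical placeholders, not values I have confirmed in the source.

```python
def normalize(pixel, mean, scale):
    """Apply (value - mean) * scale per channel."""
    return [(p - m) * s for p, m, s in zip(pixel, mean, scale)]


def same_preprocessing(mean_a, scale_a, mean_b, scale_b, tol=1e-6):
    """True when two (mean, scale) settings agree channel by channel."""
    pairs = list(zip(mean_a, mean_b)) + list(zip(scale_a, scale_b))
    return all(abs(a - b) <= tol for a, b in pairs)


# Values passed to quant_tool_uint8 in the command above.
quant_mean = (104.007, 116.669, 122.679)   # -w
quant_scale = (0.017, 0.017, 0.017)        # -s

# HYPOTHETICAL runtime constants: replace with the real ones from your .cpp file.
runtime_mean = (0.0, 0.0, 0.0)
runtime_scale = (1 / 255, 1 / 255, 1 / 255)

print(normalize((255, 255, 255), quant_mean, quant_scale))
print(normalize((255, 255, 255), runtime_mean, runtime_scale))
print(same_preprocessing(quant_mean, quant_scale, runtime_mean, runtime_scale))
```

If the two settings differ, the quantization ranges are calibrated on inputs that the model never sees at runtime, which could plausibly degrade detections.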

yolov5s_khadas_chess13_640_uint8

on the VIM3 Pro, I changed the CMake and cpp files

cp ../examples/tm_yolov5s_timvx.cpp ../examples/tm_yolov5s_chess13_timvx.cpp
nano ../examples/CMakeLists.txt
add the following:

TENGINE_EXAMPLE_CV (tm_yolov5s_chess13_timvx tm_yolov5s_chess13_timvx.cpp)

nano tm_yolov5s_chess13_timvx.cpp
change the following:

#define YOLOV5_VERSION6 1

int cls_num = 13; //number of classes

and change the labels inside the function draw_objects:
static const char* class_names[] = {"bishop", "black-bishop", "black-king", "black-knight", "black-pawn", "black-queen", "black-rook", "white-bishop", "white-king", "white-knight", "white-pawn", "white-queen", "white-rook"};

cd ~/tengine-lite/build/
make -j 5

testing:

khadas@Khadas:~/tengine-lite/build$ ./examples/tm_yolov5s_chess13_timvx -m ~/our_ai/yolov5s_khadas_chess13_640_uint8.tmfile -i ~/chess13_test1.jpg

tengine-lite library version: 1.5-dev
Repeat 1 times, thread 1, avg time 64.81 ms, max_time 64.81 ms, min_time 64.81 ms
...........................
detection num: 21
2: 90%, [ 303, 84, 336, 169], black-king
6: 90%, [ 226, 10, 249, 60], black-rook
6: 88%, [ 201, 119, 224, 168], black-rook
4: 87%, [ 163, 54, 183, 94], black-pawn
4: 87%, [ 274, 122, 294, 166], black-pawn
4: 86%, [ 240, 202, 262, 247], black-pawn
11: 86%, [ 130, 26, 158, 96], white-queen
9: 86%, [ 81, 295, 117, 361], white-knight
7: 85%, [ 135, 148, 160, 208], white-bishop
4: 85%, [ 215, 254, 236, 301], black-pawn
4: 85%, [ 296, 304, 321, 349], black-pawn
3: 84%, [ 85, 233, 114, 299], black-knight
12: 83%, [ 68, 13, 93, 63], white-rook

Great !!

yolov5_timvx_letterbox_out

but whenever I try 2 other datasets, e.g. coco128 with 128 pictures:

cd yolov5_khadas/yolov5; python train.py --img 640 --batch 8 --epochs 100 --cfg ./models/yolov5s.yaml --weights yolov5s.pt --data data/coco128.yaml --name yolov5s_coco128
python train.py --img 640 --batch 6 --epochs 100 --cfg ./models/yolov5s.yaml --weights yolov5s.pt --data data/coco128.yaml --name yolov5s_coco128

4.27 anchors/target, 0.994 Best Possible Recall (BPR). Current anchors are a good fit to dataset
which means we do not need to change them in the .cpp file later
python export.py --weights runs/train/yolov5s_coco128/weights/best.pt --include onnx --img 640 --simplify

on PC, convert onnx --> 32fp

tengine-lite/build/tools/convert_tool$ ./convert_tool -f onnx -m /mnt/c/Users/os/Desktop/work/yolov5_khadas/yolov5/runs/train/yolov5s_coco128/weights/best.onnx -o yolov5s_khadas_coco128_640_f32.tmfile

on PC, 32fp --> uint8

tengine-lite/build/tools/quantize$ ./quant_tool_uint8 -m ../convert_tool/yolov5s_khadas_coco128_640_f32.tmfile -i /mnt/c/Users/os/Desktop/work/yolov5_khadas/datasets/coco128/images/train2017 -o yolov5s_khadas_coco128_640_uint8.tmfile -g 3,640,640 -w 104.007,116.669,122.679 -s 0.017,0.017,0.017
and changed to v6.0 in tm_yolov5s_timvx.cpp:

#define YOLOV5_VERSION6 1

and rebuilt ./examples/tm_yolov5s_timvx
khadas@Khadas:~/tengine-lite/build$ ./examples/tm_yolov5s_timvx -m ~/our_ai/yolov5s_v6.0_coco_640_res2_env_yolov5_khadas_uint8.tmfile -i ~/ai_models/parking1_640.jpg
tengine-lite library version: 1.5-dev
Tengine: Model compile from bin failed.
Tengine Fatal: Pre-run subgraph(0) on TIMVX failed.
Tengine: Scheduler(sync) prerun failed.
Prerun multithread graph failed.

It gives this error. I tried another dataset and got the same result; only the chess dataset worked. Maybe it is the output shape after all?

yolov5s_khadas_coco128_640_uint8

Could you explain what size1 and size2 are in [bs, ch, size1, size2, num_classes+5]?

And do I need to change them for new datasets?

OptimusFaber commented 1 year ago

The model output is correct. The 20x20 numbers depend on our anchors and the size of the input images; for a 640x640 input they are correct. And it can't be that a model works only with a specific dataset. A dataset is only a collection of pictures, that's all. I checked coco128 and in my case it works wonderfully (5 epochs of training): As you can see, everything works.
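To make size1 and size2 concrete, here is a small sketch of the expected head shapes. It assumes the usual YOLOv5s strides of 8, 16 and 32 and 3 anchors per head, which I have not checked against this particular repository; the function name is hypothetical.

```python
def yolov5_head_shapes(img_size, num_classes, strides=(8, 16, 32), anchors=3, batch=1):
    """Return one [bs, ch, size1, size2, num_classes + 5] shape per detection head."""
    shapes = []
    for stride in strides:
        grid = img_size // stride  # size1 == size2 for a square input
        shapes.append((batch, anchors, grid, grid, num_classes + 5))
    return shapes


print(yolov5_head_shapes(640, 80))   # coco128, 80 classes
print(yolov5_head_shapes(640, 13))   # chess dataset, 13 classes
```

Under these assumptions size1 and size2 follow only from the input size and the stride, so a new dataset changes just the last dimension (num_classes + 5).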

If you want me to help you, please attach the screenshots with your commands from your command line, which you used to train/export/convert/quant your model.

Have a good weekend!

OsamaRyees commented 1 year ago

That is strange. I hope you had a great weekend as well.

I have redone the experiment to show the error on coco128 (not coco), with the same results. Below are screenshots of a new training trial for coco128.

I started by cloning your repo and running pip install -r requirements.txt

then I did training

python train.py --img 640 --batch 8 --epochs 15 --cfg ./models/yolov5s.yaml --weights yolov5s.pt --data data/coco128.yaml --name yolov5s_coco128_new3

train1

train3

train2

export to onnx

python export.py --weights runs/train/yolov5s_coco128_new3/weights/best.pt --include onnx --img 640 --simplify

export

./convert_tool -f onnx -m /mnt/c/Users/os/Desktop/work/yolov5_khadas/yolov5/runs/train/yolov5s_coco128_new3/weights/best.onnx -o yolov5s_khadas_coco128_new3_640_f32.tmfile

convertf32

./quant_tool_uint8 -m ../convert_tool/yolov5s_khadas_coco128_new3_640_f32.tmfile -i /mnt/c/Users/os/Desktop/work/yolov5_khadas/datasets/coco128/images/train2017 -o yolov5s_khadas_coco128_new3_640_uint8.tmfile -g 3,640,640 -w 104.007,116.669,122.679 -s 0.017,0.017,0.017

quant

and finally run:

./examples/tm_yolov5s_timvx -m ~/our_ai/yolov5s_khadas_coco128_new3_640_uint8.tmfile -i ~/ai_models/parking1_640.jpg

run

same results!!!

Later, I retrained on the chess dataset and it also broke with the same error, as yolov5 sometimes updates pip packages when training starts.

Could you please run the following to get your package versions on the PC:

cd yolov5_khadas/yolov5; git log | tail
and for the Python environment you used for training and export:
pip list
cd tengine-lite/build/tools/convert_tool; git log | tail

and on Vim3:

cat /etc/fenix-release
cd ~/tengine-lite; git log | tail
cd ~/TIM-VX; git log | tail
strings /usr/lib/libGAL.so | grep "VERSION"

I want to make sure that I am using the exact same environment, as I suspect that is what got corrupted. Thank you.
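To make this easier to share, the VIM3 commands above could be collected into one report with a small Python script. This is only a sketch: `run_commands` and `VIM3_COMMANDS` are hypothetical names, and the paths assume the same directory layout as in my commands.

```python
import subprocess


def run_commands(commands):
    """Run each shell command and return {command: combined stdout/stderr text}."""
    report = {}
    for cmd in commands:
        # shell=True so that "cd ... && ..." and pipes work as typed in a terminal.
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        report[cmd] = (result.stdout + result.stderr).strip()
    return report


VIM3_COMMANDS = [
    "cat /etc/fenix-release",
    "cd ~/tengine-lite && git log | tail",
    "cd ~/TIM-VX && git log | tail",
    'strings /usr/lib/libGAL.so | grep "VERSION"',
]

if __name__ == "__main__":
    for cmd, output in run_commands(VIM3_COMMANDS).items():
        print(f"$ {cmd}\n{output}\n")
```

A failing command does not stop the script; its error text simply ends up in the report, which is useful when a path differs between the two boards.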

OsamaRyees commented 1 year ago

By the way, I noticed that if you run
python export.py --weights ../../yolov5_v6.0/runs/train/yolov5s_coco128/weights/best.pt --include onnx --img 640 --train --simplify
then you get the same correctly shaped ONNX model for Tengine as suggested here: https://github-com.translate.goog/OAID/Tengine/issues/1220?_x_tr_sl=auto&_x_tr_tl=en&_x_tr_hl=en-US&_x_tr_pto=wapp

export_with_train_option

Here is the exported coco128 ONNX file, trained on the official yolov5_v6.0 (and maybe the latest version).

but it does not work as well for me 🤪

OptimusFaber commented 1 year ago

yep, I knew about -opset, but I decided not to put it there)

Can you also share your cpp file? I think the problems are in it.

My system configuration:


VENDOR=Amlogic
VERSION=1.4
ARCH=arm64
INITRD_ARCH=arm64
INSTALL_TYPE=EMMC
IMAGE_VERSION=1.4-221229
################ GIT VERSION ################
UBOOT_GIT_VERSION=khadas-vims-u-boot-v2015.01-v1.4-release
LINUX_GIT_VERSION=khadas-vims-linux-4.9-v1.4-release
FENIX_GIT_VERSION=v1.4
#############################################

$VERSION$6.4.8:415784$
gcvSTATUS_NEED_CONVERSION
gcvSTATUS_VERSION_MISMATCH
gcvSTATUS_SHADER_VERSION_MISMATCH
_GAL_VERSION

OsamaRyees commented 1 year ago

Sorry, I did not notice your reply, and I almost gave up on this board.

I do not think these files are the issue but I will add them for future reference:

for 128 coco dataset with 80 classes tm_yolov5s_timvx.cpp.txt

for chess dataset with 13 classes tm_yolov5s_chess13_timvx.cpp.txt

I would really appreciate it if you shared the output of the other commands, especially the versions of the tools used in the training and export environment: pip list

thanks

OptimusFaber commented 1 year ago

Hey, I just started working with Khadas again and if you still have problems or questions you can ask. Sorry for late reply

OsamaRyees commented 1 year ago

Hey, no problem.

Maybe a pip list of the host model-conversion environment would be good for future documentation of the onnx, torch, and other package versions.

Also, it would be awesome if you could provide the following, so we know the exact versions that worked:

cd ~/tengine-lite; git log | tail
cd ~/TIM-VX; git log | tail
strings /usr/lib/libGAL.so | grep "VERSION"

thank you very much

OptimusFaber commented 1 year ago

Okay, I'll be able to check it on Thursday. But I found a little problem: I updated the Khadas version and decided to do the same thing with a bunch of new images, but unfortunately it just doesn't predict anything. The Khadas team keeps saying that the quantization process can destroy everything, but that sounds strange. I've quantized the usual YOLO before and everything was always OK.

If you have discord, we can discuss it there

OsamaRyees commented 1 year ago

I do not use Discord. I switched to the Edge2 device and advise you to do the same; it is more stable.

OptimusFaber commented 10 months ago

I know, but I have to work with the VIM3. If you still have one and are okay with experimenting: I tried it with KSNN and it works quite well (can't compare with Jetson, of course). On my own dataset it runs at ~73 ms per frame on a video stream.

OsamaRyees commented 8 months ago

it gave me pain, so never again. but thank you

OptimusFaber / yolov5_khadas

TIM-VX version? #2
