tcpipchip commented 1 week ago

Search before asking

[X] I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

No response

Bug

Hi, I recently got a MILK-V 256, a RISC-V processor. I followed these instructions to run object detection:
https://milkv.io/docs/duo/application-development/tpu/tpu-introduction
https://milkv.io/docs/duo/application-development/tpu/tpu-docker
https://milkv.io/docs/duo/application-development/tpu/tpu-yolov5

Attachment: best.zip

This works very well with the pretrained model https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5n.pt. But when I train my own model on Colab (best.pt) and convert it to run on the MILK-V, I always get a segmentation fault. Attached are train_data.zip (the data I trained with on Colab) and best.pt. On Colab the model works and I can run inference.

Environment

YOLOv5 master running in Docker; all YOLOv5 requirements are satisfied.

Minimal Reproducible Example

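import torch
from models.experimental import attempt_download

# Load the trained checkpoint on CPU and convert the weights to float32
model = torch.load(attempt_download("./best.pt"),
                   map_location=torch.device('cpu'))['model'].float()
model.eval()
model.model[-1].export = True  # switch the last (detection) layer to export mode
# Trace with a fixed 1x3x640x640 input and save the TorchScript file
torch.jit.trace(model, torch.rand(1, 3, 640, 640), strict=False).save('./yolov5n_jit.pt')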

model_transform.py \
--model_name yolov5n \
--model_def ./yolov5n_jit.pt \
--input_shapes [[1,3,640,640]] \
--pixel_format "rgb" \
--keep_aspect_ratio \
--mean 0,0,0 \
--scale 0.0039216,0.0039216,0.0039216 \
--test_input ./cat.jpg \
--test_result yolov5n_top_outputs.npz \
--mlir yolov5n.mlir
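
For reference, the `--mean` and `--scale` flags above describe the `(x - mean) * scale` preprocessing that the toolchain prints in its log, and 0.0039216 is 1/255 rounded to seven decimals, so 8-bit pixel values end up in roughly [0, 1]. A minimal sketch of that arithmetic in plain Python (the helper name is hypothetical and not part of tpu-mlir):

```python
# Illustrative only: mirrors the "(x - mean) * scale" formula shown in the log.
MEAN = (0.0, 0.0, 0.0)
SCALE = (0.0039216, 0.0039216, 0.0039216)  # approximately 1/255 per channel


def normalize_pixel(rgb, mean=MEAN, scale=SCALE):
    """Apply (x - mean) * scale to one RGB pixel (hypothetical helper)."""
    return tuple((x - m) * s for x, m, s in zip(rgb, mean, scale))


if __name__ == "__main__":
    print(round(1 / 255, 7))
    print(normalize_pixel((0, 128, 255)))
```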

run_calibration.py yolov5n.mlir \
 --dataset ./train_data/train/images \
 --input_num 42 \
 -o ./yolov5n_cali_table
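
run_calibration.py raises "There is no inputs" when it finds nothing under `--dataset`, as happened with the wrong relative path `../COCO2017` in the session log further down. A quick pre-check along these lines can confirm that the directory exists and how many images it holds. The helper name and the list of extensions are assumptions, not part of the toolchain:

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset really uses.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(dataset_dir):
    """Return the number of image files directly inside dataset_dir (0 if missing)."""
    root = Path(dataset_dir)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.iterdir()
               if p.is_file() and p.suffix.lower() in IMAGE_EXTS)


if __name__ == "__main__":
    n = count_images("./train_data/train/images")
    print(f"{n} images found")
```

It is probably safest to keep `--input_num` at or below the count this reports.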

model_deploy.py \
 --mlir yolov5n.mlir \
 --quantize INT8 \
 --calibration_table ./yolov5n_cali_table \
 --chip cv181x \
 --test_input ./cat.jpg \
 --test_reference yolov5n_top_outputs.npz \
 --compare_all \
 --fuse_preprocess \
 --debug \
 --model yolov5n_int8_fuse.cvimodel
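
Note that the working run logged further down passes `--tolerance 0.96,0.72` to model_deploy.py, while the command above does not. In that log, npz_tool.py reports a minimum (cosine, euclidean) similarity of about (0.968, 0.744) for the pretrained model, which clears 0.96,0.72 only narrowly. The sketch below shows the comparison as it appears to work: each reported minimum must be at least the corresponding tolerance. The function name is hypothetical and the exact rule inside npz_tool.py is an assumption:

```python
def meets_tolerance(min_similarity, tolerance):
    """True if every similarity metric is at least its tolerance (assumed rule)."""
    return all(s >= t for s, t in zip(min_similarity, tolerance))


# Values copied from the pretrained yolov5n INT8 comparison in the log.
observed = (0.9679524302482605, 0.7443984113616068)

if __name__ == "__main__":
    print(meets_tolerance(observed, (0.96, 0.72)))
    print(meets_tolerance(observed, (0.99, 0.90)))
```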

 ./samples/samples_extra/bin/cvi_sample_detector_yolo_v5_fused_preprocess \
 ./yolov5n_int8_fuse.cvimodel \
 ./face.jpg \
 ./yolov5n_out.jpg

Result: segmentation fault.

It looks like the problem is in my best.pt, because the pretrained yolov5n.pt works fine!
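
One thing that may be worth checking is the shape of the detection head. In the log below, the pretrained model's three outputs have 255 channels on 80x80, 40x40 and 20x20 grids, which matches 3 anchors x (5 + 80 classes) at strides 8, 16 and 32. A model trained on a custom dataset with a different number of classes will have a different channel count. Whether the sample binary tolerates that is an assumption to verify, not something the log confirms. A small sketch of the arithmetic, with hypothetical function names:

```python
def head_channels(num_classes, anchors_per_scale=3):
    """Channels per YOLOv5 detection output: anchors * (x, y, w, h, obj + classes)."""
    return anchors_per_scale * (5 + num_classes)


def head_shapes(num_classes, img_size=640, strides=(8, 16, 32)):
    """Expected (N, C, H, W) of the three raw head outputs for a square input."""
    c = head_channels(num_classes)
    return [(1, c, img_size // s, img_size // s) for s in strides]


if __name__ == "__main__":
    print(head_shapes(80))  # COCO-pretrained yolov5n
    print(head_shapes(2))   # hypothetical 2-class custom model
```

The `ls -l` output below also lists best.pt at 14447400 bytes against 4062133 bytes for the downloaded yolov5n.pt, so best.pt may have been trained from a larger variant than yolov5n. If so, layer names such as 1219, 1234 and 1249 passed to `--output_names` would probably differ as well.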

Additional

Full session using yolov5n.pt (everything works fine):

For more help on how to use Docker, head to https://docs.docker.com/go/guides/
ubuntu@DESKTOP-UHGFA4M:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ubuntu@DESKTOP-UHGFA4M:~$ docker run --privileged --name duotpu -v /workspace -it sophgo/tpuc_dev:v3.1
docker: Error response from daemon: Conflict. The container name "/duotpu" is already in use by container "2a46fc75400fa362ed00811b4ec34bba2612506d3938b0e72f8fabab41350246". You have to remove (or rename) that container to be able to reuse that name.
See 'docker run --help'.
ubuntu@DESKTOP-UHGFA4M:~$ docker run --privileged --name duotpu -v /workspace -it sophgo/tpuc_dev:v3.1
docker: Error response from daemon: Conflict. The container name "/duotpu" is already in use by container "2a46fc75400fa362ed00811b4ec34bba2612506d3938b0e72f8fabab41350246". You have to remove (or rename) that container to be able to reuse that name.
See 'docker run --help'.
ubuntu@DESKTOP-UHGFA4M:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2a46fc75400f sophgo/tpuc_dev:v3.1 "/bin/bash" 2 days ago Up 12 seconds duotpu
ubuntu@DESKTOP-UHGFA4M:~$ docker exec -it 2a46fc75400f /bin/bash
root@2a46fc75400f:/workspace# pytorch
bash: pytorch: command not found
root@2a46fc75400f:/workspace# ls
best.pt master tpu-mlir tpu-sdk yolov5-master yolov5n_torch
root@2a46fc75400f:/workspace# cd yolov5n_torch/
root@2a46fc75400f:/workspace/yolov5n_torch# ls
_weight_map.csv yolov5n_cv181x_int8_sym_final.mlir yolov5n_jit.pt
best.pt yolov5n_cv181x_int8_sym_model_outputs.npz yolov5n_origin.mlir
cat.jpg yolov5n_cv181x_int8_sym_tpu.mlir yolov5n_top_f32_all_origin_weight.npz
train_data yolov5n_cv181x_int8_sym_tpu_outputs.npz yolov5n_top_f32_all_weight.npz
train_data.zip yolov5n_in_f32.npz yolov5n_top_outputs.npz
work yolov5n_in_ori.npz yolov5n_tpu_addressed_cv181x_int8_sym_weight.npz
yolov5n.mlir yolov5n_int8_fuse.cvimodel yolov5n_tpu_addressed_cv181x_int8_sym_weight_fix.npz
yolov5n_cali_table yolov5n_int8_fuse_tensor_info.txt yolov5n_tpu_lowered_cv181x_int8_sym_weight.npz
root@2a46fc75400f:/workspace/yolov5n_torch# ls r
ls: cannot access 'r': No such file or directory
root@2a46fc75400f:/workspace/yolov5n_torch# cd ..
root@2a46fc75400f:/workspace# ls
best.pt master tpu-mlir tpu-sdk yolov5-master yolov5n_torch
root@2a46fc75400f:/workspace# cd yolov5-master/
root@2a46fc75400f:/workspace/yolov5-master# dir
CITATION.cff README.zh-CN.md data main.py segment val.py
CONTRIBUTING.md benchmarks.py detect.py models train.py yolov5n_jit.pt
LICENSE best.pt export.py pyproject.toml tutorial.ipynb
README.md classify hubconf.py requirements.txt utils
root@2a46fc75400f:/workspace/yolov5-master# cat requirements.txt

# YOLOv5 requirements
# Usage: pip install -r requirements.txt

# Base ------------------------------------------------------------------------
gitpython>=3.1.30
matplotlib>=3.3
numpy>=1.23.5
opencv-python>=4.1.1
pillow>=10.3.0
psutil  # system resources
PyYAML>=5.3.1
requests>=2.32.2
scipy>=1.4.1
thop>=0.1.1  # FLOPs computation
torch>=1.8.0  # see https://pytorch.org/get-started/locally (recommended)
torchvision>=0.9.0
tqdm>=4.66.3
ultralytics>=8.2.34  # https://ultralytics.com
# protobuf<=3.20.1  # https://github.com/ultralytics/yolov5/issues/8012

# Logging ---------------------------------------------------------------------
# tensorboard>=2.4.1
# clearml>=1.2.0
# comet

# Plotting --------------------------------------------------------------------
pandas>=1.1.4
seaborn>=0.11.0

# Export ----------------------------------------------------------------------
# coremltools>=6.0  # CoreML export
# onnx>=1.10.0  # ONNX export
# onnx-simplifier>=0.4.1  # ONNX simplifier
# nvidia-pyindex  # TensorRT export
# nvidia-tensorrt  # TensorRT export
# scikit-learn<=1.1.2  # CoreML quantization
# tensorflow>=2.4.0,<=2.13.1  # TF exports (-cpu, -aarch64, -macos)
# tensorflowjs>=3.9.0  # TF.js export
# openvino-dev>=2023.0  # OpenVINO export

# Deploy ----------------------------------------------------------------------
setuptools>=70.0.0  # Snyk vulnerability fix
# tritonclient[all]~=2.24.0

# Extras ----------------------------------------------------------------------
# ipython  # interactive notebook
# mss  # screenshots
# albumentations>=1.0.3
# pycocotools>=2.0.6  # COCO mAP

root@2a46fc75400f:/workspace/yolov5-master# nano requirements.txt root@2a46fc75400f:/workspace/yolov5-master# pip install -r requirements.txt Requirement already satisfied: gitpython>=3.1.30 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 5)) (3.1.32) Requirement already satisfied: matplotlib>=3.3 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 6)) (3.7.2) Requirement already satisfied: numpy>=1.23.5 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 7)) (1.24.3) Requirement already satisfied: opencv-python>=4.1.1 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 8)) (4.8.0.74) Requirement already satisfied: pillow>=10.3.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 9)) (11.0.0) Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 10)) (5.9.5) Requirement already satisfied: PyYAML>=5.3.1 in /usr/lib/python3/dist-packages (from -r requirements.txt (line 11)) (5.4.1) Requirement already satisfied: requests>=2.32.2 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 12)) (2.32.3) Requirement already satisfied: scipy>=1.4.1 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 13)) (1.11.1) Requirement already satisfied: thop>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 14)) (0.1.1.post2209072238) Requirement already satisfied: torch>=1.8.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 15)) (2.0.1+cpu) Requirement already satisfied: torchvision>=0.9.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 16)) (0.15.2+cpu) Requirement already satisfied: tqdm>=4.66.3 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 17)) (4.67.0) Requirement already satisfied: ultralytics>=8.2.34 in /usr/local/lib/python3.10/dist-packages (from 
-r requirements.txt (line 18)) (8.3.28) Requirement already satisfied: pandas>=1.1.4 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 27)) (2.0.3) Requirement already satisfied: seaborn>=0.11.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 28)) (0.13.2) Requirement already satisfied: setuptools>=70.0.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 42)) (75.3.0) Requirement already satisfied: gitdb<5,>=4.0.1 in /usr/local/lib/python3.10/dist-packages (from gitpython>=3.1.30->-r requirements.txt (line 5)) (4.0.10) Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (1.1.0) Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (2.8.2) Requirement already satisfied: pyparsing<3.1,>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (3.0.9) Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (4.42.1) Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (0.11.0) Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (1.4.5) Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=3.3->-r requirements.txt (line 6)) (23.1) Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->-r requirements.txt (line 12)) (3.4) Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->-r requirements.txt (line 12)) (3.2.0) Requirement 
already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->-r requirements.txt (line 12)) (1.26.16) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->-r requirements.txt (line 12)) (2023.7.22) Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (1.12) Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (4.5.0) Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (3.1.2) Requirement already satisfied: filelock in /usr/lib/python3/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (3.6.0) Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.8.0->-r requirements.txt (line 15)) (3.1) Requirement already satisfied: py-cpuinfo in /usr/local/lib/python3.10/dist-packages (from ultralytics>=8.2.34->-r requirements.txt (line 18)) (9.0.0) Requirement already satisfied: ultralytics-thop>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from ultralytics>=8.2.34->-r requirements.txt (line 18)) (2.0.11) Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=1.1.4->-r requirements.txt (line 27)) (2023.3) Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=1.1.4->-r requirements.txt (line 27)) (2023.3) Requirement already satisfied: smmap<6,>=3.0.1 in /usr/local/lib/python3.10/dist-packages (from gitdb<5,>=4.0.1->gitpython>=3.1.30->-r requirements.txt (line 5)) (5.0.0) Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.7->matplotlib>=3.3->-r requirements.txt (line 6)) (1.16.0) Requirement already satisfied: MarkupSafe>=2.0 in 
/usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.8.0->-r requirements.txt (line 15)) (2.1.3) Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.8.0->-r requirements.txt (line 15)) (1.3.0) WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv root@2a46fc75400f:/workspace/yolov5-master# ls CITATION.cff README.zh-CN.md data main.py segment val.py CONTRIBUTING.md benchmarks.py detect.py models train.py yolov5n_jit.pt LICENSE best.pt export.py pyproject.toml tutorial.ipynb README.md classify hubconf.py requirements.txt utils root@2a46fc75400f:/workspace/yolov5-master# nano main.py root@2a46fc75400f:/workspace/yolov5-master# root@2a46fc75400f:/workspace/yolov5-master# root@2a46fc75400f:/workspace/yolov5-master# root@2a46fc75400f:/workspace/yolov5-master# wget https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5n.pt --2024-11-11 19:18:11-- https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5n.pt Resolving github.com (github.com)... 20.201.28.151 Connecting to github.com (github.com)|20.201.28.151|:443... connected. HTTP request sent, awaiting response... 
302 Found Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/264818686/3444cd1f-277c-414f-bdc9-3ac8ed6062df?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20241111%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241111T111811Z&X-Amz-Expires=300&X-Amz-Signature=b7761184e059f5a596b94e432bf731d13dc16857dab233d44d18080fc0f23350&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dyolov5n.pt&response-content-type=application%2Foctet-stream [following] --2024-11-11 19:18:11-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/264818686/3444cd1f-277c-414f-bdc9-3ac8ed6062df?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20241111%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20241111T111811Z&X-Amz-Expires=300&X-Amz-Signature=b7761184e059f5a596b94e432bf731d13dc16857dab233d44d18080fc0f23350&X-Amz-SignedHeaders=host&response-content-disposition=attachment%3B%20filename%3Dyolov5n.pt&response-content-type=application%2Foctet-stream Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ... Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 4062133 (3.9M) [application/octet-stream] Saving to: ‘yolov5n.pt’

yolov5n.pt 100%[================================================>] 3.87M 8.31MB/s in 0.5s

2024-11-11 19:18:12 (8.31 MB/s) - ‘yolov5n.pt’ saved [4062133/4062133]

root@2a46fc75400f:/workspace/yolov5-master# cat main.py
import torch
from models.experimental import attempt_download
model = torch.load(attempt_download("./yolov5n.pt"),
map_location=torch.device('cpu'))['model'].float()
model.eval()
model.model[-1].export = True
torch.jit.trace(model, torch.rand(1, 3, 640, 640), strict=False).save('./yolov5n_jit.pt')
root@2a46fc75400f:/workspace/yolov5-master# python main.py
root@2a46fc75400f:/workspace/yolov5-master# cp yolov5n_jit.pt /workspace/yolov5-master/^C
root@2a46fc75400f:/workspace/yolov5-master# cd ..
root@2a46fc75400f:/workspace# cd yolov5n_torch
root@2a46fc75400f:/workspace/yolov5n_torch# cp /workspace/yolov5-master/yolov5n_jit.pt .
root@2a46fc75400f:/workspace/yolov5n_torch# source ./tpu-mlir/envsetup.sh
bash: ./tpu-mlir/envsetup.sh: No such file or directory
root@2a46fc75400f:/workspace/yolov5n_torch# cd ..
root@2a46fc75400f:/workspace# source ./tpu-mlir/envsetup.sh
root@2a46fc75400f:/workspace# cd yolov5n_torch/
root@2a46fc75400f:/workspace/yolov5n_torch# cp -rf ${TPUC_ROOT}/regression/dataset/COCO2017 .
root@2a46fc75400f:/workspace/yolov5n_torch# cp -rf ${TPUC_ROOT}/regression/image .
root@2a46fc75400f:/workspace/yolov5n_torch# model_transform.py \
--model_name yolov5n \
--model_def ../yolov5n_jit.pt \
--input_shapes [[1,3,640,640]] \
--pixel_format "rgb" \
--keep_aspect_ratio \
--mean 0,0,0 \
--scale 0.0039216,0.0039216,0.0039216 \
--test_input ../image/dog.jpg \
--test_result yolov5n_top_outputs.npz \
--output_names 1219,1234,1249 \
--mlir yolov5n.mlir
SOPHGO Toolchain v1.3.228-g19ca95e9-20230921
2024/11/11 19:22:19 - INFO :

    | preprocess:                                           |
    |   (x - mean) * scale                                  |
    '-------------------------------------------------------'

config Preprocess args :
    resize_dims           : same to net input dims
    keep_aspect_ratio     : True
    keep_ratio_mode       : letterbox
    pad_value             : 0
    pad_type              : center

    mean                  : [0.0, 0.0, 0.0]
    scale                 : [0.0039216, 0.0039216, 0.0039216]
    --------------------------
    pixel_format          : rgb
    channel_format        : nchw

Traceback (most recent call last):
  File "/workspace/tpu-mlir/python/tools/model_transform.py", line 272, in <module>
    tool = get_model_transform(args)
  File "/workspace/tpu-mlir/python/tools/model_transform.py", line 232, in get_model_transform
    tool = TorchTransformer(args.model_name, args.model_def, args.input_shapes,
  File "/workspace/tpu-mlir/python/tools/model_transform.py", line 204, in __init__
    self.converter = TorchConverter(self.model_name, self.model_def, input_shapes, input_types,
  File "/workspace/tpu-mlir/python/transform/TorchConverter.py", line 55, in __init__
    self.load_torch_model(torch_file, input_shapes, input_types, output_names)
  File "/workspace/tpu-mlir/python/transform/TorchConverter.py", line 251, in load_torch_model
    self.model = torch.jit.load(torch_file, map_location=torch.device('cpu'))
  File "/usr/local/lib/python3.10/dist-packages/torch/jit/_serialization.py", line 152, in load
    raise ValueError("The provided filename {} does not exist".format(f))  # type: ignore[str-bytes-safe]
ValueError: The provided filename ../yolov5n_jit.pt does not exist
root@2a46fc75400f:/workspace/yolov5n_torch# model_transform.py \
--model_name yolov5n \
--model_def ./yolov5n_jit.pt \
--input_shapes [[1,3,640,640]] \
--pixel_format "rgb" \
--keep_aspect_ratio \
--mean 0,0,0 \
--scale 0.0039216,0.0039216,0.0039216 \
--test_input ./image/dog.jpg \
--test_result yolov5n_top_outputs.npz \
--output_names 1219,1234,1249 \
--mlir yolov5n.mlir
SOPHGO Toolchain v1.3.228-g19ca95e9-20230921
2024/11/11 19:23:07 - INFO :

    | preprocess:                                           |
    |   (x - mean) * scale                                  |
    '-------------------------------------------------------'

config Preprocess args :
    resize_dims           : same to net input dims
    keep_aspect_ratio     : True
    keep_ratio_mode       : letterbox
    pad_value             : 0
    pad_type              : center

    mean                  : [0.0, 0.0, 0.0]
    scale                 : [0.0039216, 0.0039216, 0.0039216]
    --------------------------
    pixel_format          : rgb
    channel_format        : nchw

Save mlir file: yolov5n_origin.mlir [Running]: tpuc-opt yolov5n_origin.mlir --shape-infer --canonicalize --extra-optimize -o yolov5n.mlir [Success]: tpuc-opt yolov5n_origin.mlir --shape-infer --canonicalize --extra-optimize -o yolov5n.mlir Mlir file generated:yolov5n.mlir 2024/11/11 19:23:10 - INFO : load_config Preprocess args : resize_dims : [640, 640] keep_aspect_ratio : True keep_ratio_mode : letterbox pad_value : 0 pad_type : center input_dims : [640, 640]

    mean                  : [0.0, 0.0, 0.0]
    scale                 : [0.0039216, 0.0039216, 0.0039216]
    --------------------------
    pixel_format          : rgb
    channel_format        : nchw
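
The `keep_aspect_ratio : True`, `keep_ratio_mode : letterbox` and `pad_type : center` settings above mean the input image is resized to fit inside 640x640 without distortion, and the remaining area is padded evenly on both sides. A rough sketch of that geometry follows. The function name is hypothetical and the rounding choices are assumptions that may differ from what tpu-mlir does internally:

```python
def letterbox_geometry(src_w, src_h, dst_w=640, dst_h=640):
    """Return (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom)."""
    # Use the smaller ratio so the whole image fits inside the target.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, round(src_w * scale))
    new_h = min(dst_h, round(src_h * scale))
    pad_w, pad_h = dst_w - new_w, dst_h - new_h
    # Center padding: split the leftover pixels between both sides.
    pad_left, pad_top = pad_w // 2, pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


if __name__ == "__main__":
    print(letterbox_geometry(1280, 720))
```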

[CMD]: model_runner.py --input yolov5n_in_f32.npz --model ./yolov5n_jit.pt --output yolov5n_ref_outputs.npz 80: 100%|████████████████████████████████████████████████████████████████████████| 1230/1230 [00:01<00:00, 1134.76it/s] Saving yolov5n_ref_outputs.npz [CMD]: model_runner.py --input yolov5n_in_f32.npz --model yolov5n.mlir --output yolov5n_top_outputs.npz [##################################################] 100% Saving yolov5n_top_outputs.npz [Running]: npz_tool.py compare yolov5n_top_outputs.npz yolov5n_ref_outputs.npz --tolerance 0.99,0.99 --except - -vv compare 1249: 100%|█████████████████████████████████████████████████████████████████▋| 199/200 [00:06<00:00, 37.17it/s][x.1 ] EQUAL [PASSED] (1, 3, 640, 640) float32 [input.62 ] SIMILAR [PASSED] (1, 16, 320, 320) float32 cosine_similarity = 1.000000 euclidean_similarity = 0.999999 sqnr_similarity = 123.899231 [input.26 ] SIMILAR [PASSED] (1, 16, 320, 320) float32 cosine_similarity = 1.000000 euclidean_similarity = 1.000000 sqnr_similarity = 127.972746 [103 ] SIMILAR [PASSED] (1, 16, 320, 320) float32 cosine_similarity = 1.000000 euclidean_similarity = 1.000000 sqnr_similarity = 122.331476 [input.60 ] SIMILAR [PASSED] ... (1, 255, 80, 80) float32 cosine_similarity = 1.000000 euclidean_similarity = 1.000000 sqnr_similarity = 119.547615 [1234 ] SIMILAR [PASSED] (1, 255, 40, 40) float32 cosine_similarity = 1.000000 euclidean_similarity = 1.000000 sqnr_similarity = 118.485079 [1249 ] SIMILAR [PASSED] (1, 255, 20, 20) float32 cosine_similarity = 1.000000 euclidean_similarity = 1.000000 sqnr_similarity = 118.825769 200 compared 200 passed 1 equal, 3 close, 196 similar 0 failed 0 not equal, 0 not similar min_similiarity = (0.9999997615814209, 0.999998192529101, 114.64153289794922) Target yolov5n_top_outputs.npz Reference yolov5n_ref_outputs.npz npz compare PASSED. 
compare 1249: 100%|██████████████████████████████████████████████████████████████████| 200/200 [00:08<00:00, 24.98it/s] [Success]: npz_tool.py compare yolov5n_top_outputs.npz yolov5n_ref_outputs.npz --tolerance 0.99,0.99 --except - -vv root@2a46fc75400f:/workspace/yolov5n_torch# run_calibration.py yolov5n.mlir \

--dataset ../COCO2017 \
--input_num 100 \
-o ./yolov5n_cali_table
SOPHGO Toolchain v1.3.228-g19ca95e9-20230921
Traceback (most recent call last):
  File "/workspace/tpu-mlir/python/tools/run_calibration.py", line 36, in <module>
    selector = DataSelector(args.dataset, args.input_num, args.data_list)
  File "/workspace/tpu-mlir/python/calibration/data_selector.py", line 47, in __init__
    raise RuntimeError("There is no inputs")
RuntimeError: There is no inputs
root@2a46fc75400f:/workspace/yolov5n_torch# run_calibration.py yolov5n.mlir \
--dataset ./COCO2017 \
--input_num 100 \
-o ./yolov5n_cali_table
SOPHGO Toolchain v1.3.228-g19ca95e9-20230921
GmemAllocator use OpSizeOrderAssign
reused mem is 3276800, all mem is 43767600
2024/11/11 19:24:09 - INFO :
load_config Preprocess args :
    resize_dims           : [640, 640]
    keep_aspect_ratio     : True
    keep_ratio_mode       : letterbox
    pad_value             : 0
    pad_type              : center
    input_dims            : [640, 640]

    mean                  : [0.0, 0.0, 0.0]
    scale                 : [0.0039216, 0.0039216, 0.0039216]
    --------------------------
    pixel_format          : rgb
    channel_format        : nchw

last input data (idx=100) not valid, droped
input_num = 100, ref = 100
real input_num = 100
activation_collect_and_calc_th for op: 1249: 100%|███████████████████████████████████| 200/200 [04:25<00:00, 1.33s/it]
[2048] threshold: 1249: 100%|███████████████████████████████████████████████████████| 200/200 [00:00<00:00, 235.10it/s]
GmemAllocator use OpSizeOrderAssign
reused mem is 3276800, all mem is 43767600
GmemAllocator use OpSizeOrderAssign
reused mem is 3276800, all mem is 43767600
prepare data from 100
tune op: 1249: 100%|█████████████████████████████████████████████████████████████████| 200/200 [07:13<00:00, 2.17s/it]
auto tune end, run time:433.61561346054077
root@2a46fc75400f:/workspace/yolov5n_torch# model_deploy.py \
--mlir yolov5n.mlir \
--quantize INT8 \
--calibration_table ./yolov5n_cali_table \
--chip cv181x \
--test_input ./image/dog.jpg \
--test_reference yolov5n_top_outputs.npz \
--compare_all \
--tolerance 0.96,0.72 \
--fuse_preprocess \
--debug \
--model yolov5n_int8_fuse.cvimodel
SOPHGO Toolchain v1.3.228-g19ca95e9-20230921
2024/11/11 19:37:39 - INFO :
load_config Preprocess args :
    resize_dims           : [640, 640]
    keep_aspect_ratio     : True
    keep_ratio_mode       : letterbox
    pad_value             : 0
    pad_type              : center
    input_dims            : [640, 640]

    mean                  : [0.0, 0.0, 0.0]
    scale                 : [0.0039216, 0.0039216, 0.0039216]
    --------------------------
    pixel_format          : rgb
    channel_format        : nchw

Add preprocess, set the following params: 2024/11/11 19:37:39 - INFO :

    | preprocess:                                           |
    |   (x - mean) * scale                                  |
    '-------------------------------------------------------'

config Preprocess args : resize_dims : [640, 640] keep_aspect_ratio : True keep_ratio_mode : letterbox pad_value : 0 pad_type : center

    mean                  : [0.0, 0.0, 0.0]
    scale                 : [1.0, 1.0, 1.0]
    --------------------------
    pixel_format          : rgb
    channel_format        : nchw

[Running]: tpuc-opt yolov5n.mlir --chip-assign="chip=cv181x" --import-calibration-table="file=./yolov5n_cali_table asymmetric=False" --chip-top-optimize --fuse-preprocess="mode=INT8 customization_format=RGB_PLANAR align=False" --convert-top-to-tpu="mode=INT8 asymmetric=False linear_quant_mode=NORMAL doWinograd=False ignore_f16_overflow=False" --canonicalize -o yolov5n_cv181x_int8_sym_tpu.mlir Entering FusePreprocessPass. Inserting ScalelutOp. [Success]: tpuc-opt yolov5n.mlir --chip-assign="chip=cv181x" --import-calibration-table="file=./yolov5n_cali_table asymmetric=False" --chip-top-optimize --fuse-preprocess="mode=INT8 customization_format=RGB_PLANAR align=False" --convert-top-to-tpu="mode=INT8 asymmetric=False linear_quant_mode=NORMAL doWinograd=False ignore_f16_overflow=False" --canonicalize -o yolov5n_cv181x_int8_sym_tpu.mlir [CMD]: model_runner.py --input yolov5n_in_ori.npz --model yolov5n_cv181x_int8_sym_tpu.mlir --output yolov5n_cv181x_int8_sym_tpu_outputs.npz [##################################################] 100% [Running]: npz_tool.py compare yolov5n_cv181x_int8_sym_tpu_outputs.npz yolov5n_top_outputs.npz --tolerance 0.96,0.72 --except - -vv compare 1249: 99%|█████████████████████████████████████████████████████████████████▌| 141/142 [00:05<00:00, 21.14it/s][input.26 ] SIMILAR [PASSED] (1, 16, 320, 320) float32 cosine_similarity = 0.999769 euclidean_similarity = 0.978254 sqnr_similarity = 32.948797 [103 ] SIMILAR [PASSED] (1, 16, 320, 320) float32 cosine_similarity = 0.999255 euclidean_similarity = 0.961272 sqnr_similarity = 24.556572 ... 
(1, 255, 40, 40) float32 cosine_similarity = 0.999221 euclidean_similarity = 0.959803 sqnr_similarity = 18.724862 [1249 ] SIMILAR [PASSED] (1, 255, 20, 20) float32 cosine_similarity = 0.999214 euclidean_similarity = 0.960290 sqnr_similarity = 18.388116 142 compared 142 passed 0 equal, 0 close, 142 similar 0 failed 0 not equal, 0 not similar min_similiarity = (0.9679524302482605, 0.7443984113616068, 11.602303981781006) Target yolov5n_cv181x_int8_sym_tpu_outputs.npz Reference yolov5n_top_outputs.npz npz compare PASSED. compare 1249: 100%|██████████████████████████████████████████████████████████████████| 142/142 [00:06<00:00, 22.79it/s] [Success]: npz_tool.py compare yolov5n_cv181x_int8_sym_tpu_outputs.npz yolov5n_top_outputs.npz --tolerance 0.96,0.72 --except - -vv [Running]: tpuc-opt yolov5n_cv181x_int8_sym_tpu.mlir --mlir-disable-threading --strip-io-quant="quant_input=False quant_output=False" --chip-tpu-optimize --distribute='num_device=1' --weight-reorder --subnet-divide="dynamic=False" --op-reorder --layer-group="opt=2" --parallel='num_core=1' --address-assign -o yolov5n_cv181x_int8_sym_final.mlir ==---------------------------== Run LayerGroupSearchPass : Searching the optimal layer groups ==---------------------------==

======================================================= * Dynamic Programming layer group with cluster

total num of base_group is 7 clusters idx(size): 0(1), 1(2), 3(2), 5(2), 7(2), 9(2), 11(2), 13(1), 14(1), 15(2), 17(2), 19(2), 21(2), 23(2), 25(2), 27(2), 29(2), 31(2), 33(1), 34(2), 36(2), 38(2), 40(2), 42(2), 44(2), 46(2), 48(2), 50(2), 52(2), 54(2), 56(1), 57(1), 58(2), 60(1), 61(2), 63(2), 65(2), 67(2), 69(2), 71(2), 73(2), 75(2), 77(2), 79(2), 81(2), 83(2), 85(2), 87(2), 89(2), 91(2), 93(1), 94(1), 95(2), 97(2), 99(2), 101(2), 103(2), 105(2), 107(2), 109(1), 110(2), 112(2), 114(2), 116(2), 118(2), 120(2), 122(1), 123(1), 124(2), 126(2), 128(2), 130(2), 132(2), 134(2), 136(1), 137(2), 139(2), process base group 0, layer_num=141, cluster_num=77 Searching best group slices... [#################################################] 100% clusters idx(size): 0(1), process base group 1, layer_num=1, cluster_num=1 clusters idx(size): 0(1), process base group 2, layer_num=1, cluster_num=1 clusters idx(size): 0(1), process base group 3, layer_num=1, cluster_num=1 clusters idx(size): 0(1), process base group 4, layer_num=1, cluster_num=1 clusters idx(size): 0(1), process base group 5, layer_num=1, cluster_num=1 clusters idx(size): 0(1), process base group 6, layer_num=1, cluster_num=1

Consider redundant computation and gdma cost

The final cost of the two group is 1182594 //// Group cost 1182594, optimal cut idx 139 The final cost of the two group is 1116710 //// Group cost 1116710, optimal cut idx 138 The final cost of the two group is 1315164 The final cost of the two group is 970894 //// Group cost 970894, optimal cut idx 137 The final cost of the two group is 866493 //// Group cost 866493, optimal cut idx 136 The final cost of the two group is 877481 The final cost of the two group is 941308 The final cost of the two group is 892746 The pre cost of the two group is 898167 The final cost of the two group is 901710 //// Group cost 901710, optimal cut idx 132 The final cost of the two group is 832079 .... The final cost of the two group is 4092392 //// Group cost 4092392, optimal cut idx 0

Merge cut idx to reduce gdma cost

==---------------------------==
Run GroupPostTransformPass : Some transform after layer groups is determined
==---------------------------==
==---------------------------==
Run TimeStepAssignmentPass : Assign timestep task for each group.
==---------------------------==
==---------------------------==
Run LocalMemoryAllocationPass : Allocate local memory for all layer groups
==---------------------------==
==---------------------------==
Run TimeStepCombinePass : Combine time step for better parallel balance
==---------------------------==
==---------------------------==
Run GroupDataMoveOverlapPass : Overlap data move between two layer group
==---------------------------==
GmemAllocator use OpSizeOrderAssign
[Success]: tpuc-opt yolov5n_cv181x_int8_sym_tpu.mlir --mlir-disable-threading --strip-io-quant="quant_input=False quant_output=False" --chip-tpu-optimize --distribute='num_device=1' --weight-reorder --subnet-divide="dynamic=False" --op-reorder --layer-group="opt=2" --parallel='num_core=1' --address-assign -o yolov5n_cv181x_int8_sym_final.mlir
[Running]: tpuc-opt yolov5n_cv181x_int8_sym_final.mlir --codegen="model_file=yolov5n_int8_fuse.cvimodel embed_debug_info=true model_version=latest" -o /dev/null
[oc_pos=32] cur_oc 8, stepSize 1024, compressedSize 1040, SKIP
[Success]: tpuc-opt yolov5n_cv181x_int8_sym_final.mlir --codegen="model_file=yolov5n_int8_fuse.cvimodel embed_debug_info=true model_version=latest" -o /dev/null
[CMD]: model_runner.py --input yolov5n_in_ori.npz --model yolov5n_int8_fuse.cvimodel --output yolov5n_cv181x_int8_sym_model_outputs.npz
setenv:cv181x
Start TPU Simulator for cv181x
device[0] opened, 4294967296
version: 1.4.0
yolov5n Build at 2024-11-11 19:37:51 For platform cv181x
Cmodel: bm_load_cmdbuf
Max SharedMem size:2457600
Cmodel: bm_run_cmdbuf
device[0] closed
[Running]: npz_tool.py compare yolov5n_cv181x_int8_sym_model_outputs.npz yolov5n_cv181x_int8_sym_tpu_outputs.npz --tolerance 0.99,0.90 --except - -vv
compare 1249_f32: 88%|█████████████████████████████████████████████████████████▊ | 7/8 [00:00<00:00, 69.16it/s]
[964 ] EQUAL [PASSED] (1, 64, 80, 80) float32
[1081 ] EQUAL [PASSED] (1, 128, 40, 40) float32
[input.1 ] EQUAL [PASSED] (1, 256, 20, 20) float32
[1198 ] EQUAL [PASSED] (1, 256, 20, 20) float32
[1219_f32 ] EQUAL [PASSED] (1, 255, 80, 80) float32
[1234_f32 ] EQUAL [PASSED] (1, 255, 40, 40) float32
[1249 ] EQUAL [PASSED] (1, 255, 20, 20) float32
[1249_f32 ] EQUAL [PASSED] (1, 255, 20, 20) float32
8 compared
8 passed
8 equal, 0 close, 0 similar
0 failed
0 not equal, 0 not similar
min_similiarity = (1.0, 1.0, inf)
Target yolov5n_cv181x_int8_sym_model_outputs.npz
Reference yolov5n_cv181x_int8_sym_tpu_outputs.npz
npz compare PASSED.
compare 1249_f32: 100%|██████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 27.38it/s]
[Success]: npz_tool.py compare yolov5n_cv181x_int8_sym_model_outputs.npz yolov5n_cv181x_int8_sym_tpu_outputs.npz --tolerance 0.99,0.90 --except - -vv

root@2a46fc75400f:/workspace/yolov5n_torch# scp -r /workspace/tpu-sdk root@192.168.42.1:/mnt/tpu/
root@192.168.42.1's password:
OpenCVModules-release.cmake 100% 2053 402.1KB/s 00:00
haarcascade_eye.xml 100% 333KB 2.7MB/s 00:00
haarcascade_smile.xml 100% 184KB 2.7MB/s 00:00
....
libcvimath-static.a 100% 172KB 2.6MB/s 00:00
libcviruntime.so 100% 574KB 2.9MB/s 00:00

root@2a46fc75400f:/workspace/yolov5n_torch# scp /workspace/yolov5n_torch/yolov5n_int8_fuse.cvimodel root@192.168.42.1:/mnt/tpu/tpu-sdk/
root@192.168.42.1's password:
yolov5n_int8_fuse.cvimodel 100% 2158KB 2.9MB/s 00:00

root@2a46fc75400f:/workspace/yolov5n_torch# ls -l
total 389176
drwxr-xr-x 2 root root 4096 Nov 11 19:21 COCO2017
-rw-r--r-- 1 root root 12398 Nov 11 19:37 _weight_map.csv
-rwxr-xr-x 1 root root 14447400 Nov 9 09:00 best.pt
-rwxr-xr-x 1 root root 40717 Oct 29 07:42 cat.jpg
drwxr-xr-x 2 root root 4096 Nov 11 19:21 image
drwxr-xr-x 5 root root 4096 Nov 7 14:12 train_data
-rwxr-xr-x 1 root root 2524205 Nov 8 01:36 train_data.zip
drwxr-xr-x 2 root root 4096 Nov 9 10:02 work
-rw-r--r-- 1 root root 64711 Nov 11 19:23 yolov5n.mlir
-rw-r--r-- 1 root root 8011 Nov 11 19:35 yolov5n_cali_table
-rw-r--r-- 1 root root 2210112 Nov 11 19:37 yolov5n_int8_fuse.cvimodel

root@2a46fc75400f:/workspace/yolov5n_torch#

Now using best.pt:

model_deploy.py \
 --mlir yolov5n.mlir \
 --quantize INT8 \
 --calibration_table ./yolov5n_cali_table \
 --chip cv181x \
 --test_input ./cat.jpg \
 --test_reference yolov5n_top_outputs.npz \
 --compare_all \
 --fuse_preprocess \
 --debug \
 --model yolov5n_int8_fuse.cvimodel

Thank you!

Are you willing to submit a PR?

[ ] Yes I'd like to help by submitting a PR!

UltralyticsAssistant commented 1 week ago

👋 Hello @tcpipchip, thank you for your interest in YOLOv5 🚀! It looks like you're working with the MILK-V 256, a RISC-V processor, and encountering a segmentation fault when running your exported model. No worries, we're here to help! 😊

For 🐛 Bug Reports like this, a minimum reproducible example is crucial, and you've done a great job in providing detailed steps and descriptions! This helps us understand the issue you're facing better. An Ultralytics engineer will review your report and assist you soon.

In the meantime, please verify you've set up your environment correctly:

Requirements

Ensure you have Python>=3.8.0 installed with all the relevant libraries from the requirements.txt and importantly, make sure you are using PyTorch>=1.8. To ensure everything is set up correctly, you might want to recreate your environment from scratch:

Clone the YOLOv5 repository.
Navigate into the cloned directory and install the necessary dependencies using the requirements file.
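
A minimal sketch of how the version requirements above (Python>=3.8.0, PyTorch>=1.8) could be checked before exporting. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of YOLOv5; the check only looks at the leading numeric part of a version string, so a local build suffix such as `+cu118` is ignored.

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that "1.8" and "1.8.0" compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    print("Python", python_version, "ok:", meets_minimum(python_version, "3.8.0"))
    try:
        import torch  # optional: only checked when PyTorch is installed
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch", torch.__version__, "ok:", meets_minimum(torch.__version__, "1.8"))
```

Running the script inside the Docker container used for conversion, and again in the training environment, gives a quick way to see whether the two environments differ.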

Environments

YOLOv5 runs smoothly in various environments like notebooks (Google Colab, Kaggle, etc.), cloud environments (Google Cloud, Amazon Web Services), or using Docker images with all dependencies pre-installed. Ensure your environment is up-to-date and configured correctly, including CUDA, cuDNN, Python, and PyTorch installations, particularly if you are leveraging GPU resources.

Debugging Tips

Double-check your export process, especially any parameters or configurations specific to the MILK-V setup.
Compare your Colab-trained model's configuration and preprocessing steps with those of the pre-trained yolov5n.pt as closely as possible.
If applicable, verify the Docker setup and try to eliminate any conflicts or errors present.

Stay tuned, and thank you for providing a comprehensive report! 📝 If there's anything else you can share about the exact error message or log outputs, feel free to add that information here. Our team is eager to assist you further! 🚀

tcpipchip commented 1 week ago

Yes, the requirements are OK: Python and PyTorch. Could you please train on my images and send me your .pt version?

pderrenger commented 1 week ago

@tcpipchip I'm sorry, but we can't provide private training services. However, you can follow our Train Custom Data guide to train your model. If you encounter issues, feel free to ask for help here.

tcpipchip commented 1 week ago

But do you have any tips about my problem?

pderrenger commented 1 week ago

It seems like the issue might be related to the conversion process of your custom model. Ensure your model's architecture matches the pre-trained model you successfully converted, and double-check the conversion steps for any discrepancies.
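
One concrete way to compare architectures is to look at the detection output shapes. The comparison log earlier in this issue lists outputs of shape (1, 255, 80, 80), (1, 255, 40, 40) and (1, 255, 20, 20); in stock YOLOv5 with 3 anchors per scale, 255 equals 3 * (5 + 80), i.e. the 80-class COCO layout. A custom model with a different class count has a different channel count, so post-processing code built for 80 classes may not match it. The sketch below only does that arithmetic; the function names are hypothetical, and the thread does not establish that a class-count mismatch caused the segmentation fault.

```python
def detect_head_channels(num_classes, anchors_per_scale=3):
    """Channels of one YOLOv5 detection output: anchors * (x, y, w, h, objectness + classes)."""
    return anchors_per_scale * (5 + num_classes)


def classes_from_channels(channels, anchors_per_scale=3):
    """Invert detect_head_channels; raise if the channel count does not fit the layout."""
    per_anchor, remainder = divmod(channels, anchors_per_scale)
    if remainder or per_anchor < 6:
        raise ValueError(
            f"{channels} channels do not fit {anchors_per_scale} anchors * (5 + classes)"
        )
    return per_anchor - 5


if __name__ == "__main__":
    # Channel count seen in the comparison log of this issue.
    print("classes implied by 255 channels:", classes_from_channels(255))
    # Channel count to expect from a hypothetical single-class custom model.
    print("channels for a 1-class model:", detect_head_channels(1))
```

If the channel count of the converted custom model differs from 255, the class count expected by the on-device sample program is worth checking.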

tcpipchip commented 1 week ago

I am now investigating whether the image size is the cause, and testing with other third-party pre-trained .pt files.

pderrenger commented 1 week ago

Testing with different image sizes and pre-trained models is a good approach. Ensure that the input dimensions match those expected by the model, and verify compatibility with the latest YOLOv5 version. If issues persist, consider checking the model's architecture and conversion process for inconsistencies.
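
For checking input dimensions, it can help to work out what a letterbox resize does to a given image. The conversion commands in this issue use `--keep_aspect_ratio` with a 640x640 network input, and the tool log reports centered padding. The sketch below computes the resized size and padding under those assumptions; `letterbox_geometry` is a hypothetical helper, and the rounding used by the actual toolchain may differ by a pixel.

```python
def letterbox_geometry(src_w, src_h, dst_w=640, dst_h=640):
    """Compute the scaled size and centered padding for an aspect-preserving resize."""
    # A single scale factor for both axes keeps the aspect ratio.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    # Centered padding: the remaining space is split between both sides.
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return {
        "scale": scale,
        "resized": (new_w, new_h),
        "pad_left": pad_left,
        "pad_top": pad_top,
    }


if __name__ == "__main__":
    # Example: a 1280x720 frame placed into a 640x640 network input.
    print(letterbox_geometry(1280, 720))
```

The same scale and padding values are needed in reverse to map predicted boxes back to the original image coordinates.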

tcpipchip commented 6 days ago

Got it working after 100 hours of trying: https://milk-v.blogspot.com/2024/10/milk-v-yolov5-criando-dataset.html

pderrenger commented 6 days ago

@tcpipchip glad to hear you resolved your issue with YOLOv5 on the MILK-V TPU! For others who might encounter similar challenges with custom model deployment on TPU devices, I recommend checking our model export guide to understand the correct conversion steps and requirements for various hardware targets.

tcpipchip commented 5 days ago

Thanks. I will add your link to the blog.

Milk-V uses an export .py script to convert to ONNX; it looks like the same code as your company's.

pderrenger commented 5 days ago

Thank you for sharing your blog post! While we appreciate the mention, please note that YOLOv5's ONNX export functionality is open-source under the AGPL-3.0 license, as documented in our model export guide. We're glad you found the TPU deployment process helpful.

ultralytics / yolov5

yolov5 on milk-v tpu 256 #13411
