sipeed / maix_train

k210(MaixPy)/V831 model example train code, include mobilenet classifier and YOLO V2 detector
https://wiki.sipeed.com/maixpy
Apache License 2.0

Convert to AWNN online #16

Open diazGT94 opened 2 years ago

diazGT94 commented 2 years ago

I trained a custom YOLO V2 detector with 2 classes using the tools provided in this repository, but I had to change the input image size to 416x416 to get better results for one of the classes during training. I then used the export.py script to convert my model to ONNX, and according to the logs the conversion succeeded. The same script also converted the ONNX file to NCNN, which gave me two files (.bin and .param). For some reason, when I uploaded my files to the AWNN online converter tool, I got the following error. Is the error caused by the input size of the image?

image

Here is what my .param file looks like, along with the structure of the folder I'm trying to upload to the converter.

image


Input            input0                   0 1 input0 0=416 1=416 2=3
Convolution      Conv_0                   1 1 input0 148 0=32 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=864
ReLU             LeakyRelu_1              1 1 148 110 0=1.000000e-01
Convolution      Conv_2                   1 1 110 151 0=32 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=9216
ReLU             LeakyRelu_3              1 1 151 113 0=1.000000e-01
Convolution      Conv_4                   1 1 113 154 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=18432
ReLU             LeakyRelu_5              1 1 154 116 0=1.000000e-01
Convolution      Conv_6                   1 1 116 157 0=64 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=36864
ReLU             LeakyRelu_7              1 1 157 119 0=1.000000e-01
Convolution      Conv_8                   1 1 119 160 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=73728
ReLU             LeakyRelu_9              1 1 160 122 0=1.000000e-01
Convolution      Conv_10                  1 1 122 163 0=128 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=147456
ReLU             LeakyRelu_11             1 1 163 125 0=1.000000e-01
Convolution      Conv_12                  1 1 125 166 0=256 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=294912
ReLU             LeakyRelu_13             1 1 166 128 0=1.000000e-01
Convolution      Conv_14                  1 1 128 169 0=256 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=589824
ReLU             LeakyRelu_15             1 1 169 131 0=1.000000e-01
Convolution      Conv_16                  1 1 131 172 0=512 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1179648
ReLU             LeakyRelu_17             1 1 172 134 0=1.000000e-01    
Convolution      Conv_18                  1 1 134 175 0=512 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=2359296
ReLU             LeakyRelu_19             1 1 175 137 0=1.000000e-01
Convolution      Conv_20                  1 1 137 178 0=512 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2359296
ReLU             LeakyRelu_21             1 1 178 140 0=1.000000e-01
Convolution      Conv_22                  1 1 140 181 0=512 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2359296
ReLU             LeakyRelu_23             1 1 181 143 0=1.000000e-01
Convolution      Conv_24                  1 1 143 184 0=512 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2359296
ReLU             LeakyRelu_25             1 1 184 146 0=1.000000e-01
Convolution      Conv_26                  1 1 146 output0 0=35 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=17920
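
A quick way to sanity-check a listing like the one above is to read the input size, the total stride, and the channel count of the last convolution straight from the .param text. The sketch below is only an illustration: `parse_layer` and `summarize` are hypothetical helper names, it relies on the ncnn convention that `0`/`1` on the Input layer are width/height and that `0` and `3` on a Convolution are the output channel count and the horizontal stride, and it assumes the usual YOLO V2 head layout of `anchors * (5 + classes)` channels with 5 anchors (an assumption, not something stated in this thread). For the 416x416 listing, the five stride-2 convolutions give a total stride of 32, so a 13x13 grid, and 35 output channels matches 5 * (5 + 2) for 2 classes. It does not tell you why the online converter rejected the upload.

```python
def parse_layer(line):
    """Split one ncnn .param layer line into (type, name, params)."""
    parts = line.split()
    n_in, n_out = int(parts[2]), int(parts[3])
    params = {}
    # Everything after the input/output blob names is a key=value pair.
    for token in parts[4 + n_in + n_out:]:
        key, _, value = token.partition("=")
        params[int(key)] = value
    return parts[0], parts[1], params


def summarize(param_text, num_classes, num_anchors=5):
    """Report input size, total stride, grid size and head channels.

    num_anchors=5 is an assumed default, not taken from the thread.
    """
    input_wh, stride, head = None, 1, None
    for line in param_text.splitlines():
        parts = line.split()
        # Skip blank lines and the two header lines (magic number, counts).
        if len(parts) < 4:
            continue
        kind, _, params = parse_layer(line)
        if kind == "Input":
            input_wh = (int(params[0]), int(params[1]))  # 0 = w, 1 = h
        elif kind == "Convolution":
            stride *= int(params.get(3, 1))  # 3 = stride_w
            head = int(params[0])            # 0 = num_output of latest conv
    return {
        "input": input_wh,
        "stride": stride,
        "grid": (input_wh[0] // stride, input_wh[1] // stride),
        "head_channels": head,
        "expected_channels": num_anchors * (5 + num_classes),
    }


sample = """
Input            input0     0 1 input0 0=416 1=416 2=3
Convolution      Conv_2     1 1 input0 151 0=32 1=3 3=2 13=2 6=9216
Convolution      Conv_26    1 1 151 output0 0=35 1=1 3=1 13=1 6=17920
"""
info = summarize(sample, num_classes=2)
print(info)
```

Running `summarize` on the full file instead of the shortened `sample` should report the complete stride and grid; if `head_channels` and `expected_channels` differ, the class count or anchor count used for training and export may not match.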

JuanDavidBarrero commented 2 years ago

Could you help me? I don't know what to put in ImageSets/Main/train.txt and ImageSets/Main/val.txt. I left them empty, but this error pops up:

Loading the pretrained model ...
Loading the darknet_tiny ...
2021-12-01 20:10:26.889082: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-12-01 20:10:27 - [DEBUG] - [MainProcess - MainThread] Falling back to TensorFlow client; we recommended you install the Cloud TPU client directly with pip install cloud-tpu-client.
2021-12-01 20:10:28 - [INFO] - [MainProcess - MainThread]  check dataset in train
Traceback (most recent call last):
  File "train.py", line 193, in <module>
    train.load_dataset(F"detectore/datasets/{dataset_name}", load_num_workers = 16)
  File "train.py", line 78, in load_dataset
    transform = SSDAugmentation(size=(self.input_shape[2], self.input_shape[1]), mean=(0.5, 0.5, 0.5), std=(128/255.0, 128/255.0, 128/255.0))
  File "/content/maix_train/pytorch/detector/dataset.py", line 104, in __init__
    if i % int(len(lines) * 0.05) == 0:
ZeroDivisionError: integer division or modulo by zero

diazGT94 commented 2 years ago

@JuanDavidBarrero You should put the names of the images that you'll use for training and for validation. For example, I used a dataset of 600 images, each named with a number: 1.jpg, 2.jpg, ..., 600.jpg. Then, with a Python script, I randomly distributed the numbers into train.txt and val.txt.
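
A minimal sketch of such a split script is shown here. The function name `write_split`, the 80/20 ratio, the fixed seed, and writing one file stem per line without the `.jpg` extension (the Pascal VOC ImageSets convention) are all assumptions for illustration; the thread does not show the script that was actually used. Note also that the traceback above computes `i % int(len(lines) * 0.05)`, and `int(len(lines) * 0.05)` is 0 for an empty list (or any list with fewer than 20 lines), which would explain the ZeroDivisionError when train.txt is left empty.

```python
import random
from pathlib import Path


def write_split(image_dir, out_dir, val_fraction=0.2, seed=0):
    """Randomly split image names into train.txt and val.txt.

    val_fraction and seed are illustrative defaults, not values
    taken from the thread.
    """
    # Collect names without extension, e.g. "1", "2", ..., "600".
    stems = sorted(p.stem for p in Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(stems)

    n_val = int(len(stems) * val_fraction)
    val, train = stems[:n_val], stems[n_val:]

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "train.txt").write_text("\n".join(train) + "\n")
    (out / "val.txt").write_text("\n".join(val) + "\n")
    return train, val
```

It could be called as `write_split("JPEGImages", "ImageSets/Main")`, assuming the images live in a `JPEGImages` folder next to `ImageSets`. Given the modulo in the traceback, keeping at least 20 names in each list seems the safer choice.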

JuanDavidBarrero commented 2 years ago

Thanks, it works !

diazGT94 commented 2 years ago

@JuanDavidBarrero were you able to convert your trained model using the online converter tool?

JuanDavidBarrero commented 2 years ago

I tried, but in the end I got these results.

yolo yolo2

I don't know why this always throws an error. Also, when I run the YOLO example they provide with the number model, the maix dock II restarts.

Have you been able to execute it?

diazGT94 commented 2 years ago

@JuanDavidBarrero No, because I can't even upload my .zip folder to the converter. I emailed the Sipeed team a couple of days ago about this issue, and they told me they are working on an offline converter tool. May I ask what the parameters of your model are? What is the structure of the zip folder you're trying to submit? Which PyTorch version did you use to train your model?

JuanDavidBarrero commented 2 years ago

@diazGT94 This is the structure of the zip file that I uploaded. In the images folder I put only some of the validation images, not all of them; otherwise the zip gets too heavy to upload.

cards

This is the .param file. I worked with 224x224 images.

7767517
28 28
Input            input0                   0 1 input0 0=224 1=224 =2=3
Convolution      Conv_0                   1 1 input0 148 0=32 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=864
ReLU             LeakyRelu_1              1 1 148 110 0=1.000000e-01
Convolution      Conv_2                   1 1 110 151 0=32 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=9216
ReLU             LeakyRelu_3              1 1 151 113 0=1.000000e-01
Convolution      Conv_4                   1 1 113 154 0=64 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=18432
ReLU             LeakyRelu_5              1 1 154 116 0=1.000000e-01
Convolution      Conv_6                   1 1 116 157 0=64 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=36864
ReLU             LeakyRelu_7              1 1 157 119 0=1.000000e-01
Convolution      Conv_8                   1 1 119 160 0=128 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=73728
ReLU             LeakyRelu_9              1 1 160 122 0=1.000000e-01
Convolution      Conv_10                  1 1 122 163 0=128 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=147456
ReLU             LeakyRelu_11             1 1 163 125 0=1.000000e-01
Convolution      Conv_12                  1 1 125 166 0=256 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=294912
ReLU             LeakyRelu_13             1 1 166 128 0=1.000000e-01
Convolution      Conv_14                  1 1 128 169 0=256 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=589824
ReLU             LeakyRelu_15             1 1 169 131 0=1.000000e-01
Convolution      Conv_16                  1 1 131 172 0=512 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=1179648
ReLU             LeakyRelu_17             1 1 172 134 0=1.000000e-01
Convolution      Conv_18                  1 1 134 175 0=512 1=3 11=3 2=1 12=1 3=2 13=2 4=1 14=1 15=1 16=1 5=1 6=2359296
ReLU             LeakyRelu_19             1 1 175 137 0=1.000000e-01
Convolution      Conv_20                  1 1 137 178 0=512 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2359296
ReLU             LeakyRelu_21             1 1 178 140 0=1.000000e-01
Convolution      Conv_22                  1 1 140 181 0=512 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2359296
ReLU             LeakyRelu_23             1 1 181 143 0=1.000000e-01
Convolution      Conv_24                  1 1 143 184 0=512 1=3 11=3 2=1 12=1 3=1 13=1 4=1 14=1 15=1 16=1 5=1 6=2359296
ReLU             LeakyRelu_25             1 1 184 146 0=1.000000e-01
Convolution      Conv_26                  1 1 146 output0 0=30 1=1 11=1 2=1 12=1 3=1 13=1 4=0 14=0 15=0 16=0 5=1 6=15360
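
Before uploading, a listing like this can be checked for structural problems with a few lines of Python. The sketch below uses a hypothetical helper, `check_param`, and relies only on the ncnn text layout visible above: the magic number `7767517` on the first line, layer and blob counts on the second, and `key=value` tokens after the blob names on each layer line. It would flag a token such as `=2=3` on the Input line. Whether that token is in the real file or is just a typo in the pasted text cannot be told from the thread, and nothing here confirms that it is what the converter complains about.

```python
NCNN_MAGIC = "7767517"


def check_param(text):
    """Return a list of structural problems found in ncnn .param text."""
    problems = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2 or lines[0].strip() != NCNN_MAGIC:
        return ["missing magic number or count line at the top"]

    counts = lines[1].split()
    layer_lines = lines[2:]
    if len(counts) != 2 or not all(c.isdigit() for c in counts):
        problems.append("second line should hold layer and blob counts")
    elif int(counts[0]) != len(layer_lines):
        problems.append(
            f"header says {counts[0]} layers, found {len(layer_lines)}"
        )

    for ln in layer_lines:
        parts = ln.split()
        if len(parts) < 4 or not (parts[2].isdigit() and parts[3].isdigit()):
            problems.append(f"unreadable layer line: {ln.strip()!r}")
            continue
        n_in, n_out = int(parts[2]), int(parts[3])
        # Tokens after the blob names must look like <int>=<value>.
        for token in parts[4 + n_in + n_out:]:
            key, sep, value = token.partition("=")
            ok = sep and key.lstrip("-").isdigit() and value and "=" not in value
            if not ok:
                problems.append(f"{parts[1]}: malformed parameter {token!r}")
    return problems


sample = (
    "7767517\n"
    "2 2\n"
    "Input            input0   0 1 input0 0=224 1=224 =2=3\n"
    "Convolution      Conv_0   1 1 input0 148 0=32 1=3 6=864\n"
)
problems = check_param(sample)
print(problems)
```

An empty list means only that the text follows this layout; it says nothing about whether the converter supports the layers or the input size.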

Finally, this is the version of PyTorch that I used:

>>> import torch
>>> print(torch.__version__)
1.10.0+cu111