Closed THuffam closed 2 years ago
I noticed that raccoon_detector.py says: # needs firmware from my fork with yolov3 support, see # https://github.com/sipeed/MaixPy/pull/451
How do I use that firmware? And how do I know if I'm using YOLOv2 or YOLOv3?
Thanks
1871 KB should be fine. But yes, the model you trained is YOLOv3 - if you used the master branch of aXeleRate, which you probably did. To train a model that can run on current, unmodified MaixPy you need to use the legacy branch. Alternatively, try the binary and example script from here: https://drive.google.com/file/d/1q1BcWA8GiTQ_3Q9vYkSysRvGD62K2zh4/view?usp=sharing - that one supports YOLOv3.
Thank you very much for your response Dmitry.
I tried the binary you supplied but also got the out-of-memory error (E (68325830111) SYSCALL: Out of memory) when loading my model (it did load your model OK - which is 300 KB smaller than mine).
Where can I get the legacy branch - and would I need to build it myself? Or is there a way to convert the model to kmodel V3 so I can use the minimum (latest) firmware (which throws the following error if I try to load my model: ValueError: [MAIXPY]kpu: load error:2002, ERR_KMODEL_VERSION: only support kmodel V3)?
I am looking at resizing the GC as per this article - will this affect it at run time?
Also, in the example you sent (person_detector_yolo3.py) there are the following 2 lines:
a = kpu.set_outputs(task, 0, 10, 8, 21) #the actual shape needs to match the last layer shape of your model(before Reshape)
a = kpu.set_outputs(task, 1, 20, 16, 21) #the actual shape needs to match the last layer shape of your model(before Reshape)
Where do I get these values? Is this something I can see when running axelerate in colab?
Many thanks again.
Hi! I added an FAQ section to the main readme: https://github.com/AIWintermuteAI/aXeleRate#question-faq - that should answer your question about the legacy branch.
or is there a way to convert the model to v3 so I can use the mimimum (latest) version (which throws the following error if I try to load my model: ValueError: [MAIXPY]kpu: load error:2002, ERR_KMODEL_VERSION: only support kmodel V3).
You can use an earlier version of nncase and manually convert the trained .tflite model from the project folder. Not sure what pitfalls there will be, as I haven't used earlier versions of nncase for a while.
I am looking at resizing the GC as per this article - will this affect it at run time?
That can help you to get slightly more available RAM.
Where do I get these values? Is this something I can see when running axelerate in colab?
Yes, when you start training you can see the model summary. Check closed issues, there was a question about that.
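For context, those numbers follow from the model geometry: each YOLOv3 output head covers a grid of input/stride cells (typically strides 32 and 16 for a two-head model), with 3 anchors per cell and 5 + num_classes values per anchor. A quick sketch - the strides, the 320×256 input size, and the 2-class count here are assumptions chosen to match the example script's numbers, not values confirmed in the thread; the authoritative shapes come from the model summary printed during training:

```python
# Illustrative sketch: each YOLOv3 head predicts on a grid of input/stride
# cells, with 3 anchors per cell and (5 + num_classes) values per anchor
# (x, y, w, h, objectness, class scores).
def head_shape(input_w, input_h, stride, num_classes, anchors_per_cell=3):
    grid_w = input_w // stride
    grid_h = input_h // stride
    channels = anchors_per_cell * (5 + num_classes)
    return grid_w, grid_h, channels

# For a 320x256 input and 2 classes, the two heads (strides 32 and 16):
print(head_shape(320, 256, 32, 2))  # (10, 8, 21)
print(head_shape(320, 256, 16, 2))  # (20, 16, 21)
```

These match the 10, 8, 21 and 20, 16, 21 arguments in the example's kpu.set_outputs calls; a single-class model gives 18 channels instead of 21.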
Hope that was helpful. If everything works out for you, consider buying a beer/coffee for the project :) https://www.buymeacoffee.com/hardwareai
Thanks for this. Still struggling, but I have been watching memory usage and noticed your kmodel file is only 874 KB whereas mine is 1871 KB... how did you get yours so small? Is there something I can do to make it smaller (I assume while training the model)?
I had to reduce the GC heap to 256 KB to be able to load the kmodel file, but then I get an out-of-heap-memory error when trying to load an image (from the SD card). It also runs out of memory when trying to use the camera.
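The K210 has roughly 6 MB of general-purpose SRAM (another 2 MB is reserved for the KPU), and the firmware, MicroPython GC heap, model, and frame buffers all compete for it. A back-of-the-envelope budget - the firmware footprint here is a rough assumption, not a measured value:

```python
# All sizes in KB. Illustrative assumptions, not measured values.
total_sram   = 6 * 1024               # K210 general-purpose SRAM
firmware     = 2 * 1024               # rough footprint of a full MaixPy build
gc_heap      = 256                    # the reduced GC heap from this thread
model        = 1871                   # the original kmodel
frame_rgb565 = 320 * 240 * 2 // 1024  # QVGA RGB565 snapshot buffer
frame_ai     = 320 * 240 * 3 // 1024  # RGB888 buffer filled by pix_to_ai()

free = total_sram - firmware - gc_heap - model - frame_rgb565 - frame_ai
print(free)  # what is nominally left; kpu.load needs working buffers on top
```

Under these assumptions, shrinking the model (e.g. via a smaller width multiplier) frees far more room than trimming the GC heap does.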
Thanks again for your assistance.
I think I'm just using a smaller feature extractor: MobileNet5_0 (this corresponds to MobileNet with alpha = 0.5).
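The alpha width multiplier scales every layer's channel count, so the parameter count of the pointwise (1×1) convolutions that dominate MobileNet shrinks roughly with alpha². A toy illustration (my own sketch, not aXeleRate code; the kmodel on disk shrinks by less than this because the detection head and quantization overhead don't scale the same way):

```python
# Parameter count of a single pointwise (1x1) convolution after applying
# MobileNet's width multiplier alpha to both input and output channels.
def pointwise_params(c_in, c_out, alpha):
    return round(alpha * c_in) * round(alpha * c_out)

full = pointwise_params(256, 512, 1.0)   # alpha = 1.0 baseline
half = pointwise_params(256, 512, 0.5)   # alpha = 0.5 (MobileNet5_0)
print(half / full)  # 0.25 - the layer shrinks to a quarter of its size
```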
I have used MobileNet5_0, which generates a much smaller kmodel (870 KB) - thanks for that tip. But testing it is proving difficult... When I use the binary you provided (in your mask detector example - maixpy.bin) and plug in the MaixPy device, the screen shows "SDcard not mount, use flash". So I cannot try the static image example (person_detector_yolov3.py). When I try the camera example (person_detector_yolov3_cam.py), I get the following error:
code = kpu.run_yolo3(task, img)
[MAIXPY]kpu: img w=320,h=240, but model w=320,h=224
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: [MAIXPY]kpu: check img format err!
Any suggestions?
Yes. Exactly as the error message says: your model has an input size of 224×320, but you are feeding it an image from the camera, which is 240×320. You need to either resize the camera image before feeding it to the NN or train the model to take a 240×320 image as input.
Oh, sorry I must have missed that. Have retrained and it now runs without throwing an exception!!
Now trying to get it to recognise the target objects is an issue.
When run using your demo code (person_detector_yolov3_cam.py), updated for my model, it throws a lot of false positives and does not seem to recognise my target (a shark). So, as your code suggests on the following line:
a = kpu.init_yolo3(task, 0.3, 0.3, 3, 2, anchor) #tweak the second parameter if you're getting too many false positives
I updated it to:
a = kpu.init_yolo3(task, 0.92, 0.3, 3, 2, anchor) #tweak the second parameter if you're getting too many false positives
This yields the fewest false positives, but when it does show a result it is off - drawing the rectangle either above or below the object. Is there a reason for this, or is it more likely that these are just false positives and it hasn't recognised the shark?
Will attach images..
Okay, so a 0.3 confidence threshold is way too low for a YOLOv3 model, that is correct. Now, about the bounding boxes in the wrong place - that could be (a) wrong anchors or (b) an upside-down/mirrored image from the camera, or something along those lines. I recommend you run inference on a static image with person_detector_yolov3.py and see if the result is right. As a side note, the camera inference script was tested with the Maix Go camera - the parameters (rotation, vflip) in the script are for that board.
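If the boxes look right on the computer but wrong on the device, the decoder or anchor order is a likely suspect. For reference, a minimal sketch of standard YOLOv3 box decoding - my own illustration, not the MaixPy or aXeleRate implementation; it assumes anchors are stored normalized to the input size, as the sub-1.0 values in this thread suggest:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, cell_x, cell_y, grid_w, grid_h, anchor_w, anchor_h):
    # Standard YOLOv3 decoding: the x/y offsets are squashed into the cell,
    # w/h exponentially scale the anchor. Outputs are normalized to [0, 1].
    bx = (cell_x + sigmoid(tx)) / grid_w
    by = (cell_y + sigmoid(ty)) / grid_h
    bw = anchor_w * math.exp(tw)
    bh = anchor_h * math.exp(th)
    return bx, by, bw, bh

# A zero prediction in cell (5, 4) of a 10x8 grid lands on the cell center
# with the anchor's own width/height:
print(decode_box(0.0, 0.0, 0.0, 0.0, 5, 4, 10, 8, 0.47163042, 0.34163313))
```

If the anchors are fed to the decoder in the wrong order, or in cell units instead of normalized units, the boxes shift or scale exactly in the "near the object but not around it" way described above.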
Thanks again.
I found that rotating the camera 90 degrees improved things - the boxes were closer to the object, but still not surrounding it.
Same with static images - see below.
What do you recommend I do for changing the anchor parameters - I have just used the ones in your detector colab example (which are the same in your maixpy mask example)?
Okay. If the problem persists with static images, there must be something wrong with how it decodes the boxes. If you run inference on the computer on the same images, how do the boxes look?
Hi, I'm training a bike recognition model right now. I collected and trained on more than 5,000 pictures of bikes and downloaded the trained model to the K210 flash. It runs well and identifies various bikes quite accurately, but there are some misidentifications. I have tried many training runs adjusting the coord_scale, object_scale, and no_object_scale parameters, but it still misidentifies easily. I don't quite understand how to tune these parameters - I just adjust them repeatedly and observe the results. I would like to know how they should be adjusted: which parameters under which circumstances, and by how much. Please help me, thank you!
@chaojunchi please create a new issue. Your question has no relation to this issue.
The inference run on Colab (after training) yielded correct results:
Have not tried it on my PC (don't have a linux environment).
Yes, it does look correct in Colab. Can you paste the code you're running on device?
I have based both static and camera programs on your mask detector example:
Camera version (based on person_detector_yolov3_cam.py):
import sensor,image,lcd
import KPU as kpu
lcd.init()
lcd.rotation(2)
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_vflip(1)
sensor.run(1)
classes = ["shark"]
task = kpu.load(0x300000) #change to "/sd/name_of_the_model_file.kmodel" if loading from SD card
a = kpu.set_outputs(task, 0, 8, 10, 18) #the actual shape needs to match the last layer shape of your model(before Reshape)
a = kpu.set_outputs(task, 1, 16, 20, 18) #the actual shape needs to match the last layer shape of your model(before Reshape)
anchor = (0.76120044, 0.57155991, 0.6923348, 0.88535553, 0.47163042, 0.34163313,
0.33340788, 0.70065861, 0.18124964, 0.38986752, 0.08497349, 0.1527057)
a = kpu.init_yolo3(task, 0.94, 0.3, 3, 6, anchor) #tweak the second parameter if you're getting too many false positives
while(True):
    img = sensor.snapshot()
    code = kpu.run_yolo3(task, img)
    if code:
        for i in code:
            a = img.draw_rectangle(i.rect(), color=(0, 255, 0))
            a = img.draw_string(i.x(), i.y() - 22, classes[i.classid()], color=(255, 0, 0), scale=1.5)
        a = lcd.display(img)
    else:
        a = lcd.display(img)
a = kpu.deinit(task)
Static image file from SD (based on person_detector_yolov3.py):
import sensor,image,lcd
import KPU as kpu
lcd.init()
lcd.rotation(2)
classes = ["shark"]
task = kpu.load(0x300000) #change to "/sd/name_of_the_model_file.kmodel" if loading from SD card
a = kpu.set_outputs(task, 0, 8, 10, 18) #the actual shape needs to match the last layer shape of your model(before Reshape)
a = kpu.set_outputs(task, 1, 16, 20, 18) #the actual shape needs to match the last layer shape of your model(before Reshape)
# anchor for yolo3:
anchor = (0.76120044, 0.57155991, 0.6923348, 0.88535553, 0.47163042, 0.34163313,
0.33340788, 0.70065861, 0.18124964, 0.38986752, 0.08497349, 0.1527057)
a = kpu.init_yolo3(task, 0.94, 0.3, 3, 2, anchor) #tweak the second parameter if you're getting too many false positives
pic_path = '/sd/img3_baseline_90.jpg'
img = image.Image(pic_path)
a = img.pix_to_ai()
code = kpu.run_yolo3(task, img)
print("Code:", code)
if code:
    for i in code:
        print(i)
        a = img.draw_rectangle(i.rect(), color=(0, 255, 0))
        a = img.draw_string(i.x(), i.y() - 22, classes[i.classid()], color=(255, 0, 0), scale=1.5)
    a = lcd.display(img)
else:
    a = lcd.display(img)
a = kpu.deinit(task)
Okay, so in that case there are a few things to keep in mind, since you're using a different board: 1) Parameters sensor.set_vflip(1) and lcd.rotation(2) are specific to the board I used for tests (Maix Go). 2) The anchors seem to be fine.
I'm really busy this week with work and studying, but on the weekend or early next week, I'll contact you for the follow-up. Will you be able to provide the model and training config? For me to test here if anything is wrong with YOLOv3 decoder.
Hi, I have just tried the various set_vflip and rotation parameters - with no change in the result. Actually, the resulting rectangle moves around as I move the camera, so it is not consistently offset in any particular direction or distance. E.g. one moment it's very close on the left of the target, and when I move the camera slightly, the rectangle moves to the right of or above the target.
I have been basing this on your aXeleRate_pascal20_detector colab example...
Here is my config:
config = {
    "model": {
        "type": "Detector",
        "architecture": "MobileNet5_0",
        "input_size": [240, 320],
        "anchors": [[[0.76120044, 0.57155991], [0.6923348, 0.88535553], [0.47163042, 0.34163313]],
                    [[0.33340788, 0.70065861], [0.18124964, 0.38986752], [0.08497349, 0.1527057]]],
        "labels": ['shark'],
        "obj_thresh": 0.7,
        "iou_thresh": 0.5,
        "coord_scale": 1.0,
        "object_scale": 3.0,
        "no_object_scale": 1.0
    },
    "weights": {
        "full": "",
        "backend": "imagenet"
    },
    "train": {
        "actual_epoch": 50,
        "train_image_folder": "/content/training_imgs",
        "train_annot_folder": "/content/training_anns",
        "train_times": 1,
        "valid_image_folder": "/content/validation_imgs",
        "valid_annot_folder": "/content/validation_anns",
        "valid_times": 1,
        "valid_metric": "recall",
        "batch_size": 32,
        "learning_rate": 1e-3,
        "saved_folder": F"/content/drive/MyDrive/SharkSpotter/mobilenet50_yolov3",
        "first_trainable_layer": "",
        "augmentation": True,
        "is_only_detect": False
    },
    "converter": {
        "type": []
    }
}
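As a side note on how this config maps onto the MaixPy scripts: the nested "anchors" list (2 scales × 3 anchors × width/height) appears to be just the flat 12-value tuple passed to kpu.init_yolo3, read left to right. A quick check in plain Python (my own sketch):

```python
# Nested anchors as in the training config: 2 scales x 3 anchors x (w, h).
anchors_cfg = [[[0.76120044, 0.57155991], [0.6923348, 0.88535553], [0.47163042, 0.34163313]],
               [[0.33340788, 0.70065861], [0.18124964, 0.38986752], [0.08497349, 0.1527057]]]

# Flatten to the 12-value tuple used with kpu.init_yolo3 in the device scripts.
anchor = tuple(v for scale in anchors_cfg for pair in scale for v in pair)
print(anchor)
```

The result matches the anchor tuples in the device scripts in this thread, so a mismatch here can be ruled out.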
For training:
%matplotlib inline
from keras import backend as K
K.clear_session()
setup_inference(config, model_path)
For conversion:
from axelerate.networks.common_utils.convert import Converter
converter = Converter('k210', 'MobileNet5_0', '/content/validation_imgs')
converter.convert_model(model_path)
Any suggestions?
Would you like me to post the kmodel file?
Thanks for all your assistance, Dmitry.
Yes, please. Ideally the whole project folder, with tflite model and some validation samples. If it is not convenient to attach file here, email me at dmitrywat@gmail.com.
@THuffam could you also share some validation samples (without bounding boxes)?
I cannot reproduce the problem with the following code running on Maix Go
#tested with firmware 5-0.22
import sensor,image,lcd
import KPU as kpu
lcd.init()
lcd.rotation(2)
classes = ["shark"]
task = kpu.load('/sd/shark5_0_320x240.kmodel') #change to "/sd/name_of_the_model_file.kmodel" if loading from SD card
a = kpu.set_outputs(task, 0, 10, 8, 18) #the actual shape needs to match the last layer shape of your model(before Reshape)
a = kpu.set_outputs(task, 1, 20, 16, 18) #the actual shape needs to match the last layer shape of your model(before Reshape)
anchor = (0.76120044, 0.57155991, 0.6923348, 0.88535553, 0.47163042, 0.34163313,
0.33340788, 0.70065861, 0.18124964, 0.38986752, 0.08497349, 0.1527057)
a = kpu.init_yolo3(task, 0.5, 0.3, 3, 2, anchor) #tweak the second parameter if you're getting too many false positives
pic_path = 'shark1.jpg'
#pic_path = '0638.jpg'
#pic_path = '0414.jpg'
img = image.Image(pic_path)
#img = img.resize(320, 240)
a = img.pix_to_ai()
code = kpu.run_yolo3(task, img)
if code:
    for i in code:
        print(i)
        a = img.draw_rectangle(i.rect(), color=(0, 255, 0))
        a = img.draw_string(i.x(), i.y() - 22, classes[i.classid()], color=(255, 0, 0), scale=1.5)
    a = lcd.display(img)
else:
    a = lcd.display(img)
a = kpu.deinit(task)
and for live camera detection
import sensor,image,lcd
import KPU as kpu
lcd.init()
lcd.rotation(2)
sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.set_vflip(1)
sensor.run(1)
classes = ["shark"]
task = kpu.load('/sd/shark5_0_320x240.kmodel') #change to "/sd/name_of_the_model_file.kmodel" if loading from SD card
a = kpu.set_outputs(task, 0, 10, 8, 18) #the actual shape needs to match the last layer shape of your model(before Reshape)
a = kpu.set_outputs(task, 1, 20, 16, 18) #the actual shape needs to match the last layer shape of your model(before Reshape)
anchor = (0.76120044, 0.57155991, 0.6923348, 0.88535553, 0.47163042, 0.34163313,
0.33340788, 0.70065861, 0.18124964, 0.38986752, 0.08497349, 0.1527057)
a = kpu.init_yolo3(task, 0.9, 0.3, 3, 2, anchor) #tweak the second parameter if you're getting too many false positives
while(True):
    img = sensor.snapshot()
    code = kpu.run_yolo3(task, img)
    if code:
        for i in code:
            a = img.draw_rectangle(i.rect(), color=(0, 255, 0))
            a = img.draw_string(i.x(), i.y() - 22, classes[i.classid()], color=(255, 0, 0), scale=1.5)
        a = lcd.display(img)
    else:
        a = lcd.display(img)
a = kpu.deinit(task)
The bounding box is positioned nicely around the objects, both for static image and live inference. See below
Live inference video:
https://drive.google.com/file/d/19huc9efb3CyrZIhekZ5-rDZkz_qYGlIQ/view?usp=sharing
Since the original problem in title was resolved, I'll close this issue. Feel free to try making a reproducible example and open a new issue in that case :)
Agreed. Thanks, Dmitry, for all your help. Maybe the bounding box issue should be logged with Sipeed, who make the MaixPy boards I'm using.
Can you recommend any other small boards as a replacement for these (it needs to be as small & light as possible as it is to be used in a drone)?
Thanks again Tim
Hi. When loading a kmodel file - either from SD or flash - the device (MaixPy Bit, new version with mic) crashes (disconnects from the IDE and the LCD goes blank). This is at the line:
task = kpu.load(0x300000)
The kmodel was generated using axelerate running on Google Colab. This was using firmware: maixpy_v0.6.2_78_g469c985e6_openmv_kmodel_v4_with_ide_support.bin
When I try without the IDE (firmware: maixpy_v0.6.2_78_g469c985e6_minimum_with_kmodel_v4_support.bin) , using the VS Code console and run the code line by line it shows the error (on kpu.load): E (77045384089) SYSCALL: Out of memory
I am using the code from raccoon_detector.py and using my kmodel file.
My kmodel file is 1871 KB in size. How can I work around this?