AIWintermuteAI / aXeleRate

Keras-based framework for AI on the Edge
MIT License

Can't load model trained with aXeleRate in Maix Go #5

Closed enrique-torres closed 4 years ago

enrique-torres commented 4 years ago

Hello, I've tried training multiple models with your software, but whenever I try to load them with MicroPython on my Maix Go, the program crashes while loading the model and gives no information as to why. I tried loading the pretrained 20-class model and it works, so I know there's no problem with the board or the SD card I'm loading the model from. I also considered that the model might be too big, but the pretrained model is 1MB in size and mine was 900kB. I've tried training with MobileNet7_5 and Tiny Yolo (although the latter yields a bigger file of 2.3MB), and I haven't been able to load either of them. I'm wondering if I'm doing or configuring something wrong, but I'm stuck and can't find much info on training a model for this device apart from your repo.

Thanks in advance.

ikovnatsky commented 4 years ago

Same issue here. Maybe it's the way we are running nncase; there's not much documentation! On the other hand, if I use nncase to predict, the output is good.

AIWintermuteAI commented 4 years ago

@ikovnatsky @enrique-torres What firmware version are you using? Compiled from source or downloaded from dl.sipeed?

ikovnatsky commented 4 years ago

Tried both, same issue. If you can send me a sample .kmodel, I can try it to isolate whether the issue is in my model or in my firmware. Thanks, Ilya

enrique-torres commented 4 years ago

Downloaded from Sipeed, I think it's version 0.5.0_32 with IDE support

AIWintermuteAI commented 4 years ago

@enrique-torres @ikovnatsky I did some tests just now. Previously I had been testing my models with firmware version 5-0.22, as indicated in the example files (https://github.com/AIWintermuteAI/aXeleRate/blob/85bf41aba310c62f7e76c53076c498bda3a75416/example_scripts/k210/detector/person_detector_v4.py#L1). When I compiled the latest firmware from Sipeed (https://github.com/sipeed/MaixPy/commit/06e16a3ea47871f711f895b1b1c8f175397d0fec), I was able to reproduce the problem you mentioned (crash on kpu.load). However, when I tried earlier firmware versions (up to 5-0.29, commit https://github.com/sipeed/MaixPy/commit/97fad3a27b72b5673587fdb5dc7f148ec7dec337), I was able to run .kmodels generated by aXeleRate successfully, both classifier and detector models (MobileNet backends). Can you try running your model with 5-0.29 with IDE support and see if the problem persists? There is no need to compile from source; downloading the pre-compiled binaries from dl.sipeed is okay.
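
For readers comparing their own firmware against the builds mentioned in this thread, here is a minimal sketch in plain Python (not MaixPy code). The helper names and the cutoff constant are hypothetical; the cutoff only encodes what was reported above (builds up to 29 loaded the models) and is not an official compatibility rule.

```python
import re

# Assumption: build 29 is the newest build reported in this thread as able to
# load aXeleRate .kmodels. This is not an official compatibility statement.
LAST_REPORTED_WORKING_BUILD = 29

# Matches only the spellings used in this thread: "0.5.0-29", "0.5.0_32", "5-0.29".
_VERSION_PATTERN = re.compile(r"^(?:0\.5\.0[-_]|5-0\.)(\d+)$")


def build_number(version):
    """Return the build number from a firmware version string."""
    match = _VERSION_PATTERN.match(version.strip())
    if match is None:
        raise ValueError("unrecognised firmware version: %r" % version)
    return int(match.group(1))


def reported_working(version):
    """True if the build is no newer than the last one reported working here."""
    return build_number(version) <= LAST_REPORTED_WORKING_BUILD


if __name__ == "__main__":
    for v in ("5-0.22", "0.5.0-29", "0.5.0_32"):
        print(v, "reported working" if reported_working(v) else "reported crashing")
```

The strict pattern is deliberate: a string in any other format raises an error instead of being silently treated as an old build.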

AIWintermuteAI commented 4 years ago

Here is a sample model I tested this morning, a person detector: 2020-04-27_11-12-19.zip

ikovnatsky commented 4 years ago

That seems to have done the trick. If I can find my OCD cable, I can see why it crashes. Thank you for the help.

AIWintermuteAI commented 4 years ago

No problem. If you connect to the board over serial and copy-paste your code, it will give you a core dump (if you just connect with the IDE and press run, you cannot see the core dump). It might be a good idea to post it on the MaixPy GitHub.

enrique-torres commented 4 years ago

I'll try this then. I was wondering if you could point me in the correct direction for training a model that detects humans, dogs and cats. How could I configure it to train only for those classes?

AIWintermuteAI commented 4 years ago

Please test whether the issue is resolved by running the model with 5-0.29. If that does resolve the problem, remember to reply here and close the issue; this makes it easier for other users to debug if they encounter a similar problem.

For your other question, let's keep it "one problem, one issue"; this is standard etiquette on GitHub :) If you have other problems related to aXeleRate, create a separate issue. Alternatively, you can PM me on LinkedIn or post a comment on YouTube; I have a zero-unanswered-comments policy. Cheers!

ikovnatsky commented 4 years ago

5-0.29 works for me

enrique-torres commented 4 years ago

Neither 0.5.0-22 nor 0.5.0-29 works for me. It might be some issue with how I'm training, so I'm going to close this issue and open a new one.

AIWintermuteAI commented 4 years ago

Have you tried the model I posted above? Did that one work normally with 0.5.0-29, or did it also crash? If the model above crashed for you with 0.5.0-29, that might indicate a) a hardware problem or b) a problem with flashing the firmware or loading the model.

enrique-torres commented 4 years ago

Yes, I did try that model and it worked; sorry for not mentioning it in the post. I don't quite understand why my models don't load but that one does. That's why I opened the other issue, because it might have something to do with the config when I train with my dataset.

AIWintermuteAI commented 4 years ago

Hm... Normally it shouldn't. Are you willing to share the dataset/config? Perhaps you can share the Colab notebook via direct message. That would offer a way to see the source of your problem. This is, of course, totally optional.

ikovnatsky commented 4 years ago

Check your config file to make sure that it's set for K210 and not tflite.
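
A quick way to perform that check from a script is sketched below. It assumes the training config is a JSON file with a "converter" section whose "type" entry is either a string or a list of strings such as "k210" or "tflite"; verify the key names against your own config file, since they are an assumption here and the helper names are hypothetical.

```python
import json


def converter_targets(config):
    """Return the lower-cased converter targets listed in a config dict.

    Assumption: targets live under config["converter"]["type"], either as a
    single string or as a list of strings.
    """
    targets = config.get("converter", {}).get("type") or []
    if isinstance(targets, str):
        targets = [targets]
    return [str(t).lower() for t in targets]


def targets_k210(config):
    """True if the config asks for a K210 (.kmodel) conversion."""
    return "k210" in converter_targets(config)


if __name__ == "__main__":
    # Inline example; for a real file use json.load(open(path)) instead.
    example = json.loads('{"converter": {"type": ["tflite"]}}')
    if not targets_k210(example):
        print("Config does not request a K210 conversion:", converter_targets(example))
```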

enrique-torres commented 4 years ago

Never mind. Updating aXeleRate to the GitHub version seems to have fixed the issue. I thought I was on the latest version, but I think I executed pip install axelerate instead of pip install git+github link. Now I'm able to run the model, but it doesn't recognize anything.
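
To tell which of the two installs is actually active, something like the following sketch may help. It uses the standard-library importlib.metadata (Python 3.8+) and relies on recent pip versions recording a direct_url.json file for installs made from a URL such as a git repository; older pip versions may not write that file, so a missing URL is not proof of a PyPI install. The function name is hypothetical.

```python
import json
from importlib import metadata


def install_origin(package):
    """Return the installed version and, if recorded, the direct-install URL.

    Returns None when the package is not installed. "direct_url" is None when
    no direct_url.json metadata file exists (typical for a plain PyPI install).
    """
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    raw = dist.read_text("direct_url.json")  # None if the file is absent
    url = json.loads(raw).get("url") if raw else None
    return {"version": dist.version, "direct_url": url}


if __name__ == "__main__":
    # "axelerate" is the distribution name used in the pip command above.
    print(install_origin("axelerate"))
```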

AIWintermuteAI commented 4 years ago

Yes. During the active development period, it is recommended to keep up with the current master branch rather than the version from pip. When I get to 1.0 that will change :) Was the model able to detect the objects after training? You can check with infer.py.

enrique-torres commented 4 years ago

No, it doesn't detect objects with the .h5 weights file. The dataset is quite big, so I don't think it has to do with precision. It might have to do with the config. Plus, I've only been able to train for 1 epoch. I think I should open another issue for this, though.

AIWintermuteAI commented 4 years ago

Oh, okay. One epoch is indeed nowhere near enough. For some datasets, I train for 20 epochs just to see mAP start to rise from 0. We can finally consider this issue closed :)

AIWintermuteAI commented 4 years ago

https://github.com/sipeed/MaixPy/issues/242#issuecomment-633937461