I wound up finding my way initially by studying and extrapolating from this:
It was not the simplest thing to come to terms with, but after a while it made sense-- One thing you need to do, if you haven't yet, is understand the shape of your input and output tensors.
Thanks for the tip. Have you successfully gotten your YOLOv4 model running with this plugin? (I'm assuming yes, just wanted to make sure.) For the time being I have just been using the tflite plugin with YOLOv2, but YOLOv4 gave me a 13% better mAP, so I will definitely dive back into figuring it out. I appreciate the response.
I do have it working. I'll have some time to come back here later this week and try to outline more of it.
Thank you.
@jmtrutna apologies for the lag; it's been a hectic week and will be for the next couple. Planning to dip back in here every few days if you still want help, and I'll offer whatever (little) I can. Sunday is likely when I can give a rough outline--
Meantime, quick question-- are you using the helper library that goes with this library? If not, check it out; any advice I offer here will rely on it. I'm planning on outlining how you have to prep an image before you run predictions on it, how you send your labels and bounding-box format into the interpreter (as outputs) along with the pre-processed image (input), and how you have to go about post-processing the raw data you get back from a YOLO model. You get a ridiculous number of predictions and you have to sift through them. It's that part especially that I wound up adapting from the Java app I linked above--
Here's one thing now:
```dart
TensorImage getProcessedImage(TensorImage inputImage) {
  // Pad to a square using the longer side, resize to the model's
  // expected input size, then normalize the pixel values.
  padSize = max(inputImage.height, inputImage.width);
  imageProcessor ??= ImageProcessorBuilder()
      .add(ResizeWithCropOrPadOp(padSize!, padSize!))
      .add(ResizeOp(INPUT_SIZE.toInt(), INPUT_SIZE.toInt(),
          ResizeMethod.NEAREST_NEIGHBOUR))
      .add(NormalizeOp(1, 255))
      .build();
  return imageProcessor!.process(inputImage);
}
```
So again, you have to prep your image(s) before they can work with the YOLO model. These operations above are called and run by the helper lib, which is super helpful; you could do all of those things yourself, but the library saves you the trouble. The idea, in any case, is to match INPUT_SIZE to the image size your model wants. For me it was 416. You might have gone smaller or larger.
The images also need to be padded; that's the first step. Padding fills out the input image so it's square, using 0s (I believe?) to fill the excess space in the shorter direction. So if your image is, say, 500 wide by 1000 tall, padSize becomes max(500, 1000) = 1000, and you wind up adding 250 pixels of padding on either side of the centered width. The ResizeWithCropOrPadOp does that for you; you just have to supply the padding amount, which is what the padSize = max(...) line computes.
Then you resize it cleanly to the INPUT_SIZE.
Then you normalize the pixel values into a standard range, which should be the same for you: (1, 255), as in the NormalizeOp above.
You'll wind up making an input to your model from that image's buffer. You also have to set up the interpreter so you can pass it to the Classifier along with your labels; again, the labels wind up mapped to outputs, alongside the bounding-box data that the YOLO model will emit.
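To make that concrete, here's a minimal sketch of the wiring, assuming tflite_flutter plus the helper lib. The asset name is a placeholder, and the output buffer shapes are read off the model rather than hard-coded, since yours will differ:

```dart
import 'package:tflite_flutter/tflite_flutter.dart';
import 'package:tflite_flutter_helper/tflite_flutter_helper.dart';

Future<void> runYolo(TensorImage processedImage) async {
  // 'yolov4.tflite' is a placeholder asset name.
  final interpreter = await Interpreter.fromAsset('yolov4.tflite');

  // Allocate one float32 buffer per output tensor, shaped to match
  // whatever the model declares.
  final shapes = interpreter.getOutputTensors().map((t) => t.shape).toList();
  final boxes = TensorBuffer.createFixedSize(shapes[0], TfLiteType.float32);
  final scores = TensorBuffer.createFixedSize(shapes[1], TfLiteType.float32);

  // The processed image is the single input; the buffers come back
  // filled with the raw predictions you then post-process.
  interpreter.runForMultipleInputs(
    [processedImage.buffer],
    {0: boxes.buffer, 1: scores.buffer},
  );
}
```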
So that's a somewhat random place to start.
Realize that you can follow the example app, but what we're doing with YOLO requires a different image prep and a significantly different post-processing stage. YOLO gives messier output than the model used there.
Let me know if this is tedious or beside the point. Otherwise I'll be back--
Thanks for taking the time, I really do appreciate it. I am using the tflite_flutter_helper plugin. I'm curious: did you change any of the code inside the actual plugins themselves (tflite_flutter_plugin or tflite_flutter_helper)? I'm going to try to make some progress on this this week. But just to provide some more context: I'm running the detection on a png image that I crop from the camera preview through a viewfinder. I then resize it to 416 x 416 with the image package (although I guess this is unnecessary, since the processing you demonstrated above does that for me). I don't actually need to display the bounding boxes for the detections, but I do use that information to sort the results from left to right.
In case you find it useful, the properties of my tflite file are below (using https://netron.app/).
And below are the input and output details for when I run my model in python:
```
INPUT DETAILS
[{'name': 'input_1', 'index': 0, 'shape': array([1, 416, 416, 3], dtype=int32), 'shape_signature': array([-1, 416, 416, 3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

OUTPUT DETAILS
[{'name': 'Identity', 'index': 233, 'shape': array([1, 2535, 4], dtype=int32), 'shape_signature': array([1, -1, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}},
 {'name': 'Identity_1', 'index': 212, 'shape': array([1, 2535, 62], dtype=int32), 'shape_signature': array([1, -1, 62], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
```
Hi again. No, I didn't change any code inside the plugins. It was a matter of getting the inputs and outputs formatted correctly and then coming to terms with the post-processing. Each little section of an image winds up generating a prediction, the vast majority of them useless. So you have to sift down, filtering by confidence percentage, then you wind up sifting again, and finally running a non-max suppression (NMS), which you can copy from the internet. The NMS is needed because adjacent sections of an image that are super close to each other will all likely predict the same object, so you wind up with several 'correct' guesses. It was tricky for me to grasp, at first, that you need to offset into the output to reach the different predictions, which come out in one long (very long) stream of data.
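Roughly, that sifting stage looks like this. A sketch only: you'd filter by confidence first (e.g. .where((d) => d.score > 0.3)), the Recognition holder here is a minimal stand-in (TexMexMax's recognition.dart has a fuller one), and the 0.5 IoU threshold is illustrative:

```dart
import 'dart:ui';

// Minimal stand-in for a detection result.
class Recognition {
  final double score;
  final Rect location;
  Recognition(this.score, this.location);
}

List<Recognition> nonMaxSuppression(List<Recognition> detections,
    {double iouThreshold = 0.5}) {
  // Highest-confidence predictions first.
  final sorted = List.of(detections)
    ..sort((a, b) => b.score.compareTo(a.score));
  final kept = <Recognition>[];
  for (final candidate in sorted) {
    // Drop any box that overlaps a better-scoring box we already kept.
    final overlapsKept =
        kept.any((k) => _iou(k.location, candidate.location) > iouThreshold);
    if (!overlapsKept) kept.add(candidate);
  }
  return kept;
}

double _iou(Rect a, Rect b) {
  final inter = a.intersect(b);
  if (inter.width <= 0 || inter.height <= 0) return 0;
  final interArea = inter.width * inter.height;
  final unionArea = a.width * a.height + b.width * b.height - interArea;
  return interArea / unionArea;
}
```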
My intention had been to outline my process here tonight. However, as I get closer to launching my app (of which the recognitions are one aspect), I've determined that I'm using way too much memory when the TFLite runtime launches: crashing slower phones, and probably not working efficiently enough even on the faster ones that seem able to handle it, and on which I'd done most of my physical testing up till now. So although I get good predictions and it 'works,' it seems more responsible to hold off answering you here for a sec, until I sort out how to fix my situation. Once I sort that out (ideally this week, ha) I'll come back and offer an actual outline, with apologies again for the lag(s).
Meantime-- you really can understand how it all goes from that sample app, written in a different language (with different array and tensor formats, note) than we're using here. If you dive into that and wind up with a question feel free to ask.
Otherwise, talk more when I sort out my issue and am in a better position to do this-- again, hopefully before too long.
Hey, I haven't forgotten about this. My issue is almost sorted; bad thread management, nothing to do with TF per se. Anyway, you have a model that recognizes 62 objects? And you're using a tiny YOLO model, it seems? Or am I reading this wrong--
I am using a tiny model, however I'm not married to that idea and it's far from perfect... I'm still collecting data (my dataset is only ~11,000 images right now, and I think I read that around 2,000 instances per class should yield a solid model... so... lots of work to do lol. we'll see how many images it really takes before I'm happy with it). But I could retrain it as a full sized YOLOv4 model no problem. That works fine on the devices you have been testing?
Looking through and running TexMexMax's classifier-yolov4tiny-tflite.dart (linked in my original question) on my model, I discovered that I was sending my processed image as a uint8 instead of float32. Now when I print inputImage.tfLiteType I get back TfLiteType.float32, so I think I'm sending the correct input.
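In case it helps anyone else, here's roughly what that fix looks like: create the TensorImage as float32 up front instead of taking the uint8 default (a sketch, assuming the `image` package's Image type):

```dart
import 'package:image/image.dart' as img;
import 'package:tflite_flutter/tflite_flutter.dart';
import 'package:tflite_flutter_helper/tflite_flutter_helper.dart';

TensorImage toFloat32Input(img.Image frame) {
  // TensorImage defaults to uint8; ask for float32 explicitly.
  final tensorImage = TensorImage(TfLiteType.float32);
  tensorImage.loadImage(frame);
  return tensorImage; // tensorImage.tfLiteType is now TfLiteType.float32
}
```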
When I print the output shapes I get:

```
flutter: Output shapes:
flutter: [[1, 2535, 4], [1, 2535, 62]]
```
Looking at my output shape in python I get [[1, 2535, 4], [1, 2535, 62]] (the same as above); however, I'm also looking to find my shape-signature arrays of [[1, -1, 4], [1, -1, 62]] in flutter to see if those line up.
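For reference, this is roughly how the shapes can be printed on the Flutter side (a sketch; as far as I can tell the plugin exposes Tensor.shape but not the shape signature):

```dart
import 'package:tflite_flutter/tflite_flutter.dart';

void printOutputShapes(Interpreter interpreter) {
  for (final tensor in interpreter.getOutputTensors()) {
    print('${tensor.name}: ${tensor.shape}');
  }
}
```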
Then we get to the list of results... ha. It is long indeed (as you warned) and full of empty lists. I will go through the YOLOV4Classifier.java that you linked above; I am hoping to get a long list of correct results first.
Thanks and good luck on your app!
Update: I changed the image_conversions.dart inside of the tflite_flutter_helper package to this: https://github.com/TexMexMax/tflite_flutter_helper/blob/master/lib/src/image/image_conversions.dart
I am now getting results, but need to filter them. Edit: the results aren't as good as in python, but they are in the right ballpark for the most part.
Thanks for the luck & I'm glad you're making progress. I'm still leaking memory as it turns out. Perhaps you have some idea for me in this case... Opened a new issue about it.
Anyway I am getting good results (leak aside) and will be happy to return here if I can find my way through.
Meantime good luck to us both--
Keep me posted, because I tried running two models: loading, running, and closing model 1, then loading, running, and closing model 2. With the other tflite plugin I could do that about 6 times before crashing; with this plugin it was more like a dozen. I suspect there's an issue with not releasing all of the memory when closing a model. I'll keep an eye on your post, and let me know if you find a solution.
Edit: By the way, the crashing has nothing to do with running the model; I've tried just loading and unloading the two models without running them, and it still crashes.
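A stripped-down version of what I'm doing, in case it helps anyone reproduce this (asset names are placeholders):

```dart
import 'package:tflite_flutter/tflite_flutter.dart';

// Alternately load and close two models without ever running them;
// on my devices this eventually crashes.
Future<void> loadUnloadLoop() async {
  for (var i = 0; i < 20; i++) {
    final first = await Interpreter.fromAsset('model_1.tflite');
    first.close();
    final second = await Interpreter.fromAsset('model_2.tflite');
    second.close();
  }
}
```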
Indeed. I'm looking into building TFLite myself. Not sure that will have any bearing but I'm not sure what else to do at the moment.
I found a very recent issue in the TensorFlow Github issues about building on XCode 13 causing memory leaks-- but for one, they patched it last week, and for two, I ran on XCode 12.5 this morning just to check, and got the same results.
I also ran the same scenario (using the Counter app, just opening and closing the interpreter) with the stock SSD MobileNet model that goes with the example app for this lib. Same thing happens...
Anyway I'll let you know if / when I figure this out. Please do the same if you do. I'll just note things in that other issue re: this from now on.
Update: Up and running. TexMexMax's code worked great. The mistakes I made (in case anybody else runs into issues): when you alter the tflite_flutter_helper code, make sure you delete the example app inside the package. I didn't do this originally and it was causing my app to crash. Also, in order to get the non-max suppression results you have to alter the recognition.dart file.
Congrats. So you're not leaking memory now though?
I haven't checked that, sorry. One of the reasons my model has so many classes is that I combined two models into one, so I wouldn't have to load and reload... I have run this model dozens of times in a row without it crashing, though.
I'll check, but I don't see where to look for leaks (Xcode 12.4). Edit: Yup, getting memory leaks. If I figure anything out I'll post on your question.
It looks like I'm getting a memory leak loading and closing my model. It is 24 MB, by the way. On an unrelated note, I'm thinking of retraining as a full YOLOv4 model instead of the tiny version. Do you know the best applications for each?
Hey guys, I hope you are well. I am using TexMexMax's code to make a detector app. First, my dataset is only 1400 images and just one class. I trained my model on YOLOv4 (although the accuracy is not good enough :( ). I tried to follow TexMexMax's code and alter image_conversions.dart in the tflite_flutter_helper 0.3.1 plugin, but I get errors from that plugin. Can you help me please? I don't have much time :( I hope you can reply as soon as possible.
I'm using the example code for the object detection app. Is there anything else in the Classifier file code that needs to be changed for different models besides MODEL_FILE_NAME, LABEL_FILE_NAME, and INPUT_SIZE?
Edit: The model is YOLOv4-tiny and works fine when I run it in python. I am using TexMexMax's code below as a reference. In the README he says to change the image_conversions.dart file in the tflite_flutter_helper package (0.3.0 for me), and after doing that my build breaks. Changing back to an unaltered tflite_flutter_helper 0.2.0, I can run the model, but the results are incorrect.
https://github.com/TexMexMax/object_detection_flutter