rainyl / opencv_dart

OpenCV bindings for Dart language and Flutter. Support Asynchronous Now!
https://pub.dev/packages/opencv_dart
Apache License 2.0

Where is the Python API cv2.dnn.blobFromImage? #169

Closed einsitang closed 3 months ago

einsitang commented 4 months ago

Read the README carefully first. Star :star: this project if you want to ask a question (no star, no answer). Already starred 😁

Question

I'm using a YOLOv8 tflite model on Flutter. The Python code works well, so I'm trying to find the OpenCV Python API cv2.dnn.blobFromImage in this package.

einsitang commented 4 months ago

I just figured it out 😓

rainyl commented 4 months ago

Dart doesn't support namespaces or 'export as', so nearly everything is exported under 'cv'.
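
In other words, Python's cv2.dnn.blobFromImage is just cv.blobFromImage here. A minimal sketch (the file name and the 320×320 size are placeholders, not taken from this issue):

import 'package:opencv_dart/opencv_dart.dart' as cv;

void main() {
  // Equivalent of cv2.dnn.blobFromImage: there is no `dnn` sub-namespace,
  // everything lives under the `cv` prefix.
  final img = cv.imread('input.jpg', flags: cv.IMREAD_COLOR);
  final blob = cv.blobFromImage(
    img,
    scalefactor: 1 / 255,
    size: (320, 320),
    swapRB: true,
  );
  // Assuming the blob is float32 (the OpenCV default), view its values like this:
  print(blob.data.buffer.asFloat32List().length); // 1 * 3 * 320 * 320
}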

einsitang commented 4 months ago

I'm sorry, but I still have problems with it

[screenshot]

I'm just porting the logic from my Python script.

The mat is read from a 250px × 250px, 3-channel image, then I call blobFromImage.

It returns nmat with rows = -1, cols = -1, and channels = 1;

shouldn't it be 320px × 320px with 3 channels (rows = 320, cols = 320, channels = 3)?

rainyl commented 4 months ago

Just go ahead, but I did find that Mat.size is not correct; it will be fixed in the next release.

einsitang commented 4 months ago

OK, got it. So the returned blob should be a tensor of shape (1, 3, 320, 320), but I found 1228800 elements. Is that incorrect too? [screenshot]

rainyl commented 4 months ago

@einsitang Mat.data is a Uint8List, which wraps https://docs.opencv.org/4.x/d3/d63/classcv_1_1Mat.html#a4d33bed1c850265370d2af0ff02e1564 , but the actual data type of your mat is float32, so the length of data does not exactly match the number of pixels. If you convert the data to a float32 list (via data.buffer.asFloat32List()), I believe it will be 1*3*320*320.
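
A small sketch of that conversion, reusing nmat from the earlier message (the 4-bytes-per-float arithmetic explains the 1228800 you saw):

final bytes = nmat.data;                     // Uint8List: raw bytes, not elements
final floats = bytes.buffer.asFloat32List(); // reinterpret the buffer as float32 values
// For a (1, 3, 320, 320) float32 blob:
//   floats.length == 1 * 3 * 320 * 320 == 307200
//   bytes.length  == 307200 * 4        == 1228800
print('${bytes.length} bytes -> ${floats.length} float32 values');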

einsitang commented 4 months ago

Thanks.

I'll try it later.

abdelaziz-mahdy commented 4 months ago

An unrelated question: how do you run your YOLO model, using tflite or pytorch? And what do you think about the performance?

einsitang commented 4 months ago

An unrelated question: how do you run your YOLO model, using tflite or pytorch? And what do you think about the performance?

The YOLO model can be exported to a tflite file for TensorFlow, and that export is an official tool.

My project uses tflite_flutter and opencv_dart to rewrite the Python code in Flutter.

I haven't tested performance yet, but the detection performance of yolo-flutter-app (the YOLO Flutter example) looks OK.

Between tflite_flutter with opencv_dart and yolo-flutter-app, both use NativeLibrary to do the work; the main difference is the runtime in which the preprocess and postprocess functions run, so it will probably be slower than yolo-flutter-app.

However, something like 1~2 seconds to finish a detection (including preprocess and postprocess) is what I expect.

When the logic is done and the basics work fine, I'll do more tests and tell you guys the results.

einsitang commented 3 months ago

Sorry, I have another problem that I need your help with.

I've found that Python and Dart execute the same OpenCV function and get different results, and I'm not sure if it's my usage or if the difference actually occurs across platforms.

Python script that dumps the JSON files:

import json, codecs
import urllib.request

import cv2
import numpy as np
url = 'https://i.mij.rip/2024/07/20/c43a166c11414cad2c019a3d814512dd.jpeg'
res = urllib.request.urlopen(url)
img = np.asarray(bytearray(res.read()),dtype="uint8")
img = cv2.imdecode(img,cv2.IMREAD_COLOR)

[height, width, _] = img.shape
length = max((height, width))

# #1 here is the origin image read from opencv (py_test_image.json)
print("json dump origin img")
json.dump(original_image.tolist(), codecs.open("py_test_image.json", 'w', encoding='utf-8'), 
              separators=(',', ':'))

image = cv2.copyMakeBorder(original_image,0,length-height,0,length-width,cv2.BORDER_CONSTANT, value=(114, 114, 114))
blob = cv2.dnn.blobFromImage(image, scalefactor=1 / 255, size=(640, 640), swapRB=True)

# #2 here is the data with blobFromImage (py_test_blob.json)
print("json dummp (blob)")
json.dump(blob.tolist(), codecs.open("py_test_blob.json", 'w', encoding='utf-8'), 
              separators=(',', ':'))

blob = blob[...,::-1].transpose((0,2,3,1)) # BCHW to BHWC

# #3 here is data with transpose((0,2,3,1)) BCHW to BHWC (py_test_blob_0231.json)
print("json dummp (blob_0231)")
json.dump(blob.tolist(), codecs.open("py_test_blob_0231.json", 'w', encoding='utf-8'), 
          separators=(',', ':'))

Flutter code that dumps the JSON files:

final String dir = (await getExternalStorageDirectory())!.path;

cv.Mat originImgMat = cv.imread(await _copy(imgPath)); // BGR
var oWidth = originImgMat.width;
var oHeight = originImgMat.height;
var length = oWidth > oHeight ? oWidth : oHeight;

// #1 here is the origin image mat data json (dart_origin_img_mat.json)
final matData = jsonEncode(ListShape(originImgMat.data).reshape([oHeight,oWidth,3]));
final writedFile = File(dir + "/dart_origin_img_mat.json");
await writedFile.writeAsString(matData);

cv.Mat scaleImgMat = cv.copyMakeBorder(originImgMat, 0, length-oHeight, 0, length-oWidth, cv.BORDER_CONSTANT,
        value: cv.Scalar(114, 114, 114, 0));

var blobMat = cv.blobFromImage(scaleImgMat,
        scalefactor: 1 / 255,
        size: (640, 640),
        swapRB: true,
        ddepth: cv.MatType.CV_32F);

// #2 here is the data with blobFromImage (dart_blob_mat.json)
final blobMatData = jsonEncode(ListShape(blobMat.data.buffer.asFloat32List()).reshape(inputTensor.shape));
final blobMatDataFile = File(dir + "/dart_blob_mat.json");
await blobMatDataFile.writeAsString(blobMatData);
log.d("dart_blob_mat.json write done");

the "py_test_image.json"(python.#1) and "dart_origin_img_mat.json"(dart.#1) files is same (with same md5)

image

but the "py_test_blob.json"(python.#2) and "dart_blob_mat.json"(dart.#2) files data look like different

py_test_blob.json: py_test_blob.json

dart_blob_mat.json: dart_blob_mat.json

Apart from the shape, the data order seems to be different

einsitang commented 3 months ago

Here are the files:

py_test_blob.json dart_blob_mat.json

and here is the image file: 4

and here is the image on network: https://i.mij.rip/2024/07/20/c43a166c11414cad2c019a3d814512dd.jpeg

einsitang commented 3 months ago

Latest discovery:

Is the cv.blobFromImage parameter swapRB not working?

I changed swapRB to false, but the dumped JSON file shows no difference. [screenshot]

[screenshot]

pubspec.yaml lib version

opencv_dart: ^1.1.0+1
rainyl commented 3 months ago

@einsitang hello.

I tested it with your code and everything is OK; the saved JSON from your Python code is wrong.

import 'package:opencv_dart/opencv_dart.dart' as cv;

void main(List<String> args) {
  final originImgMat = cv.imread("c43a166c11414cad2c019a3d814512dd.jpeg", flags: cv.IMREAD_COLOR);
  var oWidth = originImgMat.width;
  var oHeight = originImgMat.height;
  var length = oWidth > oHeight ? oWidth : oHeight;

  cv.Mat scaleImgMat = cv.copyMakeBorder(
    originImgMat,
    0,
    length - oHeight,
    0,
    length - oWidth,
    cv.BORDER_CONSTANT,
    value: cv.Scalar(114, 114, 114, 0),
  );

  final blob = cv.blobFromImage(
    scaleImgMat,
    scalefactor: 1 / 255,
    size: (640, 640),
    swapRB: true,
  );
  print(blob.data.buffer.asFloat32List());
}
dart run test.dart > dart_blob.json
Detailed screenshots:

![image](https://github.com/user-attachments/assets/4e932513-4f22-4022-854e-b4f2ec4b33c1) ![image](https://github.com/user-attachments/assets/5501a6fa-f773-4949-8e84-a914057908d1) ![image](https://github.com/user-attachments/assets/e0dc700d-0b26-45bc-b205-7ec071e15731)

I noticed

img = cv2.imdecode(img,cv2.IMREAD_COLOR)  # img
...
cv2.copyMakeBorder(original_image,0,length-height,0,length-width,cv2.BORDER_CONSTANT, value=(114, 114, 114)) # original_image

So the images you are processing are different; please check whether that's correct.

rainyl commented 3 months ago

And ignore why I used print and stream redirection (>) to create dart_blob.json; I am lazy 😆

einsitang commented 3 months ago

No... I checked before: the "py_test_image.json" (python #1) and "dart_origin_img_mat.json" (dart #1) files have the same md5 value, which means originImgMat (dart) and original_image (python) are 100% the same data.

img = cv2.imdecode(img,cv2.IMREAD_COLOR)  # img
...
cv2.copyMakeBorder(original_image,0,length-height,0,length-width,cv2.BORDER_CONSTANT, value=(114, 114, 114)) # original_image

img and original_image are the same instance; I just made a mistake when copying the code, but only in the code shown here, not in the working code.

Here is the code in a Kaggle notebook, and it works fine; this is why I'm following the Python script to build the Flutter logic.

rainyl commented 3 months ago

Alright, then the only suggestion I can give you is to check the value of blob in Python.

[screenshot]

As you can see, your image starts with 255 and the left/top padding amounts are zero; you then scaled it with scalefactor=1 / 255, so blob should start with 1.0. However,

[screenshot]

this shows it starts with values < 1.0, which is definitely incorrect.
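
A quick way to confirm this on the Dart side is to compare the first pixel with the first blob value (a sketch reusing originImgMat and blob from the test snippet above; that the top-left pixel is white, 255, is an assumption based on the screenshots):

final bgr = originImgMat.data;                 // bytes of pixel (0, 0): [B, G, R, ...]
final blobF32 = blob.data.buffer.asFloat32List();
// With top/left padding of 0 and scalefactor = 1/255, blob[0] should be roughly
// the R value of pixel (0, 0) divided by 255 (R comes first because swapRB: true),
// i.e. about 255 / 255 = 1.0 for this image.
print('pixel(0,0) B=${bgr[0]} G=${bgr[1]} R=${bgr[2]}');
print('blob[0] = ${blobF32[0]}');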

einsitang commented 3 months ago

I just synchronized the Python and Flutter logic, and... I made a mistake. The problem is actually not about copyMakeBorder or blobFromImage; the previously reported problem was due to an error in my debugging process.

The problem I'm struggling with really doesn't happen here.

We can skip the topic of copyMakeBorder and blobFromImage...

einsitang commented 3 months ago

The problem may be in the next step:

[screenshot]

I don't know how to do this in my Flutter code; that could be the real problem: TensorFlow is not getting the correct input.

einsitang commented 3 months ago

this shows it starts with values < 1.0, which is definitely incorrect.

Yes, I think my Dart code had changed and dumped JSON logs many times before, but I grabbed a cached file and didn't notice, so it's not actually the same logic as the Python version.

var dx = (length - oWidth) ~/ 2;
var dy = (length - oHeight) ~/ 2;
cv.Mat scaleImgMat = cv.copyMakeBorder(originImgMat, dy,dy,dx,dx, cv.BORDER_CONSTANT,
    value: cv.Scalar(114, 114, 114, 0));
// dump json
cv.Mat scaleImgMat = cv.copyMakeBorder(originImgMat, 0, length-oHeight, 0, length-oWidth, cv.BORDER_CONSTANT,
    value: cv.Scalar(114, 114, 114, 0));
// dump json
rainyl commented 3 months ago

I don't know how to do this in my Flutter code; that could be the real problem: TensorFlow is not getting the correct input.

Yes, it could be the problem.

It's frustrating that Dart has no usable multi-dimensional array library like numpy, and that is out of the scope of this package. I am not sure whether Mat.reshape will do the right thing; opencv_dart v1.2.0 added a wrapper for https://docs.opencv.org/4.x/d3/d63/classcv_1_1Mat.html#ab2e41a510891e548f744832cf9b8ab89 , so you can upgrade to v1.2.0 and pass a list of dimensions to Mat.reshape to see whether it works. Again, I am not sure.

And note that there are many breaking changes in v1.2.0, so you may need to change your code.

Edit: @einsitang take a look at https://answers.opencv.org/question/226929/how-could-i-change-memory-layout-from-hwc-to-chw/ . It's kind of tricky but may work.

einsitang commented 3 months ago

Thanks very much.

I found another lib, tensor_dart, but it can only handle 2D matrices; that is why I skipped this step and went back to check the earlier steps.

einsitang commented 3 months ago

I don't know how to do this in my Flutter code; that could be the real problem: TensorFlow is not getting the correct input.

Yes, it could be the problem.

It's frustrating that Dart has no usable multi-dimensional array library like numpy, and that is out of the scope of this package. I am not sure whether Mat.reshape will do the right thing; opencv_dart v1.2.0 added a wrapper for https://docs.opencv.org/4.x/d3/d63/classcv_1_1Mat.html#ab2e41a510891e548f744832cf9b8ab89 , so you can upgrade to v1.2.0 and pass a list of dimensions to Mat.reshape to see whether it works. Again, I am not sure.

And note that there are many breaking changes in v1.2.0, so you may need to change your code.

reshape and transpose have different effects.

But this link should be a good help: https://answers.opencv.org/question/226929/how-could-i-change-memory-layout-from-hwc-to-chw/ ; I'll try it later.

Anyway, thanks again for your help.

abdelaziz-mahdy commented 3 months ago

Will that help? @einsitang

Mat transposeBlob(Mat blob) {
  // Reshape the blob to split channels
  List<Mat> channels = [];
  split(blob, channels);
  // Rearrange the channels to match BHWC format
  Mat transposedBlob = Mat();
  merge(channels, transposedBlob);
  return transposedBlob;
}

Edit: since transpose already exists, I don't think this is correct.

rainyl commented 3 months ago

I don't know how to do this in my Flutter code; that could be the real problem: TensorFlow is not getting the correct input.

Yes, it could be the problem. It's frustrating that Dart has no usable multi-dimensional array library like numpy, and that is out of the scope of this package. I am not sure whether Mat.reshape will do the right thing; opencv_dart v1.2.0 added a wrapper for https://docs.opencv.org/4.x/d3/d63/classcv_1_1Mat.html#ab2e41a510891e548f744832cf9b8ab89 , so you can upgrade to v1.2.0 and pass a list of dimensions to Mat.reshape to see whether it works. Again, I am not sure. And note that there are many breaking changes in v1.2.0, so you may need to change your code.

reshape and transpose have different effects.

Anyway, thanks again for your help.

Yes, but

You can reshape (H, W, C) to (HW, C) and then perform a transpose. Transpose would give (C, HW) which can then be reshaped to (C, H, W).

https://answers.opencv.org/question/226929/how-could-i-change-memory-layout-from-hwc-to-chw/
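
Concretely, for a 640×640, 3-channel image the quoted trick looks like this sketch (reshapeTo and transpose are the method names from the v1.2.0 code posted later in this thread, so treat the exact signatures as an assumption; only the transpose in the middle reorders data, the reshapes just change the header):

// HWC -> CHW via the linked trick; scaleImgMat is assumed to be 640 x 640, CV_8UC3.
final hwByC = scaleImgMat.reshapeTo(1, [640 * 640, 3]); // (H*W, C), header change only
final cByHw = hwByC.transpose();                        // (C, H*W), data reordered here
final chw = cByHw.reshapeTo(1, [3, 640, 640]);          // (C, H, W)
// The same idea run in reverse gives CHW -> HWC, which is the direction the
// BHWC tflite input needs.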

einsitang commented 3 months ago

Yes, I'll try it later; I may have good news to tell you guys.

einsitang commented 3 months ago

Rearrange the channels to match BHWC format

I don't think so, because I'm not good at the math for linear algebra or tensors.

einsitang commented 3 months ago

@rainyl @abdelaziz-mahdy Thanks to your help, the preprocess logic has been debugged and passes 🍾

abdelaziz-mahdy commented 3 months ago

@rainyl @abdelaziz-mahdy Thanks to your help, the preprocess logic has been debugged and passes 🍾

For context, how did you handle the reshape and transpose part? Just in case someone needs to do the same thing.

einsitang commented 3 months ago

Here is the code for BCHW to BHWC:

// upgrade to opencv_dart 1.2.0 to use cv.Mat.reshapeTo
cv.Mat _chw2hwc(cv.Mat mat) {
  final size = mat.size;
  // final b = size[0]; // B, don't need it
  final c = size[1]; // C
  final h = size[2]; // H
  final w = size[3]; // W
  // (C, H*W)
  final cHw = mat.reshapeTo(0, [c, h * w]);
  // (H*W, C); the final reshape happens outside this function
  return cHw.transpose();
}

// ListShape comes from the tensor_dart lib
var input = ListShape(_chw2hwc(blobMat).data.buffer.asFloat32List())
    .reshape(inputTensor.shape);
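
For anyone reusing this, a quick length check makes the layout explicit (a sketch; blobMat and inputTensor come from the snippets above, and the 640×640 input size is an assumption based on the earlier messages):

// blobMat from cv.blobFromImage is (1, 3, 640, 640) float32 in NCHW order.
final floats = _chw2hwc(blobMat).data.buffer.asFloat32List();
assert(floats.length == 1 * 640 * 640 * 3); // 1,228,800 values either way
// ListShape(floats).reshape(inputTensor.shape) then yields BHWC, i.e.
// [1, 640, 640, 3], which is the layout the tflite input tensor expects.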
einsitang commented 3 months ago

An unrelated question: how do you run your YOLO model, using tflite or pytorch? And what do you think about the performance?

My iPhone's iOS version is too high for Xcode debugging, so I found a MI PAD 4 for testing. In general, preprocessing takes about 350~400 ms, inference about 500 ms, and postprocessing is the longest, generally over 2 s, but this device's hardware performance is average. So I'm happy with that performance.

The latest test run on the MI PAD 4 (no quantization): preprocessTimes: 366.783 ms, inferenceTimes: 475.897 ms, postprocessTimes: 2113.832 ms

preprocessTimes:418.545 ms, postprocessTimes: 565.299 ms, inferenceTimes: 2020.973 ms

MI PAD 4: CPU Octa-core (4x2.2 GHz Kryo 260 Gold & 4x1.8 GHz Kryo 260 Silver), RAM 4 GB, kernel Android 8.1.0, OS MIUI 10.3.2.0

einsitang commented 3 months ago

I think I can close this issue. Thank you very much.

abdelaziz-mahdy commented 3 months ago

An unrelated question: how do you run your YOLO model, using tflite or pytorch? And what do you think about the performance?

My iPhone's iOS version is too high for Xcode debugging, so I found a MI PAD 4 for testing. In general, preprocessing takes about 350~400 ms, inference about 500 ms, and postprocessing is the longest, generally over 2 s, but this device's hardware performance is average. So I'm happy with that performance.

The latest test run on the MI PAD 4 (no quantization): preprocessTimes: 366.783 ms, inferenceTimes: 475.897 ms, postprocessTimes: 2113.832 ms

MI PAD 4: CPU Octa-core (4x2.2 GHz Kryo 260 Gold & 4x1.8 GHz Kryo 260 Silver), RAM 4 GB, kernel Android 8.1.0, OS MIUI 10.3.2.0

Interested to know why postprocessing takes so much time. Did you do it in Dart? Also try release or profile mode, since Dart code is much faster there due to compilation.

einsitang commented 3 months ago

Except for the inference part, all the code is in Dart.

I reviewed the code again; the print log used the wrong variables... (another stupid mistake 😂)

The real result is:

preprocessTimes:418.545 ms, postprocessTimes: 565.299 ms, inferenceTimes: 2020.973 ms

inferenceTimes takes the most time, which makes sense.

abdelaziz-mahdy commented 3 months ago

Except for the inference part, all the code is in Dart.

I reviewed the code again; the print log used the wrong variables... (another stupid mistake 😂)

The real result is:

preprocessTimes:418.545 ms, postprocessTimes: 565.299 ms, inferenceTimes: 2020.973 ms

inferenceTimes takes the most time, which makes sense.

What do you run inference with? OpenCV? DNN?

If it's open source, I may be able to help you optimize it.

Also, are these results in profile or release mode?

einsitang commented 3 months ago

Except for the inference part, all the code is in Dart. I reviewed the code again; the print log used the wrong variables... (another stupid mistake 😂) The real result is: preprocessTimes:418.545 ms, postprocessTimes: 565.299 ms, inferenceTimes: 2020.973 ms. inferenceTimes takes the most time, which makes sense.

What do you run inference with? OpenCV? DNN?

Inference is done with TensorFlow Lite.

If it's open source, I may be able to help you optimize it.

It is an open-source project, just for fun. If you want to help me optimize it, that is very good news for me, but this feature isn't finished yet, so I haven't pushed it to the remote. You can star my repository; when the code is updated, you can review it and help. But the repository is from a long time ago, and I'm Chinese and my English isn't good enough; the comments are Chinese translated into English, so they may not be easy to read.

https://github.com/einsitang/sudoku-flutter

Also, are these results in profile or release mode?

These results are in debug mode; I haven't tried release or profile mode yet.

einsitang commented 3 months ago

https://www.kaggle.com/code/jianbintangelement/sudoku-digits-detect-predict-with-yolov8

This is what I'm trying to do in my project:

use the camera to scan a picture with a Sudoku puzzle in real time, recognize it, and solve it.

einsitang commented 3 months ago

I have just finished the basic Sudoku detection pipeline. Thank you very much for your help. It's just that the device I'm testing on isn't very fast: a full model prediction takes about 3 s, and I have two models, so it takes at least 6 s to complete a job. I also tried quantization to speed up the model; the result was that speed doubled, but accuracy dropped so much that it was basically unusable, so I didn't adopt quantization in the end. The model most likely doesn't quantize well because it is poorly trained.

einsitang commented 3 months ago

The latest code is already committed and pushed. If you are interested, I would appreciate your input.

rainyl commented 3 months ago

I took a look at the implementation of tflite_flutter, and I think it is not efficient enough: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L160-L176

If you pass a Dart List to Interpreter.run, it will be copied at least 4 times:

  1. 2x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/util/byte_conversion_utils.dart#L36-L52
  2. 1x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L166
  3. 1x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L168

I didn't test the cost but I think it's unacceptable.

I suggest passing Mat.data to interpreter.run() directly, which avoids the first 2 copies. Going a step further, I suggest opening an issue in tflite_flutter asking them to provide another API, such as runWithTensor(Tensor tensor), that uses a tensor created by the user directly. Then you could do:

final Tensor t = interpreter.getInputTensor(0);
t.data = Mat.data; // only 1 copy, actually this will use setRange(), which will be faster than loops.
final result = interpreter.runWithTensor(t);
einsitang commented 3 months ago

I took a look at the implementation of tflite_flutter, and I think it is not efficient enough: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L160-L176

If you pass a Dart List to Interpreter.run, it will be copied at least 4 times:

  1. 2x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/util/byte_conversion_utils.dart#L36-L52
  2. 1x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L166
  3. 1x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L168

I didn't test the cost but I think it's unacceptable.

I suggest passing Mat.data to interpreter.run() directly, which avoids the first 2 copies. Going a step further, I suggest opening an issue in tflite_flutter asking them to provide another API, such as runWithTensor(Tensor tensor), that uses a tensor created by the user directly. Then you could do:

final Tensor t = interpreter.getInputTensor(0);
t.data = Mat.data; // only 1 copy, actually this will use setRange(), which will be faster than loops.
final result = interpreter.runWithTensor(t);

@rainyl your code review is very careful. I only understand how these libraries are used; I don't know the specific implementation logic. I followed the code you provided and found that it is exactly as you said. What I'm not sure about, though, is that doing this bypasses tflite's type conversion, and I don't know enough to tell whether cv.Mat and tflite.Tensor are consistent at the byte level, so it could be a very unsafe operation.

However, I can test your approach. The code would be something like:

final Tensor inputT = interpreter.getInputTensor(0);
inputT.data = Mat.data;  // not sure it works
interpreter.invoke();
final Tensor outputT = interpreter.getOutputTensor(0);
......

I'll try it later and tell you the result.

rainyl commented 3 months ago

What I'm not sure about, though, is that doing this bypasses tflite's type conversion, and I don't know enough to tell whether cv.Mat and tflite.Tensor are consistent at the byte level, so it could be a very unsafe operation.

Sure, this way you need to be careful to ensure the type and size of the Mat match the Tensor, but Mat.data is just raw pixel values. I am not sure whether the underlying data of Tensor is also raw values; if so, this should work. https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L69-L79
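
One cheap guard is to compare byte counts before assigning (a defensive sketch; inputT comes from the snippet above, inputMat stands for whichever Mat you end up feeding in, and Tensor.data is the byte-level getter/setter linked here):

final matBytes = inputMat.data;  // Uint8List view over the float32 data
final tensorBytes = inputT.data; // current raw bytes of the input tensor
// If the model expects float32 of the same shape, both buffers must have exactly
// the same length; otherwise copying the raw bytes across is unsafe.
assert(matBytes.length == tensorBytes.length,
    'size mismatch: mat=${matBytes.length}, tensor=${tensorBytes.length}');
inputT.data = matBytes;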

einsitang commented 3 months ago

Great job. I originally thought this optimization would only give a small performance improvement, but after the modification the tests showed it was faster than I expected.

These are recent performance logs; very high quality:

preprocessTimes:26.638 ms, postprocessTimes: 39.703 ms, inferenceTimes: 814.608 ms

preprocessTimes:20.872 ms, postprocessTimes: 77.102 ms, inferenceTimes: 816.108 ms

rainyl commented 3 months ago

Sounds good.

einsitang commented 3 months ago

I took a look at your profile: some of the repos you forked are entirely in Chinese, and your surname is Liu, so you must be Chinese... I think I can thank you in Mandarin.

rainyl commented 3 months ago

Hahahaha, yes, no problem 🤣

einsitang commented 3 months ago

Hahahaha, yes, no problem 🤣

May I ask, are you still a student, studying at Wuhan University?