Closed einsitang closed 3 months ago
I just found it out 😓
dart doesn't support namespaces or 'export as', so nearly everything is under 'cv'.
I'm sorry, but I still have problems with it.
I'm just porting the code logic from a Python script.
The mat is read from a 250px * 250px image with 3 channels; after calling the blobFromImage function, the returned mat has rows=-1, cols=-1, and channels=1.
Shouldn't it be 320px * 320px with 3 channels (rows=320, cols=320, channels=3)?
Just go ahead. But I just found that Mat.size is not correct; it will be fixed in the next release.
OK, got it. So the returned blob is a tensor of shape (1,3,320,320), but I found 1228800 elements; is that incorrect too?
@einsitang Mat.data is a Uint8List, which is a wrapper for https://docs.opencv.org/4.x/d3/d63/classcv_1_1Mat.html#a4d33bed1c850265370d2af0ff02e1564 , but the actual data type of your mat is float32, so the length of data does not exactly match the number of elements. If you convert data to a float32 list (via data.buffer.asFloat32List()), I believe it will be 1*3*320*320.
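In numpy terms (a sketch for illustration only, not the Dart API): a float32 blob of shape (1, 3, 320, 320) has 307200 elements but 4 bytes per element, so viewed as raw bytes it is exactly the 1228800 reported above:

```python
import numpy as np

# a float32 blob with the same shape as the one in this thread
blob = np.zeros((1, 3, 320, 320), dtype=np.float32)
print(blob.size)    # 307200 elements (1*3*320*320)
print(blob.nbytes)  # 1228800 bytes, the length a Uint8List view would report

# reinterpreting the raw bytes as float32 recovers the element count
f32 = np.frombuffer(blob.tobytes(), dtype=np.float32)
print(f32.size)     # 307200
```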
Thanks, I'll try it later.
An unrelated question: how do you run your yolo model, using tflite or pytorch, and what do you think about the performance?
The YOLO model can be exported to a tflite file with the official tool.
My project uses tflite_flutter and opencv_dart to rewrite the Python code for Flutter.
I haven't tested performance yet, but yolo-flutter-app (the official YOLO Flutter example) looks OK for detection.
Between tflite_flutter with opencv_dart and yolo-flutter-app, both use a native library under the hood; the main difference is the runtime where the preprocess and postprocess functions run, so performance would be lower than yolo-flutter-app.
However, around 1~2 seconds to finish the detection work (including preprocess and postprocess) is what I expect.
Once the logic is done and the basics work fine, I'll do more tests and tell you guys the results.
Sorry, I have another problem that I need your help with.
I've found that python and dart execute the same opencv function and get different results, and I'm not sure if it's my usage or if the difference actually occurs across platforms.
Python script that dumps the json files:
import json, codecs
import urllib.request
import numpy as np
import cv2
url = 'https://i.mij.rip/2024/07/20/c43a166c11414cad2c019a3d814512dd.jpeg'
res = urllib.request.urlopen(url)
img = np.asarray(bytearray(res.read()),dtype="uint8")
img = cv2.imdecode(img,cv2.IMREAD_COLOR)
[height, width, _] = img.shape
length = max((height, width))
# #1 here is the origin image read from opencv (py_test_image.json)
print("json dump origin img")
json.dump(original_image.tolist(), codecs.open("py_test_image.json", 'w', encoding='utf-8'),
separators=(',', ':'))
image = cv2.copyMakeBorder(original_image,0,length-height,0,length-width,cv2.BORDER_CONSTANT, value=(114, 114, 114))
blob = cv2.dnn.blobFromImage(image, scalefactor=1 / 255, size=(640, 640), swapRB=True)
# #2 here is the data with blobFromImage (py_test_blob.json)
print("json dump (blob)")
json.dump(blob.tolist(), codecs.open("py_test_blob.json", 'w', encoding='utf-8'),
separators=(',', ':'))
blob = blob[...,::-1].transpose((0,2,3,1)) # BCHW to BHWC
# #3 here is data with transpose((0,2,3,1)) BCHW to BHWC (py_test_blob_0231.json)
print("json dump (blob_0231)")
json.dump(blob.tolist(), codecs.open("py_test_blob_0231.json", 'w', encoding='utf-8'),
separators=(',', ':'))
Flutter code that dumps the json files:
final String dir = (await getExternalStorageDirectory())!.path;
cv.Mat originImgMat = cv.imread(await _copy(imgPath)); // BGR
var oWidth = originImgMat.width;
var oHeight = originImgMat.height;
var length = oWidth > oHeight ? oWidth : oHeight;
// #1 here is the origin image mat data json (dart_origin_img_mat.json)
final matData = jsonEncode(ListShape(originImgMat.data).reshape([oHeight,oWidth,3]));
final writedFile = File(dir + "/dart_origin_img_mat.json");
await writedFile.writeAsString(matData);
cv.Mat scaleImgMat = cv.copyMakeBorder(originImgMat, 0, length-oHeight, 0, length-oWidth, cv.BORDER_CONSTANT,
value: cv.Scalar(114, 114, 114, 0));
var blobMat = cv.blobFromImage(scaleImgMat,
scalefactor: 1 / 255,
size: (640, 640),
swapRB: true,
ddepth: cv.MatType.CV_32F);
// #2 here is the data with blobFromImage (dart_blob_mat.json)
final blobMatData = jsonEncode(ListShape(blobMat.data.buffer.asFloat32List()).reshape(inputTensor.shape));
final blobMatDataFile = File(dir + "/dart_blob_mat.json");
await blobMatDataFile.writeAsString(blobMatData);
log.d("dart_blob_mat.json write done");
The "py_test_image.json" (python #1) and "dart_origin_img_mat.json" (dart #1) files are the same (same md5),
but the "py_test_blob.json" (python #2) and "dart_blob_mat.json" (dart #2) files look different:
py_test_blob.json:
dart_blob_mat.json:
Apart from the shape, the data order seems to be different.
Here are the files: py_test_blob.json dart_blob_mat.json
And here is the image on the network: https://i.mij.rip/2024/07/20/c43a166c11414cad2c019a3d814512dd.jpeg
Latest discovery:
the cv.blobFromImage param swapRB seems not to be working?
I changed swapRB = false, but the dumped json file shows no difference.
pubspec.yaml lib version
opencv_dart: ^1.1.0+1
@einsitang hello.
I tested it with your code and everything is OK; the saved json from your python code is wrong.
import 'package:opencv_dart/opencv_dart.dart' as cv;
void main(List<String> args) {
final originImgMat = cv.imread("c43a166c11414cad2c019a3d814512dd.jpeg", flags: cv.IMREAD_COLOR);
var oWidth = originImgMat.width;
var oHeight = originImgMat.height;
var length = oWidth > oHeight ? oWidth : oHeight;
cv.Mat scaleImgMat = cv.copyMakeBorder(
originImgMat,
0,
length - oHeight,
0,
length - oWidth,
cv.BORDER_CONSTANT,
value: cv.Scalar(114, 114, 114, 0),
);
final blob = cv.blobFromImage(
scaleImgMat,
scalefactor: 1 / 255,
size: (640, 640),
swapRB: true,
);
print(blob.data.buffer.asFloat32List());
}
dart run test.dart > dart_blob.json
I noticed:
img = cv2.imdecode(img, cv2.IMREAD_COLOR)  # img
...
cv2.copyMakeBorder(original_image, 0, length-height, 0, length-width, cv2.BORDER_CONSTANT, value=(114, 114, 114))  # original_image
so the images you are processing are different; please check whether that's correct.
And ignore why I use print and stream redirection > to create dart_blob.json; I'm lazy 😆
No....
I checked before:
the "py_test_image.json" (python #1) and "dart_origin_img_mat.json" (dart #1) files have the same md5 value,
which means originImgMat (dart) and original_image (python) are 100% the same data.
img = cv2.imdecode(img, cv2.IMREAD_COLOR)  # img
...
cv2.copyMakeBorder(original_image, 0, length-height, 0, length-width, cv2.BORDER_CONSTANT, value=(114, 114, 114))  # original_image
img and original_image are the same instance; I just made a mistake copying the code, but only in the code shown here, not the working code.
Here is the code in a kaggle notebook, and it works fine; this is why I'm following the python script to build the flutter code logic.
Alright, then the only suggestion I can give you is to check the value of blob in python.
As you can see, your image starts with 255, and the top and left paddings are zero; then you scaled it with scalefactor=1/255, so blob should start with 1.0.
However, this shows it starts with values < 1.0, which is definitely incorrect.
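For what it's worth, the arithmetic behind that claim can be checked with a small numpy sketch (numpy stands in for the OpenCV call here; this is not the actual blobFromImage implementation):

```python
import numpy as np

# an image whose first pixel value is 255, like the test image above
img = np.full((2, 2, 3), 255, dtype=np.uint8)

# blobFromImage's scalefactor multiplies every pixel: 255 * (1/255)
blob = img.astype(np.float32) * (1 / 255)
print(blob[0, 0, 0])  # ~1.0, up to float rounding
```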
I just synchronized the python and flutter logic code, and ..... I made a mistake. The problem is precisely not about copyMakeBorder and blobFromImage; the previously reported problem was due to an error in my debugging process.
The problems I'm struggling with really don't happen here.
We can skip the topic of copyMakeBorder and blobFromImage...
The problem may be in the next step:
I don't know how to do this in my flutter code; that could be the real problem of tensorflow getting incorrect input.
this shows it starts with values < 1.0, which is definitely incorrect.
Yes. My dart code had changed and dumped json logs many times before; I grabbed a cached file and didn't notice it, so it wasn't actually the same logic when compared to python.
var dx = (length - oWidth) ~/ 2;
var dy = (length - oHeight) ~/ 2;
cv.Mat scaleImgMat = cv.copyMakeBorder(originImgMat, dy,dy,dx,dx, cv.BORDER_CONSTANT,
value: cv.Scalar(114, 114, 114, 0));
// dump json
cv.Mat scaleImgMat = cv.copyMakeBorder(originImgMat, 0, length - oHeight, 0, length - oWidth, cv.BORDER_CONSTANT,
value: cv.Scalar(114, 114, 114, 0));
// dump json
I don't know how to do this in my flutter code; that could be the real problem of tensorflow getting incorrect input.
Yes, it could be the problem.
It's frustrating that dart has no usable multi-dimensional array library like numpy, and this is out of the scope of this package. I am not sure whether Mat.reshape will do the right thing; opencv_dart v1.2.0 added a wrapper for https://docs.opencv.org/4.x/d3/d63/classcv_1_1Mat.html#ab2e41a510891e548f744832cf9b8ab89 , so you can upgrade to v1.2.0 and pass a list of dimensions to Mat.reshape to see whether it works. Again, I am not sure.
And note that there are many breaking changes in v1.2.0; you may need to change your code.
Edit: @einsitang take a look at https://answers.opencv.org/question/226929/how-could-i-change-memory-layout-from-hwc-to-chw/ , kind of tricky but it may work.
Thanks very much.
I found another lib, tensor_dart, but it can only handle 2D matrices; that is why I skipped this step in my earlier checks.
reshape and transpose have different effects,
but this link may be a good help: https://answers.opencv.org/question/226929/how-could-i-change-memory-layout-from-hwc-to-chw/ ; I'll try it later.
Anyway, thanks again for your help.
Will that help? @einsitang
Mat transposeBlob(Mat blob) {
// Reshape the blob to split channels
List<Mat> channels = [];
split(blob, channels);
// Rearrange the channels to match BHWC format
Mat transposedBlob = Mat();
merge(channels, transposedBlob);
return transposedBlob;
}
Edit: since transpose already exists, I don't think this is correct.
reshape and transpose have different effects
Yes, but:
You can reshape (H, W, C) to (HW, C) and then perform a transpose. The transpose gives (C, HW), which can then be reshaped to (C, H, W).
https://answers.opencv.org/question/226929/how-could-i-change-memory-layout-from-hwc-to-chw/
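The trick from that answer can be sanity-checked in numpy (a verification sketch only; the actual code in this thread is Dart):

```python
import numpy as np

h, w, c = 4, 5, 3
hwc = np.arange(h * w * c, dtype=np.float32).reshape(h, w, c)

# (H, W, C) -> reshape -> (HW, C) -> transpose -> (C, HW) -> reshape -> (C, H, W)
chw = hwc.reshape(h * w, c).T.reshape(c, h, w)

# identical to a direct axis permutation
print(np.array_equal(chw, hwc.transpose(2, 0, 1)))  # True
```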
Yes, I'll try it later; I may have good news to tell you guys.
Rearrange the channels to match BHWC format
I don't think so, because I'm not good at the math, linear algebra, tensors and such.
@rainyl @abdelaziz-mahdy Thanks to your help, the logic of preprocess has been debugged and passed 🍾
For context, how did you handle the reshape and transpose part? Just in case someone needs to do the same.
Here is the code for BCHW to BHWC:
// upgraded to opencv_dart 1.2.0 to use cv.Mat.reshapeTo
cv.Mat _chw2hwc(cv.Mat mat) {
  final size = mat.size;
  // final b = size[0]; // B, don't need it
  final c = size[1]; // C
  final h = size[2]; // H
  final w = size[3]; // W
  // (C, H*W)
  final cHw = mat.reshapeTo(0, [c, h * w]);
  // (H*W, C); the final reshape to (H, W, C) happens outside this fn
  return cHw.transpose();
}
// ListShape with lib : tensor_dart
var input = ListShape(_chw2hwc(blobMat).data.buffer.asFloat32List())
.reshape(inputTensor.shape);
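As a cross-check, the same trick the Dart code above uses can be replayed in numpy (illustration only; the Dart code is the actual implementation):

```python
import numpy as np

b, c, h, w = 1, 3, 2, 2
bchw = np.arange(b * c * h * w, dtype=np.float32).reshape(b, c, h, w)

# mirror of the Dart helper: view as (C, H*W), transpose to (H*W, C),
# then the final reshape to (H, W, C) happens outside, as in the Dart code
hwc = bchw.reshape(c, h * w).T.reshape(h, w, c)

print(np.array_equal(hwc, bchw[0].transpose(1, 2, 0)))  # True
```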
An unrelated question: how do you run your yolo model, using tflite or pytorch, and what do you think about the performance?
My iPhone's system version is too new for my Xcode to debug, so I found a MI PAD 4 for testing. In general, the pre-processing time is about 350~400ms, the inference time about 500ms, and the post-processing time is the longest, generally over 2s, but this device's hardware performance is average. So I'm happy with that performance.
the latest test run on mi pad 4 (no quantization):
preprocessTimes: 366.783 ms, inferenceTimes: 475.897 ms, postprocessTimes: 2113.832 ms
mi pad 4 CPU : Octa-core (4x2.2 GHz Kryo 260 Gold & 4x1.8 GHz Kryo 260 Silver) RAM : 4G kernel : Android 8.1.0 OS : MIUI 10.3.2.0
I think I can close this issue now. Thank you very much.
Interested to know why the post-processing takes so much time. Did you do it in dart? Also, try release or profile mode; dart code is much faster there thanks to AOT compilation.
Except for the inference part, all the code is in dart.
I reviewed the code again: the print log used the wrong variables... (another stupid mistake 😂)
The real result is:
preprocessTimes: 418.545 ms, postprocessTimes: 565.299 ms, inferenceTimes: 2020.973 ms
Inference takes the most time, which makes sense.
What do you run inference with? OpenCV? DNN?
If it's open source, I may help you with optimizing it.
Also, are these results in profile or release mode?
What do you inference with? Opencv? Dnn?
Inference is with tensorflow lite.
If you got open source I may help you with optimizing it
It is an open source project, just for fun. If you want to help me optimize it, that's very good news for me, but this feature isn't finished yet, so I haven't pushed it to the remote. You can star my repository; when the code is updated, you can review it and help. But the repository is from a long time ago. I'm Chinese and my English isn't good enough; the comments were written in Chinese and translated into English, so they may not be easy to read.
https://github.com/einsitang/sudoku-flutter
Also are these results on profile or release mode?
These results are in debug mode; I haven't tried release or profile yet.
https://www.kaggle.com/code/jianbintangelement/sudoku-digits-detect-predict-with-yolov8
This is what I'm trying to do in my project:
use the camera to scan a picture with a sudoku puzzle in real time, recognize it, and solve it.
I have just finished the basic Sudoku detection flow. Thank you very much for your help. It's just that the device I'm testing on isn't very fast: a full model prediction takes about 3s, and I have two models, so a complete job takes at least 6s. I also tried quantizing the model to speed it up; the result was double the speed, but the accuracy dropped so much it was basically unusable, so I did not adopt the quantization scheme in the end. The quantization problem is most likely that my model is poorly trained.
The latest code is already committed and pushed; if you are interested, I would appreciate your input.
I took a look at the implementation of tflite_flutter, and I think it is not efficient enough: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L160-L176
If you pass a dart List to Interpreter.run, it will be copied at least 4 times:
- 2x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/util/byte_conversion_utils.dart#L36-L52
- 1x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L166
- 1x: https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L168
I didn't test the cost, but I think it's unacceptable.
I suggest you pass Mat.data to interpreter.run() directly; this will avoid the first 2 copies. A step further, I suggest you open an issue in tflite_flutter suggesting they provide another API such as runWithTensor(Tensor tensor) to use a tensor created by the user directly. Then you could:
final Tensor t = interpreter.getInputTensor(0);
t.data = Mat.data; // only 1 copy; this uses setRange(), which is faster than loops
final result = interpreter.runWithTensor(t);
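As a rough numpy analogy of the copy issue (this is not tflite_flutter's actual code): deep-converting a nested list touches and copies every element, while a flat typed buffer can simply be reinterpreted with the tensor shape without copying anything:

```python
import numpy as np

# converting a nested list element by element: every value is copied
nested = [[[[0.0] * 4 for _ in range(4)] for _ in range(3)]]
copied = np.array(nested, dtype=np.float32)  # shape (1, 3, 4, 4), a full copy

# a flat typed buffer viewed with the tensor shape shares memory: no copy at all
buf = np.zeros(1 * 3 * 4 * 4, dtype=np.float32)
view = buf.reshape(1, 3, 4, 4)
view[0, 0, 0, 0] = 42.0
print(buf[0])  # 42.0, the view and the buffer are the same memory
```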
@rainyl your code review is very careful. I only understand how these libraries are used; I don't know their internal implementation.
I followed the code you provided and found that it is exactly as you said.
What I'm not sure about, though, is that doing this will bypass tflite's type conversion; I don't know enough to say whether cv.Mat and tflite.Tensor are consistent at the byte level, so it could be a very unsafe operation.
However, I can try your way to test it. The code would look like:
final Tensor inputT = interpreter.getInputTensor(0);
inputT.data = Mat.data; // not sure this works
interpreter.invoke();
final Tensor outputT = interpreter.getOutputTensor(0);
......
I'll try it later and tell you the result.
What I'm not sure about, though, is that doing this will bypass tflite's type conversion; I don't know enough to say whether cv.Mat and tflite.Tensor are consistent at the byte level, so it could be a very unsafe operation.
Sure, in this way you need to be careful to ensure the type and size of the Mat match the Tensor. Mat.data is just raw pixel values; I am not sure whether the underlying data of Tensor is raw values too, but if so, this should work. https://github.com/tensorflow/flutter-tflite/blob/83a20133fc4faa0f75ae0ab3c23bbbcfac248453/lib/src/tensor.dart#L69-L79
Great job. I originally thought this part of the optimization would only give a small performance improvement, but after the change my tests found the speed was faster than I expected.
These are recent performance logs, very high quality:
preprocessTimes: 26.638 ms, postprocessTimes: 39.703 ms, inferenceTimes: 814.608 ms
preprocessTimes: 20.872 ms, postprocessTimes: 77.102 ms, inferenceTimes: 816.108 ms
Sounds good.
I looked at your profile; some of the repos you forked are entirely in Chinese, and your surname is Liu, so you are probably Chinese.... I think I can thank you in Mandarin.
Hahahaha, yes, no problem 🤣
May I ask, are you still a student, studying at Wuhan University?
Read the README carefully first. Star :star: this project if you want to ask a question; no star, no answer. Already starred 😁
Question
I use a yolov8 tflite model on flutter. The python code works well, so I'm trying to find the opencv python API cv2.dnn.blobFromImage in this package.