vladiH / flutter_vision

A Flutter plugin for managing both the YoloV5 model and Tesseract v4, accessed via TensorFlow Lite 2.x. Supports object detection, segmentation, and OCR on both iOS and Android.
https://pub.dev/packages/flutter_vision
MIT License
70 stars 30 forks

Please help me with getting coordinates on image #30

Closed mahdi-madadi closed 11 months ago

mahdi-madadi commented 11 months ago

Here is the code I use to detect text on images; afterwards I want to crop the image and display only the region detected by the yolov8 model, but it does not work properly and crops somewhere else:

yoloOnImage() async {
  yoloResults.clear();
  Uint8List byte = await imageFile!.readAsBytes();
  img.Image? originalImage = img.decodeImage(byte);

final result = await widget.vision.yoloOnImage(
    bytesList: byte,
    imageHeight: originalImage!.height,
    imageWidth: originalImage.width,
    iouThreshold: 0.2,
    confThreshold: 0.2,
    classThreshold: 0.2);
if (result.isNotEmpty) {
  // Sort the results by confidence score in descending order
  result.sort((a, b) => b["confidence"].compareTo(a["confidence"]));

  // Keep only the result with the highest confidence score
  var bestResult = result[0];
  print(bestResult);
  // Assuming you have the bounding box coordinates
  double x = bestResult["box"]
      [0]; // x coordinate of the top left corner of the bounding box
  double y = bestResult["box"]
      [1]; // y coordinate of the top left corner of the bounding box
  double w = bestResult["box"][2]; // width of the bounding box
  double h = bestResult["box"][3]; // height of the bounding box

  // Convert normalized coordinates to pixel coordinates
  int left = (x * originalImage.width).round();
  int top = (y * originalImage.height).round();
  int width = (w * originalImage.width).round();
  int height = (h * originalImage.height).round();

  img.Image cropped = img.copyCrop(
    originalImage,
    x: left,
    y: top,
    width: width,
    height: height,
  );

  // Get the temporary directory path.
  final tempDir = await getTemporaryDirectory();

  // Create a temporary file with a unique name.
  final file =
      await File('${tempDir.path}/${DateTime.now().toIso8601String()}.png')
          .create();

  // Write the bytes of the cropped image to the file.
  await file.writeAsBytes(img.encodePng(cropped));

  setState(() {
    yoloResults = [bestResult];
    imageFile = file;
  });
}

  }
}

vladiH commented 11 months ago

@mahdi-madadi Your code should work with these changes. Please visit the flutter_vision output documentation 😃 .

final result = await widget.vision.yoloOnImage(
    bytesList: byte,
    imageHeight: originalImage!.height,
    imageWidth: originalImage.width,
    iouThreshold: 0.2,
    confThreshold: 0.2,
    classThreshold: 0.2);
if (result.isNotEmpty) {
  // Sort the results by confidence score in descending order
  result.sort((a, b) => b["confidence"].compareTo(a["confidence"]));

  // Keep only the result with the highest confidence score
  var bestResult = result[0];
  print(bestResult);
  // Assuming you have the bounding box coordinates
  // "box": [x1:left, y1:top, x2:right, y2:bottom, class_confidence]
  double x1 = bestResult["box"][0];
  double y1 = bestResult["box"][1];
  double x2 = bestResult["box"][2];
  double y2 = bestResult["box"][3];

  // Convert normalized coordinates to pixel coordinates
  int left = x1;
  int top = y1;
  int width = x2 - x1;
  int height = y2 - y1;

  img.Image cropped = img.copyCrop(
    originalImage,
    x: left,
    y: top,
    width: width,
    height: height,
  );

  // Get the temporary directory path.
  final tempDir = await getTemporaryDirectory();

  // Create a temporary file with a unique name.
  final file =
      await File('${tempDir.path}/${DateTime.now().toIso8601String()}.png')
          .create();

  // Write the bytes of the cropped image to the file.
  await file.writeAsBytes(img.encodePng(cropped));

  setState(() {
    yoloResults = [bestResult];
    imageFile = file;
  });
}
mahdi-madadi commented 11 months ago

Can you please tell me how to change these normalized values into pixel values? Right now the code below turns every value into zero, because they are all less than 1, and it also throws this error: type 'double' is not a subtype of type 'int' in type cast

// Convert normalized coordinates to pixel coordinates
int left = x1;
int top = y1;
int width = x2 - x1;
int height = y2 - y1;
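For reference, converting normalized corner coordinates to pixels just scales each value by the image dimensions. A minimal sketch of that conversion (the helper name `normalizedBoxToPixels` is hypothetical, not part of flutter_vision; it assumes the box values are fractions of the image size in [0, 1]):

```dart
// Hypothetical helper: converts a normalized [x1, y1, x2, y2] box
// into integer pixel values [left, top, width, height] suitable for
// img.copyCrop. Scales x-values by the image width and y-values by
// the image height, then rounds to the nearest pixel.
List<int> normalizedBoxToPixels(
    List<double> box, int imageWidth, int imageHeight) {
  final int left = (box[0] * imageWidth).round();
  final int top = (box[1] * imageHeight).round();
  final int width = ((box[2] - box[0]) * imageWidth).round();
  final int height = ((box[3] - box[1]) * imageHeight).round();
  return [left, top, width, height];
}
```

Whether this conversion is needed at all depends on the plugin version, as discussed below: some versions already return pixel coordinates.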

vladiH commented 11 months ago

Are you using an older version of the YoloV8 model, or has a new model been generated recently? Could you share a sample output value for one of the detected boxes ("box": [x1:left, y1:top, x2:right, y2:bottom, class_confidence])? This will help me better understand the issue and provide a more precise solution.

mahdi-madadi commented 11 months ago

I am using a model I trained myself recently. It works properly on Google Colab, where I can crop the region of interest and extract the desired text within the detected region, but I cannot do the same here in Flutter. This is the output of the model: {box: [0.5, 0.635284423828125, 0.8623046875, 0.684832751750946, 0.78662109375], tag: Drug-Name}

vladiH commented 11 months ago

It appears that you are using the latest release of flutter_vision, which is aligned with the latest version of ultralytics. However, the output you're seeing is normalized.

I recommend trying flutter_vision version 1.1.3, as it is known to work well with the previous ultralytics code. Please switch to version 1.1.3, test it, and let me know how it goes. This version should provide the expected results with the older ultralytics code.
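For reference, pinning the plugin to that version in pubspec.yaml would look like this (a sketch of the relevant fragment only; the rest of your dependencies stay as-is):

```yaml
dependencies:
  flutter:
    sdk: flutter
  # Pin to the exact version recommended above.
  flutter_vision: 1.1.3
```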

Feel free to share the output and your experience after making this change. If you encounter any issues or have further questions, don't hesitate to reach out for assistance.

mahdi-madadi commented 11 months ago

thank you.

mahdi-madadi commented 11 months ago

But I am already using flutter_vision version 1.1.3

vladiH commented 11 months ago

Try 1.1.4, please.

mahdi-madadi commented 11 months ago

The worst part is that even the example code available in flutter_vision does not work properly for me. It does not give any error, but it does not draw a bounding box around the detected region even when the model detects something in the image. For example, below is the result of the yolov8 model with the code available in this GitHub repository, yet no bounding box is drawn: [{box: [0.72412109375, 0.3818572759628296, 1.02685546875, 0.4799102544784546, 0.83251953125], tag: Drug-Name}]

vladiH commented 11 months ago

What you're experiencing is indeed unusual. Even the test examples are not working as expected, and with flutter_vision 1.1.4 it should be able to run the model generated by Ultralytics. It's quite perplexing. I'll try running it in a local environment to see if I can reproduce the issue.

mahdi-madadi commented 11 months ago

thank you

mahdi-madadi commented 11 months ago

I would be grateful if you could update the code below so that, after detecting the desired region on the image, it crops the image and displays the crop instead of the original image:

yoloOnImage() async {
  yoloResults.clear();
  Uint8List byte = await imageFile!.readAsBytes();
  final image = await decodeImageFromList(byte);
  imageHeight = image.height;
  imageWidth = image.width;
  final result = await widget.vision.yoloOnImage(
      bytesList: byte,
      imageHeight: image.height,
      imageWidth: image.width,
      iouThreshold: 0.8,
      confThreshold: 0.4,
      classThreshold: 0.5);
  if (result.isNotEmpty) {
    setState(() {
      yoloResults = result;
    });
  }
}

vladiH commented 11 months ago

I've just generated a new model using Ultralytics, and I'm using flutter_vision version 1.1.4, and everything seems to be working correctly on my end. It appears that something might be amiss in your setup. You may want to try generating a new model with the yolov8 colab to see if that resolves the issue: https://github.com/ultralytics/ultralytics

1)

[Screenshot: 2023-09-24 194704]

2)

[Screenshot: 2023-09-24 194747]

3) Use the generated model

[Screenshot: 2023-09-24 195536]

4) Result

[Screenshot: 2023-09-24 195536]
mahdi-madadi commented 11 months ago

Hey, I have solved the issue with drawing the bounding box, but I couldn't solve the problem with cropping the detected region. Here is the code I wrote to crop the detected region so that I can display the cropped image on screen. Although the code detects the region of interest properly and draws a bounding box around it, it cannot crop it properly. Please check my code and fix the issue:

List displayCroppedImages() {
  if (yoloResults.isEmpty) return [];

double factorX = screenWidth / (imageWidth);
double imgRatio = imageWidth / imageHeight;
double newWidth = imageWidth * factorX;
double newHeight = newWidth / imgRatio;
double factorY = newHeight / (imageHeight);

double pady = (screenHeight - newHeight) / 2;

// Create a copy of the original image
img.Image originalImage = img.decodeImage(imageFile!.readAsBytesSync())!;

yoloResults.forEach((result) async {
  double left = result["box"][0] * factorX;
  double top = result["box"][1] * factorY + pady;
  double width = (result["box"][2] - result["box"][0]) * factorX;
  double height = (result["box"][3] - result["box"][1]) * factorY;

  // Crop the detected region from the original image
  img.Image croppedImage = img.copyCrop(
    originalImage,
    x: left.toInt(),
    y: top.toInt(),
    width: width.toInt(),
    height: height.toInt(),
  );

  // Get the temporary directory path.
  final tempDir = await getTemporaryDirectory();

  // Create a temporary file with a unique name.
  final file =
      await File('${tempDir.path}/${DateTime.now().toIso8601String()}.png')
          .create();

  // Write the bytes of the cropped image to the file.
  await file.writeAsBytes(img.encodePng(croppedImage));

  setState(() {
    imageFile = file;
  });
});

return [];

}


vladiH commented 11 months ago

Hi @mahdi-madadi!!,

We discussed this in a previous comment. To conclude this thread, I will provide you with some functional code; you only need to use it in your own code.

import 'dart:io';
import 'dart:typed_data';
import 'dart:ui';

import 'package:flutter/material.dart';
import 'package:flutter_speed_dial/flutter_speed_dial.dart';
import 'dart:async';
import 'package:flutter_vision/flutter_vision.dart';
import 'package:image_picker/image_picker.dart';

import 'package:image/image.dart' as cimage;

enum Options { none, imagev5, imagev8, imagev8seg, frame, tesseract, vision }

main() async {
  WidgetsFlutterBinding.ensureInitialized();
  DartPluginRegistrant.ensureInitialized();
  runApp(
    const MaterialApp(
      home: MyApp(),
    ),
  );
}

class MyApp extends StatefulWidget {
  const MyApp({Key? key}) : super(key: key);

  @override
  State<MyApp> createState() => _MyAppState();
}

class _MyAppState extends State<MyApp> {
  late FlutterVision vision;
  Options option = Options.none;
  @override
  void initState() {
    super.initState();
    vision = FlutterVision();
  }

  @override
  void dispose() async {
    super.dispose();
    await vision.closeTesseractModel();
    await vision.closeYoloModel();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: task(option),
      floatingActionButton: SpeedDial(
        //margin bottom
        icon: Icons.menu, //icon on Floating action button
        activeIcon: Icons.close, //icon when menu is expanded on button
        backgroundColor: Colors.black12, //background color of button
        foregroundColor: Colors.white, //font color, icon color in button
        activeBackgroundColor:
            Colors.deepPurpleAccent, //background color when menu is expanded
        activeForegroundColor: Colors.white,
        visible: true,
        closeManually: false,
        curve: Curves.bounceIn,
        overlayColor: Colors.black,
        overlayOpacity: 0.5,
        buttonSize: const Size(56.0, 56.0),
        children: [
          SpeedDialChild(
            //speed dial child
            child: const Icon(Icons.video_call),
            backgroundColor: Colors.red,
            foregroundColor: Colors.white,
            label: 'Yolo on Frame',
            labelStyle: const TextStyle(fontSize: 18.0),
            onTap: () {
              setState(() {
                option = Options.imagev8;
              });
            },
          ),
        ],
      ),
    );
  }

  Widget task(Options option) {
    if (option == Options.imagev8) {
      return YoloImageV8(vision: vision);
    }
    return const Center(child: Text("Choose Task"));
  }
}

class YoloImageV8 extends StatefulWidget {
  final FlutterVision vision;
  const YoloImageV8({Key? key, required this.vision}) : super(key: key);

  @override
  State<YoloImageV8> createState() => _YoloImageV8State();
}

class _YoloImageV8State extends State<YoloImageV8> {
  late List<Map<String, dynamic>> yoloResults;
  File? imageFile;
  int imageHeight = 1;
  int imageWidth = 1;
  bool isLoaded = false;
  late Uint8List byte;
  @override
  void initState() {
    super.initState();
    loadYoloModel().then((value) {
      setState(() {
        yoloResults = [];
        isLoaded = true;
      });
    });
  }

  @override
  void dispose() async {
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    final Size size = MediaQuery.of(context).size;
    if (!isLoaded) {
      return const Scaffold(
        body: Center(
          child: Text("Model not loaded, waiting for it"),
        ),
      );
    }
    return Center(
      child: SingleChildScrollView(
        child: Column(
          children: [
            imageFile != null ? Image.file(imageFile!) : const SizedBox(),
            Align(
              alignment: Alignment.bottomCenter,
              child: Row(
                mainAxisAlignment: MainAxisAlignment.center,
                children: [
                  TextButton(
                    onPressed: pickImage,
                    child: const Text("Pick an image"),
                  ),
                  ElevatedButton(
                    onPressed: yoloOnImage,
                    child: const Text("Detect"),
                  )
                ],
              ),
            ),
            ...displayBoxesAroundRecognizedObjects(size),
          ],
        ),
      ),
    );
  }

  Future<void> loadYoloModel() async {
    await widget.vision.loadYoloModel(
        labels: 'assets/labels.txt',
        modelPath: 'assets/yolov8n.tflite',
        modelVersion: "yolov8",
        quantization: false,
        numThreads: 2,
        useGpu: true);
    setState(() {
      isLoaded = true;
    });
  }

  Future<void> pickImage() async {
    final ImagePicker picker = ImagePicker();
    // Capture a photo
    final XFile? photo = await picker.pickImage(source: ImageSource.gallery);
    if (photo != null) {
      setState(() {
        imageFile = File(photo.path);
      });
    }
  }

  yoloOnImage() async {
    yoloResults.clear();
    byte = await imageFile!.readAsBytes();
    final image = await decodeImageFromList(byte);
    imageHeight = image.height;
    imageWidth = image.width;
    final result = await widget.vision.yoloOnImage(
        bytesList: byte,
        imageHeight: image.height,
        imageWidth: image.width,
        iouThreshold: 0.8,
        confThreshold: 0.4,
        classThreshold: 0.5);
    if (result.isNotEmpty) {
      setState(() {
        yoloResults = result;
      });
    }
  }

  List<Widget> displayBoxesAroundRecognizedObjects(Size screen) {
    if (yoloResults.isEmpty) return [];
    return yoloResults.map((result) {
      return Image.memory(cropImage(
          byte,
          result["box"][0].toInt(),
          result["box"][1].toInt(),
          (result["box"][2] - result["box"][0]).toInt(),
          (result["box"][3] - result["box"][1]).toInt()));
    }).toList();
  }

// This is the most important part: crop the detected box out of the image bytes
  Uint8List cropImage(
      Uint8List imageBytes, int x, int y, int width, int height) {
    cimage.Image image = cimage.decodeImage(imageBytes)!;
    cimage.Image croppedImage =
        cimage.copyCrop(image, x: x, y: y, width: width, height: height);
    return Uint8List.fromList(cimage.encodePng(croppedImage));
  }
}
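One caveat worth noting about the crop above: detector boxes can extend slightly past the image edge (one of the sample outputs earlier in this thread has a coordinate beyond the image bound), and passing out-of-range values to copyCrop can give surprising results. A minimal sketch of a defensive clamp (the helper name `clampCropRect` is an assumption, not part of flutter_vision or package:image):

```dart
import 'dart:math';

// Hypothetical helper: clips a requested crop rectangle
// [x, y, width, height] to an imageWidth x imageHeight canvas,
// returning [left, top, width, height] that is guaranteed to lie
// fully inside the image (dimensions never go negative).
List<int> clampCropRect(
    int x, int y, int width, int height, int imageWidth, int imageHeight) {
  final int left = max(0, x);
  final int top = max(0, y);
  final int right = min(imageWidth, x + width);
  final int bottom = min(imageHeight, y + height);
  return [left, top, max(0, right - left), max(0, bottom - top)];
}
```

The clamped rectangle can then be fed to the crop from the code above, e.g. cimage.copyCrop(image, x: rect[0], y: rect[1], width: rect[2], height: rect[3]).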