ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Yolov5n on TFServing #11584

Closed JrGxllxgo closed 1 year ago

JrGxllxgo commented 1 year ago

Search before asking

Question

Hello, I trained a model for object detection with a custom dataset on the Ultralytics Google Colab. I exported the model in saved_model format to upload to my TFServing Docker container, but when I make a request to the container, the response I get is an array of 25000+ lines. It looks like a numpy array, but I'm not sure.

It is supposed to return an array with the information about what the model found.

What can I do to get that array instead of this huge one? Is there any problem with uploading the yolov5n custom model to TensorFlow Serving?

Additional

[Screenshot 2023-05-24 091119]

github-actions[bot] commented 1 year ago

👋 Hello @JrGxllxgo, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics

glenn-jocher commented 1 year ago

@JrGxllxgo hello!

It sounds like you're experiencing some issues getting your yolov5n custom model to work with TensorFlow Serving. The large array you're seeing might be related to how the model is processing the data. To help you better, could you provide more details on how you are making the request to the TFServing Docker container, as well as the payload (input data) you are sending?
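For reference, a typical TFServing REST predict request looks roughly like the sketch below (the model name yolov5n, the default REST port 8501 and the 640x640 input size are assumptions; adjust them to your setup):

import numpy as np
import requests

# Hypothetical endpoint: model "yolov5n" served on TFServing's default REST port 8501
URL = "http://localhost:8501/v1/models/yolov5n:predict"
# One 640x640 RGB image, float32 in [0, 1], which a YOLOv5 TF export typically expects
img = np.zeros((1, 640, 640, 3), dtype=np.float32)
resp = requests.post(URL, json={"instances": img.tolist()})
resp.raise_for_status()
predictions = np.array(resp.json()["predictions"])
print(predictions.shape)  # e.g. (1, 25200, 85) for a 640x640 input and 80 classes

Sharing something along these lines, plus the response you get back, would make it much easier to pinpoint where the large array is coming from.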

Also, which version of the Ultralytics YOLOv5 were you using for training the model?

Please let us know so we can provide further assistance.

JrGxllxgo commented 1 year ago

@glenn-jocher thanks for helping me. I'll provide the information; I hope you find it useful. All code, environments, etc. are from the Ultralytics Google Colab.

[Screenshots: train_output, export_command, 1st step, train_command]

glenn-jocher commented 1 year ago

Hi @JrGxllxgo, thank you for providing all of that helpful information. It looks like everything is set up correctly, and it's hard to say for certain without more information, but I may have a few potential solutions.

Firstly, it may be that the large numpy array you're seeing is the correct output from the model. Specifically, this could be what is known as a raw output of the detection. Depending on what you are trying to do, it may be necessary to do some additional post-processing and filtering of this output to get the results you're after.
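For example, a minimal post-processing sketch could look like the one below (assumptions: the response reshaped to a (25200, 5 + num_classes) array of [cx, cy, w, h, objectness, class scores] rows, a 0.25 confidence threshold, and a simple class-agnostic NMS; YOLOv5's own non_max_suppression utility handles this more completely):

import numpy as np

def postprocess(pred, conf_thres=0.25, iou_thres=0.45):
    # pred: (25200, 5 + num_classes) array of [cx, cy, w, h, objectness, class scores...]
    pred = pred[pred[:, 4] > conf_thres]  # drop low-objectness rows
    if not len(pred):
        return np.empty((0, 6))
    scores = pred[:, 4:5] * pred[:, 5:]  # objectness * class probability
    class_ids = scores.argmax(1)
    conf = scores.max(1)
    # xywh (center) -> xyxy (corners)
    boxes = np.empty_like(pred[:, :4])
    boxes[:, 0] = pred[:, 0] - pred[:, 2] / 2
    boxes[:, 1] = pred[:, 1] - pred[:, 3] / 2
    boxes[:, 2] = pred[:, 0] + pred[:, 2] / 2
    boxes[:, 3] = pred[:, 1] + pred[:, 3] / 2
    keep = conf > conf_thres
    boxes, conf, class_ids = boxes[keep], conf[keep], class_ids[keep]
    # simple greedy, class-agnostic NMS
    order = conf.argsort()[::-1]
    kept = []
    while order.size:
        i = order[0]
        kept.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter + 1e-7)
        order = order[1:][iou < iou_thres]
    # rows of [x1, y1, x2, y2, confidence, class_id]
    return np.concatenate([boxes[kept], conf[kept].reshape(-1, 1), class_ids[kept].reshape(-1, 1)], axis=1)

Whether the boxes come back in pixels or in normalized 0-1 coordinates depends on how the model was exported, so check the value ranges before drawing them.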

Another potential solution could be to check if the model was saved successfully as a SavedModel format and is compatible with TFServing. You can verify this by checking if you can load the model back and run inference on it in your local environment.
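A minimal sketch of that local check, assuming TensorFlow is installed and the export directory is named best_saved_model (both the path and the input shape mentioned below are assumptions):

import tensorflow as tf

model = tf.saved_model.load("best_saved_model")  # path is an assumption; use your export directory
infer = model.signatures["serving_default"]
print(infer.structured_input_signature)  # expected input name, shape and dtype
print(infer.structured_outputs)          # expected output name, shape and dtype
# then call infer(...) with a dummy (1, 640, 640, 3) float32 tensor, using the input name printed above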

Finally, if none of the solutions work, I'd recommend checking out the TFServing documentation and seeing if there are any potential issues specific to that part of your workflow. Hope this helps, let me know if you have any further questions!

JrGxllxgo commented 1 year ago

Hi @glenn-jocher , thks you for giving me some solutions.

If I get this result when I make a request, does that mean the model loaded correctly? [screenshot]

This morning I'll try to do some post-processing, or look into whether the TFServing Docker container needs a different build for object detection.

glenn-jocher commented 1 year ago

Hi @JrGxllxgo, you're welcome! Glad to be of help.

Regarding your question, yes, it seems like the model has loaded correctly. The output you are seeing could be the raw output of the detection, and it may require further post-processing and filtering to get the results you're after.
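If you want an explicit confirmation that the model is loaded, TFServing also exposes a model-status endpoint; a minimal sketch (the model name yolov5n and port 8501 are assumptions):

import requests

status = requests.get("http://localhost:8501/v1/models/yolov5n")  # TFServing model-status endpoint
print(status.json())  # each version should report state "AVAILABLE" once the model has loaded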

I hope you're able to make some progress with implementing the post-processing techniques. If you continue to have issues, it may be worth looking into whether the Docker container with TFServing requires a different build for object detection.

Best of luck, and feel free to reach out if you have any more questions.

JrGxllxgo commented 1 year ago

Hi @glenn-jocher again!

Which parameters should I change during training to get the typical JSON with the predictions, confidence scores and coordinates?

I think it could help resolve the problem, or offer another way to approach it, because I post-processed the output and it is 25200x1.

glenn-jocher commented 1 year ago

Hi @JrGxllxgo, great question! To get the typical JSON output with the predictions including accuracy and coordinates, you will need to enable the --save-json flag during training and export. Here's an example training command to include this flag:

python train.py --img 640 --batch 16 --epochs 30 --data data.yaml --cfg models/yolov5n.yaml --weights '' --save-json --cache

Additionally, during export, you'll need to include the --dynamic flag to ensure that the model is exported with dynamic shapes that support variable batch sizes. Here's an example export command:

python export.py --weights runs/train/exp/weights/best.pt --include=trt --dynamic --verbose

Please note that you'll need to specify your own YAML file in the --data flag and YOLOv5 model configuration in the --cfg flag based on your specific project.

I hope this helps! Let me know if you have any other questions.

JrGxllxgo commented 1 year ago

Hi @glenn-jocher thks for that option!

If I want to train locally with Docker but I can't use an NVIDIA GPU, only CPU, which Docker image and notebook do you recommend?

glenn-jocher commented 1 year ago

Hello @JrGxllxgo, you're welcome!

If you want to train the YOLOv5 model locally on your CPU using Docker, you can use the ultralytics/yolov5 Docker image. Here's an example Docker command to run the image:

docker run --gpus all -it --rm -v "$(pwd)":/app ultralytics/yolov5:latest python train.py --img 640 --batch 16 --epochs 50 --data /app/data.yaml --cfg /app/models/yolov5s.yaml --weights ''

However, since you do not have an NVIDIA GPU, you'll need to modify the Docker command to use the CPU only. Here's an example of how to modify the command to run YOLOv5 on CPU only:

docker run -it --rm -v "$(pwd)":/app ultralytics/yolov5:latest python train.py --img 640 --batch 16 --epochs 50 --data /app/data.yaml --cfg /app/models/yolov5s.yaml --weights '' --device cpu

This will train the model with YOLOv5s configuration for 50 epochs on CPU only. Please note that without a GPU, the training may take significantly longer, so you may want to experiment with smaller batch sizes and fewer epochs to keep the training time manageable.

I hope this helps! Let me know if you have any other questions.

JrGxllxgo commented 1 year ago

Hi @glenn-jocher let's try this.

The Docker image should have Python>=3.7 and PyTorch>=1.7, shouldn't it?

I'll ask you one last question (I hope): can all YOLO models be exported in saved_model format? Yesterday I saw on the Ultralytics HUB website that the YOLO models have a section showing how to export them, and not all of them could be exported as saved_model.

Thanks for everything!!

[screenshot]

glenn-jocher commented 1 year ago

Hi @JrGxllxgo,

Yes, you are correct. The Docker image used for training with YOLOv5 should have Python version >= 3.7 and PyTorch version >= 1.7. You can find more information about the requirements on the YOLOv5 documentation.

Regarding your question about exporting YOLO models to the SavedModel format, not all YOLO models can be exported in the SavedModel format. As of now, only YOLOv5 models can be exported in the SavedModel format using the export.py script provided in the official Ultralytics YOLOv5 repository. However, other models can be exported in different formats like the ONNX format, which can be loaded and used in different frameworks.

I hope this helps! Don't hesitate to ask if you have any further questions.

JrGxllxgo commented 1 year ago

Hi @glenn-jocher! I started the training locally with Docker, thanks a lot!

Is this error caused by training without a GPU?

[screenshot]

JrGxllxgo commented 1 year ago

Hi @glenn-jocher! Finally I'm running it in an Anaconda environment with all the requirements, PyTorch and Python (on CPU).

I got this error with the command

[screenshot]

glenn-jocher commented 1 year ago

@JrGxllxgo hello! It's great to hear that you're making progress with your training. Regarding the error you're encountering, it seems like the error message is indicating that it cannot import the COCOEvaluator class from the detectron2.evaluation package.

This can be caused if you are using an incorrect version of the Detectron2 library, which may not have included this class. One potential solution is to try reinstalling the Detectron2 library and ensuring that you are using a version that includes this class.

You can try running this command to reinstall the correct version of Detectron2:

pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.9/index.html

This command installs the CPU version of Detectron2 version 0.5 (which is compatible with PyTorch 1.9), and may resolve the issue you are encountering.

If this doesn't work, please let me know, and we can explore other troubleshooting steps.

JrGxllxgo commented 1 year ago

@glenn-jocher hello!

I tried to run the command and here is the error I get

[screenshot]

And this is the PyTorch version I have:

[screenshot]

glenn-jocher commented 1 year ago

@JrGxllxgo hello there!

Thanks for reaching out to us.

The error seems to be caused by a version mismatch between PyTorch and Detectron2. To clarify, which version of PyTorch and which version of Detectron2 are you currently using?

Also, can you please try running the following command to install Detectron2 and its dependencies, and see if this resolves the issue?

pip install -U 'git+https://github.com/facebookresearch/fvcore'
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
pip install -U detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.7/index.html

Once you have installed Detectron2, please try running the command again and let us know if it works.

Thank you and please let us know if you have any additional questions or concerns.

JrGxllxgo commented 1 year ago

Hi @glenn-jocher

I ran your commands but I get these errors:

[screenshot]

So I tried another method and I got errors: [screenshot]

Could running it in an Anaconda env be a problem?

JrGxllxgo commented 1 year ago

Hi @glenn-jocher again!

I have tried running the same commands in the Anaconda PowerShell as admin and these are the results I get:

pip install -U 'git+https://github.com/facebookresearch/fvcore'
[screenshot]

pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
[screenshots]

pip install -U detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.7/index.html
[screenshot]

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the documentation at https://docs.ultralytics.com.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

glenn-jocher commented 11 months ago

Hi @JrGxllxgo,

It looks like there are several issues when installing the required packages for Detectron2. Working within an Anaconda environment should not be a problem, but it seems that the installation commands are not completing successfully.

Let's try a different approach. First, remove any previously installed versions of Detectron2 and its dependencies, and then try the following commands:

1. Install fvcore:

   pip uninstall -y fvcore
   pip install fvcore==0.1.5.post20210605

2. Install pycocotools:

   pip uninstall -y pycocotools
   pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

3. Install detectron2:

   pip uninstall -y detectron2
   pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.7/index.html

Please run these commands one by one and carefully observe any error messages that may occur. If you encounter any issues, please provide the error messages so that we can further assist you.

Thank you for your patience, and I'm here to help with any further questions or concerns you may have.