Closed samygarg closed 3 years ago
👋 Hello @samygarg, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Python 3.8 or later with all requirements.txt dependencies installed, including `torch>=1.7`. To install run:
$ pip install -r requirements.txt
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.
I have the same issue.
The environment I have is:
OS: Ubuntu 18.04. Packages (installed via pip install -r requirements.txt): torch==1.8.1, torchvision==0.9.1, coremltools==4.1, onnx==1.9.0, scikit-learn==0.19.2
I also get the same issue.
The environment I have is:
OS: Ubuntu 20.04. Packages (installed via pip install -r requirements.txt): torch==1.8.1, torchvision==0.9.1, coremltools==4.1, onnx==1.9.0, scikit-learn==0.19.2
Hey everyone, it appears that `optimize_for_mobile` from torch is what causes the incompatibility issue with coremltools. The solution is to comment out the line before export. Ideally it should be an arg option, pull request anyone? https://github.com/ultralytics/yolov5/blob/33712d6dd0cc54e28b97d56cb999aa050a1c94ef/models/export.py#L72
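For illustration, here is a minimal, self-contained sketch of what the workaround amounts to. The `Tiny` module and tensor shapes are made up for this example; in the real export.py the traced model is YOLOv5 itself, and the exact line to comment out is the one linked above:

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile  # the problematic pass

# Stand-in model for illustration; in export.py the traced model is YOLOv5
class Tiny(torch.nn.Module):
    def forward(self, x):
        return x * 2.0

model = Tiny().eval()
img = torch.zeros(1, 3, 8, 8)      # example input used for tracing
ts = torch.jit.trace(model, img)   # TorchScript trace, as in export.py
# ts = optimize_for_mobile(ts)     # <-- the line to comment out: this pass is
#                                  #     what coremltools fails to digest
out = ts(img + 1.0)                # the plain trace still runs fine
```

The traced model can then be handed to the CoreML conversion step unmodified.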
Thanks @JorgeCeja, that solution worked for me.
@JorgeCeja It solved the original issue but it still doesn't work. Tried on Colab as well as a MacBook Pro.
Here's what I am getting:
CoreML: starting export with coremltools 4.1...
Tuple detected at graph output. This will be flattened in the converted model.
Converting graph.
Adding op '1' of type const
Adding op '2' of type const
...
Converting op 728 : sub
Adding op '728' of type sub
Converting op 729 : add
Adding op '729' of type add
Converting op 730 : select
Converting Frontend ==> MIL Ops: 78% 545/695 [00:00<00:00, 933.85 ops/s]
CoreML: export failure:
Thanks @JorgeCeja, the solution you provided worked for me.
@JorgeCeja Can you share your environment? This change fixed the original issue, but I still face the same issue as @samygarg
It seems CoreML export is currently broken in multiple ways. The export did work with the above change until very recently, but only when not specifying `--grid`, which meant that the Detect module did not get exported. When trying to export with `--grid` you would get the same export failure at op 730.
Commit b292837 from issue #2982 (from May 3rd) changed the export implementation to export the Detect module by default.
After trying many different previous commits (and different PyTorch versions) today, my impression is that exporting the whole network including the Detect module to CoreML probably never worked. If anyone knows of a version/commit (and environment) where it did work, I would love to know.
I have the same issue. My environment:
OS: Ubuntu 16.04. Packages (installed via pip install -r requirements.txt): torch==1.8.1, torchvision==0.9.1, coremltools==4.1, onnx==1.9.0, scikit-learn==0.19.2
Error message:
……
Converting op 725 : constant
Adding op '725' of type const
Converting op 726 : mul
Adding op '726' of type mul
Converting op 727 : constant
Adding op '727' of type const
Converting op 728 : sub
Adding op '728' of type sub
Converting op 729 : add
Adding op '729' of type add
Converting op 730 : select
Converting Frontend ==> MIL Ops: 87%|███▍| 604/695 [00:00<00:00, 1149.09 ops/s]
CoreML: export failure:
Hope somebody can give advice, thanks!
@meng1994412 @haynec @JorgeCeja @samygarg good news 😃! Your original issue may now be fixed ✅ in PR #3055. Note that this does not solve CoreML export completely, but it should resolve the original error message in this issue.
To receive this update you can:
* `git pull` from within your `yolov5/` directory
* `git clone https://github.com/ultralytics/yolov5` again
* Force-reload [PyTorch Hub](https://pytorch.org/hub/ultralytics_yolov5/): `model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)`
* View our updated notebooks: [Open In Colab](https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb) [Open In Kaggle](https://www.kaggle.com/models/ultralytics/yolov5)
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
@glenn-jocher Using the current (hotfixed) version, this issue still happens for me.
OS: Ubuntu 20.04. Packages (installed via pip install -r requirements.txt): torch==1.8.1, torchvision==0.9.1, coremltools==4.1, onnx==1.9.0, scikit-learn==0.19.2
Adding op '724' of type slice_by_index
Adding op '724_begin_0' of type const
Adding op '724_end_0' of type const
Adding op '724_end_mask_0' of type const
Converting op 725 : constant
Adding op '725' of type const
Converting op 726 : mul
Adding op '726' of type mul
Converting op 727 : constant
Adding op '727' of type const
Converting op 728 : sub
Adding op '728' of type sub
Converting op 729 : add
Adding op '729' of type add
Converting op 730 : select
Converting Frontend ==> MIL Ops: 87%|████▎| 604/695 [00:00<00:00, 970.22 ops/s]
CoreML: export failure:
@jedikim He mentioned that this does not (yet?) fix CoreML export, it only fixes the particular issue reported in this bug report (the first post at the top).
I tried it again after updating, but it gives the same error as yesterday, so as @glenn-jocher mentioned above: this does not solve CoreML export completely.
Here are my two cents on this:
You can check out a previous commit such as 33712d6dd0cc54e28b97d56cb999aa050a1c94ef and comment out the line
https://github.com/ultralytics/yolov5/blob/33712d6dd0cc54e28b97d56cb999aa050a1c94ef/models/export.py#L72
as they said above. However, as @pocketpixels said, this will not export the complete model. Instead the outputs will be the `nl` outputs given by:
https://github.com/ultralytics/yolov5/blob/33712d6dd0cc54e28b97d56cb999aa050a1c94ef/models/yolo.py#L48
This means you have to do the grid scaling operations on the CoreML side, and concatenate the `nl` results to obtain an [n_anchors x (nc+5)] matrix.
Then you will need to adapt this to the input format of the Non-Maximum Suppression layer: the `boxes_scores` and `boxes_coords` arrays.
I'm pretty new to using the CoreML builder. So far I'm using this as my guideline: https://github.com/hollance/coreml-survival-guide/blob/master/MobileNetV2%2BSSDLite/ssdlite.py
If anyone knows how to do it and could post the complete solution it would be great. Otherwise, I'll be working on that, and once I finish (if I do) I'll post it here.
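To make the grid scaling concrete, here is a NumPy sketch of decoding one head scale. The xy/wh formulas mirror the lines in yolo.py, but the function name, argument layout, and shapes are assumptions for illustration only:

```python
import numpy as np

def decode_head(raw, anchors, stride, nc=80):
    """Decode one raw YOLOv5 head output (sketch, not the official code).

    raw:     (na, ny, nx, nc + 5) pre-sigmoid head outputs for one scale
    anchors: (na, 2) anchor sizes in pixels for this scale
    stride:  scalar stride of this scale (8, 16 or 32)
    Returns  (na * ny * nx, nc + 5) rows of [x, y, w, h, obj, cls...]
    """
    na, ny, nx, _ = raw.shape
    y = 1.0 / (1.0 + np.exp(-raw))                 # sigmoid everything
    # cell grid of (x, y) indices, broadcastable to (na, ny, nx, 2)
    gx, gy = np.meshgrid(np.arange(nx), np.arange(ny))
    grid = np.stack((gx, gy), axis=-1)[None]       # (1, ny, nx, 2)
    xy = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride           # center, pixels
    wh = (y[..., 2:4] * 2.0) ** 2 * anchors[:, None, None, :]  # size, pixels
    out = np.concatenate((xy, wh, y[..., 4:]), axis=-1)
    return out.reshape(-1, nc + 5)
```

Concatenating the `nl` decoded scales then yields the [n_anchors x (nc+5)] matrix described above, ready to be split into the score and coordinate arrays the NMS layer expects.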
@meng1994412 @haynec @JorgeCeja @samygarg @glemarivero good news 😃! Outstanding CoreML export issues may now be fixed ✅ in a second PR #3066. This adds a `--train` option suitable for CoreML model export which exports the model in `.train()` mode rather than `.eval()` mode, avoiding the grid construction code that causes CoreML export to fail:
python models/export.py --train
All batchnorm fusion ops have already occurred at the new `model.train()` point, so the only difference should be in the Detect layer.
To receive this update you can:
* `git pull` from within your `yolov5/` directory
* `git clone https://github.com/ultralytics/yolov5` again
* Force-reload [PyTorch Hub](https://pytorch.org/hub/ultralytics_yolov5/): `model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)`
* View our updated notebooks: [Open In Colab](https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb) [Open In Kaggle](https://www.kaggle.com/models/ultralytics/yolov5)
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
Yeah!!! I ran python models/export.py --train --weights yolov5s.pt --img 640 --batch 1, and it works with no errors!!! Thanks a lot!
Thanks for adding the `--train` option. But we still can't use the CoreML model for inference, right? Or am I missing something?
@glemarivero yes the exported model can be used for any purpose.
I meant that we still need to do what I put earlier. Aren't the outputs of the model still `714`, `727` and `740`?
Thanks
I agree with @pocketpixels and @glemarivero. The CoreML model currently exported (with the latest update) does not contain the Detect module, and thus cannot be directly used for inference.
Will the grid construction be included in CoreML export in the future update?
It definitely would be desirable to have the detect module included in the CoreML output. And if and when we can get that to work it might also be worthwhile to add a CoreML NMS layer to the generated CoreML model (as discussed by @glemarivero).
@glenn-jocher Do you happen to know which part of the Detect implementation the CoreML converter chokes on? Maybe it would be possible to find a workaround by reformulating one of the PyTorch operations involved?
I looked into what is causing the export failure a bit.
What I found so far is that it is related to `self.stride` and `self.anchor_grid` in the box calculations here:
https://github.com/ultralytics/yolov5/blob/d2a17289c99ad45cb901ea81db5932fa0ca9b711/models/yolo.py#L55-L61
If we comment out or remove those from the calculations then the CoreML conversion runs to completion (accessing and using `self.grid` in those calculations seems to be fine).
I have not yet figured out though why these are causing problems. With `anchor_grid` I initially suspected it could be that the tensor rank is higher than CoreML can handle. However, `stride` is just a vector of 3 floats. It gets set outside of the module's `__init__`; maybe that could be causing the issue somehow?
I'll look into this more later, but thought I'd share what I found so far in case someone else (who is maybe more experienced with PyTorch & CoreML) has ideas and/or wants to investigate further.
Hi, I was able to put everything together. Take a look at this notebook: example_yolov5s_to_coreml.ipynb.zip. Please let me know if you find any errors. It is only for the yolov5 small version, but it shouldn't be difficult to adapt it to the others. Hope it's useful for the rest of you 🙂
@glemarivero Fantastic work, thank you for sharing!
Continuing my investigation into the cause of the error during the CoreML export of the Detect module:
Just focusing on the `--inplace` branch in the code cited above, so these two lines:
https://github.com/ultralytics/yolov5/blob/d2a17289c99ad45cb901ea81db5932fa0ca9b711/models/yolo.py#L56-L57
With these modifications the CoreML conversion completes without errors:
s = self.stride[i].item()
ag = self.anchor_grid[i].numpy()
y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * s # xy
y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * ag # wh
That is, if we force PyTorch to treat `stride` and `anchor_grid` as constants and forget how they were computed (which I believe should be fine, because they are not input dependent?), then the CoreML converter has no issues.
(I have not tried running the resulting model on iOS yet).
Clearly the above change is not the solution (as I believe it would impact inference performance), but maybe it is a good hint at what a better solution might be (for someone like @glenn-jocher who understands the code base and PyTorch better than I do)?
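For what it's worth, the reason `.item()` and `.numpy()` help is that they detach the values from the tensor graph, so the tracer bakes them into the trace as literal constants. A tiny illustration (the `Scale` module here is made up, not the real Detect):

```python
import torch

class Scale(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # mimics Detect.stride being a tensor attribute on the module
        self.stride = torch.tensor([8.0, 16.0, 32.0])

    def forward(self, x):
        s = self.stride[0].item()  # plain Python float: traced as a constant
        return x * s

# tracing records the literal 8.0 instead of a read from self.stride
# (PyTorch emits a TracerWarning about the tensor-to-float conversion)
ts = torch.jit.trace(Scale(), torch.ones(2))
```

A converter that cannot handle how the attribute was originally produced then only ever sees the constant.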
Update: While the conversion completes, looking at the resulting graph in Netron I don't think it actually includes the box coordinate computations.
Update 2: Converting without `--inplace` and making the equivalent changes to that branch of the code does result in a model that seems to include the box coordinate computations.
s = self.stride[i].item()
ag = self.anchor_grid[i].view(1, self.na, 1, 1, 2).numpy()
xy = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * s # xy
wh = (y[..., 2:4] * 2) ** 2 * ag # wh
y = torch.cat((xy, wh, y[..., 4:]), -1)
@meng1994412 @glemarivero @pocketpixels to clarify, all modules including the Detect() layer are exported by export.py; no modules are missing. The `--train` flag simply places the model in `model.train()` mode, which allows the Detect() layer to sidestep the grid and concatenation ops.
https://github.com/ultralytics/yolov5/blob/251aeafcb16ebc4c9d9a6641b3677aaac2f2d2cb/models/export.py#L57-L58
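Schematically, the two modes differ as in this simplified sketch (illustrative only, not the actual yolo.py code; the placeholder flatten stands in for the full grid decode):

```python
import numpy as np

def detect_forward(xs, training):
    """Sketch of the Detect layer's two modes (illustrative only).

    xs: list of raw per-scale head outputs, each of shape (na, ny, nx, no)
    """
    if training:
        # model.train() / --train export: raw maps pass straight through,
        # so no grid construction or concatenation gets traced
        return xs
    # model.eval(): decode + flatten + concatenate across scales; this is
    # the section whose trace the CoreML converter fails on
    flat = [x.reshape(-1, x.shape[-1]) for x in xs]  # placeholder for decoding
    return np.concatenate(flat, axis=0), xs
```

So the exported graph still contains the Detect convolutions; only the eval-mode post-processing branch is skipped.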
But if you do, how do you continue? How do you get the final bounding boxes?
In case anyone is interested, I put together a script to output a CoreML .mlmodel that can be opened with Xcode (the previous model couldn't be) and used to preview inference results inside it. Again, I only did it for yolov5s.
python models/export.py --train
Thanks for sharing @glemarivero.
I also wrote a similar (but different) CoreML export script that generates a CoreML model that can be previewed in Xcode, can easily be used with Apple's Vision framework, and yields a `VNRecognizedObjectObservation` for each detected object.
I modified the code in the Detect module similar to what I discussed above (but there was still a missing step) so that it can be exported by the coremltools convert function.
It should work for all the differently sized variants of the Yolo v5 model.
To try it I recommend checking out the branch from my forked repo into a separate directory:
git clone -b better_coreml_export https://github.com/pocketpixels/yolov5.git yolov5_coreml_export
From within that directory use it with
python models/coreml_export.py --weights [model weights file]
Nice work @pocketpixels! Thanks for sharing 🙂
🐛 Bug
I am trying to export the default trained YOLOv5 Model as given here to CoreML but getting an error on both Colab as well as my laptop:
CoreML: export failure: 'torch._C.Node' object has no attribute 'ival'
To Reproduce (REQUIRED)
Follow the steps mentioned here.
Expected behavior
Export the CoreML model successfully.
Environment
Colab and Macbook Pro 13 inch 2019.