mert-kurttutan / torchview

torchview: visualize pytorch models
https://torchview.dev
MIT License

Unexpected Error for YOLOv8m from Ultralytics library #63

Closed kadirnar closed 1 year ago

kadirnar commented 1 year ago

Hi @mert-kurttutan,

I want to visualize the Yolo models. The .yaml files are used for the architecture. Can we visualize using only the weight file (.pt)? Yolov5s: https://github.com/ultralytics/yolov5/blob/master/models/yolov5s.yaml Example: https://github.com/lutzroeder/netron

mert-kurttutan commented 1 year ago

Is it possible to load your model from its file and then visualize the loaded model? If the answer is yes, I don't think adding this feature is necessary.

If your model is a jit-traced model, that is a different question.

kadirnar commented 1 year ago

You can make visualizations by just uploading the model file to the netron web application. I want this support for Torchview library as well.

Yolov5s Netron App: image

mert-kurttutan commented 1 year ago

As I said, loading the model from a file would take just one extra line of code. Is there something that prevents you from loading it from the file?

mert-kurttutan commented 1 year ago

Or, if you can think of a case where you cannot load the model, that would be just as valid a reason.

kadirnar commented 1 year ago

I don't know how to do it. Can you show an example usage?

model_path = 'yolov6s.pt'
???

When I load the model with the code below, it gives the following error.

Code:

import torch
from torchview import draw_graph
from yolov6 import YOLOV6

model_path = 'yolov6s.pt'
model = YOLOV6(weights=model_path, device='cuda:0')  # load model
draw_graph(model, torch.randn(1, 3, 640, 640))

Output: AssertionError: Module Node DetectBackend has no outputs module,hence must not have node of Module Node type Conv2d

mert-kurttutan commented 1 year ago

I am not familiar with the YOLOV6 API. But in PyTorch there are two main ways to load a model: 1) torch.load, or 2)

model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))

depending on the way you saved the model. For more info, see here. Can you give a link to this model file so I can try it?
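The two loading routes mentioned above can be sketched as follows. This is a minimal illustration with a toy `nn.Sequential` model and a temporary directory, not the YOLO checkpoints discussed here. The file names and `make_model` are made up for the example, and `weights_only=False` is passed explicitly on the assumption that the installed PyTorch accepts that argument (newer releases load tensors only by default).

```python
import os
import tempfile

import torch
from torch import nn


def make_model() -> nn.Module:
    # Toy stand-in for "TheModelClass"; real YOLO classes come from their own packages.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


original = make_model()
tmp_dir = tempfile.mkdtemp()

# Way 1: pickle the whole module, then load it back with torch.load.
full_path = os.path.join(tmp_dir, "full_model.pt")  # hypothetical file name
torch.save(original, full_path)
loaded_full = torch.load(full_path, weights_only=False)

# Way 2: save only the weights, rebuild the architecture, then load the weights.
weights_path = os.path.join(tmp_dir, "weights.pt")  # hypothetical file name
torch.save(original.state_dict(), weights_path)
rebuilt = make_model()
rebuilt.load_state_dict(torch.load(weights_path))

# Both routes should reproduce the original model's outputs.
x = torch.randn(1, 4)
same_full = torch.allclose(original(x), loaded_full(x))
same_rebuilt = torch.allclose(original(x), rebuilt(x))
```

Either `loaded_full` or `rebuilt` is an ordinary `nn.Module` and can be passed to `draw_graph` in the same way as the models elsewhere in this thread.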

kadirnar commented 1 year ago

You can download and try it from here. https://github.com/kadirnar/torchyolo/releases/tag/v0.0.1

Yolov5-Pip:

from torchview import draw_graph
import yolov5

model = yolov5.load('yolov5s.pt')
model_graph = draw_graph(model, 
                            input_size=(1, 3, 352, 352), 
                            expand_nested=True, 
                            depth=3,
                        )
model_graph.visual_graph.view()

Output: https://drive.google.com/drive/folders/1xPlyE1d1-aVQaMuRpgqjRp3BVWBYR7R0?usp=share_link Example Architecture: https://github.com/iloveai8086/YOLOC#yolov5-1

The results aren't very good :/

mert-kurttutan commented 1 year ago

Seemed fine to me. Can you be more precise than 'not very good' :) ?

mert-kurttutan commented 1 year ago

If you want the graph to look like the ones in the link, those seem too custom. I cannot think of any algorithmic way to produce them that would not disturb other types of networks.

Also, some of them seem to ignore some actual pytorch operations (e.g. type casting or integer addition)

There is also an option to change direction of graph, i.e. vertical or horizontal, if this is useful.

kadirnar commented 1 year ago

I tried the graph_dir parameter. But it is not successful. I would say the default value is better. But it's a great library for understanding architecture. I would like to add the torchview algorithm to the torchyolo library if you can help support it for other yolo models.

mert-kurttutan commented 1 year ago

Yeah sure. Just mention me in an issue of torchyolo repo, I think we can start from there.

kadirnar commented 1 year ago

I solved the error for other yolo models. But I need help for yolov7 model. https://github.com/kadirnar/yolov7-pip/blob/main/yolov7/helpers.py#L23

Code:

from torchview import draw_graph
import yolov7

model = yolov7.load(model_path='yolov7.pt')
print(model)
model_graph = draw_graph(model, 
                            input_size=(1, 3, 352, 352), 
                            expand_nested=True, 
                            depth=3,
                        )
model_graph.visual_graph.view()

Output:

Fusing layers... 
 Convert model to Traced-model... 
 traced_script_module saved! 
 model is traced! 

autoShape(
  (model): TracedModel(
    (model): Model(
      original_name=Model
      (model): Sequential(
        original_name=Sequential
        (0): Conv(
          original_name=Conv
          (conv): Conv2d(original_name=Conv2d)
          (bn): BatchNorm2d(original_name=BatchNorm2d)
          (act): SiLU(original_name=SiLU)
        )
        (1): Conv(
          original_name=Conv
          (conv): Conv2d(original_name=Conv2d)
          (bn): BatchNorm2d(original_name=BatchNorm2d)
          (act): SiLU(original_name=SiLU)
        )
....
AttributeError: 'torch._C.ScriptMethod' object has no attribute '__name__'

mert-kurttutan commented 1 year ago

As of now, torchview does not work on jit-traced modules. I can add a small piece of code to avoid this error, but it would show the entire traced module as a single function, without its internal module structure.

There is, though, a way to make torchview work for traced modules and show their internal module structure. It would involve taking the graph structure of the traced module and inserting it into torchview's graph structure. You are welcome to do this; see the new issue #66.

At the moment, I don't want to pursue it myself, since torchdynamo seems to be the future of obtaining traced modules and is much more capable than torchscript (e.g. capturing data-dependent control flow, non-tensor inputs). I would rather wait and add support for torchdynamo models.

kadirnar commented 1 year ago

Code:

from torchview import draw_graph
import yolov7
model = yolov7.load(model_path='yolov7.pt', trace=False)
model_graph = draw_graph(model, 
                            input_size=(1, 3, 352, 352), 
                            expand_nested=True, 
                            depth=3,
                        )
model_graph.visual_graph.view()

Output:

assert id(r_in) == r_in.tensor_nodes[-2].tensor_id, (
IndexError: list index out of range

I set the trace parameter to false. What could be the cause of this error?

kadirnar commented 1 year ago

from ultralytics import YOLO
from torchview import draw_graph

model = YOLO('yolov8m.pt')
new_model = model.model
model_graph = draw_graph(new_model, 
                            input_size=(1, 3, 352, 352), 
                            expand_nested=True, 
                            depth=3,
                        )
model_graph.visual_graph.view()

I am getting the same error in Yolov8 model. We can add the torchview repo to the ultralytics repo.

mert-kurttutan commented 1 year ago

The problem is most probably solved. I tested it with YOLOV8m. The version with solution is in branch fix-empty-output. You can use this branch for the cases above.

Note: With those assertion statements and pop expression below, I was actually being overprotective. If you just delete them, it would also work fine but in a less controllable way.

I will add a test case to make sure this case is 100 % bug free and publish a new release.

kadirnar commented 1 year ago

I have tested. It works great. I think we should add torchview library to Yolov8 repo. What do you think about this subject?

mert-kurttutan commented 1 year ago

I agree. To be sure, which repo are you referring to on this feature?

kadirnar commented 1 year ago

The Yolov8 model has just been released. Developed by Yolov5's team of authors. It's very popular right now.

https://github.com/ultralytics/ultralytics

mert-kurttutan commented 1 year ago

I think we can open an issue/PR where we showcase the visualization feature and wait for feedback from their team.

If you agree, how do you think we should demonstrate the feature?

I would say a small Python script using torchview and ultralytics, without any integration, would be good enough. If they are convinced by this demo, then we can start on the actual branch for this feature.

mert-kurttutan commented 1 year ago

Just as a side note, I would also try to keep the torchview dependency optional if/when we integrate this feature.
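One common way to keep a dependency optional is to probe for it at call time and fail with a clear message only when the feature is actually used. The sketch below is an assumption about how such a guard could look; `is_installed`, `visualize_model`, and the error text are hypothetical and not part of ultralytics or torchyolo. The `draw_graph` arguments mirror the calls already shown in this thread.

```python
import importlib.util


def is_installed(package: str) -> bool:
    # find_spec returns None when a top-level package cannot be found.
    return importlib.util.find_spec(package) is not None


def visualize_model(model, input_size):
    # Hypothetical wrapper: torchview is imported only when visualization is requested.
    if not is_installed("torchview"):
        raise ImportError(
            "Visualization needs the optional 'torchview' package: pip install torchview"
        )
    from torchview import draw_graph

    return draw_graph(model, input_size=input_size, expand_nested=True, depth=3)
```

With this layout, importing the host library never fails because torchview is missing; only a call to the visualization helper does.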

kadirnar commented 1 year ago

I think you should write directly to the ultralytics team. You can have a look at its integration with other libraries.

https://github.com/ultralytics/ultralytics#integrations

mert-kurttutan commented 1 year ago

Just for documentation purposes: the main cause of this error is that the Detect submodule takes a list of torch.Tensor, which is mutable, and modifies it in place inside its forward method. This led to a mix-up when recording tensor information. It is now fixed, and a test case has been added for it.
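The kind of mix-up described above can be reproduced without PyTorch. The toy below is not torchview's code; `record_ids` and both `forward_*` functions are made up to show why a recorder that identifies values by `id()` loses track of its inputs when a module overwrites its input list in place.

```python
def record_ids(values):
    # Identify each element by object identity, as a tracer might do for tensors.
    return [id(v) for v in values]


def forward_inplace(xs):
    # Overwrites the caller's list element by element and returns the same list.
    for i in range(len(xs)):
        xs[i] = [v * 2 for v in xs[i]]
    return xs


def forward_pure(xs):
    # Leaves the input untouched and returns a fresh list.
    return [[v * 2 for v in x] for x in xs]


inputs = [[1.0, 2.0], [3.0, 4.0]]
ids_before = record_ids(inputs)
outputs = forward_inplace(inputs)

# The input container is now also the output container,
# and the elements recorded before the call are no longer in it.
container_reused = outputs is inputs
inputs_still_visible = record_ids(inputs) == ids_before

fresh_inputs = [[1.0, 2.0], [3.0, 4.0]]
fresh_ids = record_ids(fresh_inputs)
forward_pure(fresh_inputs)
pure_keeps_inputs = record_ids(fresh_inputs) == fresh_ids
```

After the in-place call, anything recorded about the inputs by identity no longer matches what the list holds, which is the situation a tracer has to handle explicitly.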

kadirnar commented 1 year ago

https://github.com/kadirnar/torchyolo/pull/34

Thank you. You are great. I want to create a readme.md file for Torchview. I need to prepare a beautiful image. Can you help with this? (You can close this issue.)

kadirnar commented 1 year ago

Doesn't work on large models. Can you test it? Large Model:

mert-kurttutan commented 1 year ago

Both 'yolov5x6' and 'yolov5l6' work fine for me. Are you sure you have the latest version? E.g. yolov5l6: image

kadirnar commented 1 year ago

yolov5 : 7.0.7 torchview : 0.2.4

from torchview import draw_graph
import yolov5

model_arch = yolov5.load("yolov5s6.pt")
model_graph = draw_graph(
    model_arch,
    input_size=(1, 3, 352, 352),
    expand_nested=True,
    depth=3,
)

model_graph.visual_graph.render(format='pdf')

Output:

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 12 but got size 11 for tensor number 1 in the list.

mert-kurttutan commented 1 year ago

Info message from ultralytics API:


Downloading: "https://github.com/ultralytics/yolov5/zipball/master" to /root/.cache/torch/hub/master.zip
YOLOv5 🚀 v7.0-69-g3b6e27a Python-3.8.16 torch-1.13.0+cu116 CUDA:0 (Tesla T4, 15110MiB)

Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s6.pt to yolov5s6.pt...
100%
24.8M/24.8M [00:01<00:00, 12.7MB/s]

Fusing layers... 
YOLOv5s6 summary: 280 layers, 12612508 parameters, 0 gradients
Adding AutoShape... 
image 1/1: 720x1280 2 persons, 1 tie
Speed: 450.5ms pre-process, 15.1ms inference, 1.5ms NMS per image at shape (1, 3, 384, 640)

So I think the input should have shape (1, 3, 384, 640); with that shape it works in my case.
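The size mismatch reported above is consistent with an input side that is not a multiple of the network's largest stride. The sketch below assumes a maximum stride of 64 for the P6 variants such as yolov5s6 and models each downsampling stage as a ceiling division by 2. Both are simplifying assumptions, not values read from the checkpoint, and the helper names are made up. Under them, 352 yields an 11-wide map that is later joined with a 12-wide upsampled map, while 384 and 640 divide cleanly.

```python
import math


def round_up_to_stride(size: int, stride: int = 64) -> int:
    # Smallest multiple of `stride` that is >= size.
    return math.ceil(size / stride) * stride


def feature_sizes(size: int, stages: int = 6) -> list:
    # Side length after each stride-2 stage (ceil division models k=3, p=1 convs).
    sizes = []
    for _ in range(stages):
        size = math.ceil(size / 2)
        sizes.append(size)
    return sizes


def upsample_matches(size: int) -> bool:
    # The deepest map is upsampled 2x and concatenated with the previous one.
    *_, prev, last = feature_sizes(size)
    return last * 2 == prev


bad = feature_sizes(352)             # [176, 88, 44, 22, 11, 6]
ok_352 = upsample_matches(352)       # False: 6 * 2 = 12 vs 11
ok_384 = upsample_matches(384)       # True
safe_size = round_up_to_stride(352)  # 384
```

Rounding the requested size up before calling `draw_graph`, for example `input_size=(1, 3, safe_size, safe_size)`, avoids the concatenation error under these assumptions.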