keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

`plot_model` does not work for all models in `keras.applications` #19717

Open Star9daisy opened 4 months ago

Star9daisy commented 4 months ago

Hi developers, I followed the transfer learning tutorial here. After that, I used keras.utils.plot_model to show the model structure. However, I found that it does not work for all of the provided models.

Here's a colab notebook I created to show each case and the error message: https://colab.research.google.com/drive/1nTYEHGSXWixq6Bx_ubtgP12NOdXdVWzN?usp=sharing

sachinprasadhs commented 4 months ago

I'm able to plot the model. You can set the dpi argument in plot_model to a smaller number so that the plot fits into the frame.

Attached Gist here for reference: https://colab.sandbox.google.com/gist/sachinprasadhs/79afe3e25aec48fa350700955ead74f0/transfer_learning.ipynb

Star9daisy commented 4 months ago

Here's another related issue: https://github.com/tensorflow/tensorflow/issues/65331#issuecomment-2055244725

grasskin commented 3 months ago

Hi @Star9daisy, can you confirm we can still reproduce the error messages after setting dpi? It sounds like the graphs do not fit and are overflowing.

Star9daisy commented 3 months ago

Hi @grasskin, thanks for your attention. I tried setting dpi to 100 and 50 (the default is 200), but the errors are the same. I then checked the error messages and found that some of them are syntax errors, for example:

Program terminated with status: 1. stderr follows: Error: /tmp/tmpm7fszkqu: syntax error in line 8 near '-'

After tracing back through the plot_model source, I found that this error is due to the way pydot_ng, pydotplus and pydot handle the subgraph name:


# pydot
# https://github.com/pydot/pydot/blob/4c1d710bd99efa8ceeb21defc21a1781fb43912e/src/pydot/core.py#L1588
self.obj_dict["name"] = quote_if_necessary("cluster_" + graph_name)

# pydot_ng
# https://github.com/pydot/pydot-ng/blob/16f39800b6f5dc28d291a4d7763bbac04b9efe72/pydot_ng/__init__.py#L1632
self.obj_dict['name'] = 'cluster_' + graph_name

# pydotplus
# https://github.com/carlos-jenkins/pydotplus/blob/e06552ea3989bfb998ef6920a07d6c966ff9b01b/lib/pydotplus/graphviz.py#L1765
self.obj_dict['name'] = 'cluster_' + graph_name

The latter two packages do not quote the subgraph name, so a model name containing a character such as '-' ends up as an invalid DOT identifier, which causes the syntax error. Considering that pydot_ng was archived on Nov 20, 2018 and the last commit to pydotplus was on Dec 9, 2014, I think it is better to use pydot as the default package for plotting the model.

I have also tested the following cases, because I strongly suspect that the remaining errors are due to a bug in graphviz:
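To make the quoting difference concrete, here is a minimal sketch of what happens to the subgraph name with and without quoting. The helpers `quote_dot_id` and `subgraph_header` are hypothetical names I made up for illustration; this is not pydot's actual `quote_if_necessary`, and the ID rule is simplified (it ignores numerals and DOT keywords).

```python
import re

# Simplified DOT rule: a bare ID is letters, digits and underscores,
# not starting with a digit. Numerals and keywords are ignored in this sketch.
_BARE_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_dot_id(name: str) -> str:
    """Hypothetical helper: double-quote `name` unless it is a bare DOT ID."""
    if _BARE_ID.fullmatch(name):
        return name
    return '"' + name.replace('"', '\\"') + '"'


def subgraph_header(graph_name: str, quote: bool) -> str:
    """Build the opening line of a cluster subgraph, with or without quoting."""
    cluster = "cluster_" + graph_name
    return "subgraph " + (quote_dot_id(cluster) if quote else cluster) + " {"


# Compare the unquoted form (pydot_ng / pydotplus style) with the quoted form.
for model_name in ("vgg16", "efficientnetv2-b0", "mobilenet_1.00_224"):
    print(subgraph_header(model_name, quote=False))
    print(subgraph_header(model_name, quote=True))
```

For a name like vgg16 both forms are identical, while names containing '-' or '.' only become valid identifiers once they are quoted.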

Name Colab: default Colab: pydot Colab: conda + pydot Local: conda + pydot
xception
efficientnetb0
efficientnetb1
efficientnetb2
efficientnetb3
efficientnetb4
efficientnetb5
efficientnetb6
efficientnetb7
efficientnetv2-b0
efficientnetv2-b1
efficientnetv2-b2
efficientnetv2-b3
efficientnetv2-s
efficientnetv2-m
efficientnetv2-l
convnext_tiny
convnext_small
convnext_base
convnext_large
convnext_xlarge
vgg16
vgg19
resnet50
resnet101
resnet152
resnet50v2
resnet101v2
resnet152v2
mobilenet_1.00_224
mobilenetv2_1.00_224
MobilenetV3small
MobilenetV3large
densenet121
densenet169
densenet201
NASNet
NASNet
inception_v3
inception_resnet_v2

The colab links for the first three columns are as follows:

  1. https://colab.research.google.com/drive/1nTYEHGSXWixq6Bx_ubtgP12NOdXdVWzN?usp=sharing
  2. https://colab.research.google.com/drive/1cQ3chsj4JDHVwo9Ts5UcRJP2puMFAGEK?usp=sharing
  3. https://colab.research.google.com/drive/1sKUWhkhNgrxuHKmKwZ98rtFZZPZweO__?usp=sharing

You can see that using pydot as the default package for plotting the model solves part of the problem, and the graphviz installed by conda (2.50.0 (0), versus 2.43.0 (0) by default) also solves some of the failures. But it does not always work. I even got different results in my local test environment (Ubuntu 22.04 docker image).

The remaining errors are:
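Since the results depend on which plotting package and which graphviz build are installed, a small check like the one below may help when comparing environments. This is only a sketch using the standard library; the function name `describe_plotting_env` is my own, and it assumes the graphviz executable is called `dot` and is on the PATH.

```python
import importlib.util
import shutil
import subprocess


def describe_plotting_env() -> dict:
    """Report which pydot-like packages are importable and which `dot` is on PATH."""
    backends = {
        name: importlib.util.find_spec(name) is not None
        for name in ("pydot", "pydot_ng", "pydotplus")
    }
    dot_path = shutil.which("dot")
    dot_version = None
    if dot_path is not None:
        # `dot -V` writes its version string to stderr.
        result = subprocess.run([dot_path, "-V"], capture_output=True, text=True)
        dot_version = (result.stderr or result.stdout).strip()
    return {"backends": backends, "dot_path": dot_path, "dot_version": dot_version}


print(describe_plotting_env())
```

Running this in each environment from the table should show whether a difference comes from the Python package or from the graphviz version.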

  1. dot: maze.c:313: chkSgraph: Assertion `np->cells[1]' failed. (in Colab: pydot)
  2. newtrap: Trapezoid-table overflow 8261 (in Colab: conda + pydot)

Here are the closed issues related to these errors:

  1. [DOT] failed at node 0[1] dot: maze.c:315: chkSgraph: Assertion `np->cells[1]' failed.

    The problem is with floating point comparison. -- from Costa Shulyupin

  2. Generating a graph with ortho splines fails with Trapezoid-table overflow

    The ortho library estimated the number of trapezoid structures it would need upfront based on the number of segments it was operating on. This estimation was wrong. Some inputs could exceed the estimation, at which point Graphviz would abort with an error message. -- from the commit 8f605841

However, I could still reproduce these errors in my local environment. So I think the errors shown in Colab when calling plot_model are due to graphviz bugs, the same ones described in the closed issues.