`plot_model` does not work for all models in `keras.applications`

Star9daisy commented 4 months ago

Hi developers, I followed the transfer learning tutorial here. Afterwards, I used keras.utils.plot_model to show the model's structure, but I found that it does not work for all of the provided models.

Here's a Colab notebook I created to show each case and its error message: https://colab.research.google.com/drive/1nTYEHGSXWixq6Bx_ubtgP12NOdXdVWzN?usp=sharing

sachinprasadhs commented 4 months ago

I'm able to plot the model. You can set the dpi argument in plot_model to a smaller number so that the plot fits into the frame.

Attached Gist here for reference https://colab.sandbox.google.com/gist/sachinprasadhs/79afe3e25aec48fa350700955ead74f0/transfer_learning.ipynb

Star9daisy commented 4 months ago

Here's another related issue: https://github.com/tensorflow/tensorflow/issues/65331#issuecomment-2055244725

grasskin commented 3 months ago

Hi @Star9daisy, can you confirm we can still reproduce the error messages after setting dpi? It sounds like the graphs do not fit and are overflowing.

Star9daisy commented 3 months ago

Hi @grasskin, thanks for your attention. I have tried setting dpi to 100 and to 50 (it is 200 by default), but the errors are the same. So I checked the error messages and found that they report a syntax error, for example:

Program terminated with status: 1. stderr follows: Error: /tmp/tmpm7fszkqu: syntax error in line 8 near '-'
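
The dpi check described above can be repeated with a small loop. This is only a sketch: `plot_at_dpis` and its `plot_fn` parameter are hypothetical names introduced here for illustration, and it assumes that a rendering failure surfaces as a Python exception, as the error output above suggests. By default it calls `keras.utils.plot_model` with its `to_file` and `dpi` arguments.

```python
def plot_at_dpis(model, dpis=(200, 100, 50), plot_fn=None):
    """Try to render `model` at each dpi and record any error message.

    `plot_fn` is a hypothetical hook so the loop can be exercised without
    Keras or Graphviz; by default it is keras.utils.plot_model.
    """
    if plot_fn is None:
        import keras  # imported lazily so the helper has no hard dependency
        plot_fn = keras.utils.plot_model

    results = {}
    for dpi in dpis:
        try:
            plot_fn(model, to_file=f"model_dpi{dpi}.png", dpi=dpi)
            results[dpi] = None  # rendered without raising
        except Exception as exc:  # assumption: failures are raised as exceptions
            results[dpi] = str(exc)
    return results
```

If every dpi maps to the same message, as reported here, the size of the image is unlikely to be the cause.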

After tracing back through the plot_model source, I found that this error is caused by the way pydot_ng, pydotplus and pydot handle the subgraph name:


# pydot
# https://github.com/pydot/pydot/blob/4c1d710bd99efa8ceeb21defc21a1781fb43912e/src/pydot/core.py#L1588
self.obj_dict["name"] = quote_if_necessary("cluster_" + graph_name)

# pydot_ng
# https://github.com/pydot/pydot-ng/blob/16f39800b6f5dc28d291a4d7763bbac04b9efe72/pydot_ng/__init__.py#L1632
self.obj_dict['name'] = 'cluster_' + graph_name

# pydotplus
# https://github.com/carlos-jenkins/pydotplus/blob/e06552ea3989bfb998ef6920a07d6c966ff9b01b/lib/pydotplus/graphviz.py#L1765
self.obj_dict['name'] = 'cluster_' + graph_name

The latter two packages do not quote the subgraph name, which causes the error. Considering that pydot_ng was archived on Nov 20, 2018 and the last commit to pydotplus was on Dec 9, 2014, I think it is better to use pydot as the default package for plotting the model.
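
To illustrate why quoting matters, here is a minimal sketch of the idea. It is not pydot's actual `quote_if_necessary`; `quote_dot_id` and `cluster_name` are hypothetical helpers written for this comment. It relies on the DOT rule that an unquoted ID must be a plain identifier (letters, digits and underscores, not starting with a digit) or a numeral, and it is deliberately simplified: it treats only ASCII letters as identifier characters and does not handle DOT keywords.

```python
import re

# Simplified DOT rules: an unquoted ID is a plain identifier or a numeral.
_BARE_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_NUMERAL = re.compile(r"-?(\.[0-9]+|[0-9]+(\.[0-9]*)?)")


def quote_dot_id(name):
    """Return `name` unchanged if DOT accepts it bare, else double-quote it."""
    if _BARE_ID.fullmatch(name) or _NUMERAL.fullmatch(name):
        return name
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name  # already quoted
    return '"' + name.replace('"', r'\"') + '"'


def cluster_name(graph_name, quote=True):
    """Build the subgraph name; quote=False mimics the unquoted variants."""
    raw = "cluster_" + graph_name
    return quote_dot_id(raw) if quote else raw


for model_name in ("vgg16", "efficientnetv2-b0", "mobilenet_1.00_224"):
    print(f"subgraph {cluster_name(model_name)} {{}}")
```

With this sketch, `vgg16` stays bare, while names containing `-` or `.` (such as `efficientnetv2-b0` and `mobilenet_1.00_224`) are wrapped in double quotes. Without quoting, the `-` ends the identifier early, which is consistent with the "syntax error ... near '-'" message above.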

I have also tested the following cases, because I strongly suspect that the remaining errors are due to a bug in graphviz:

Name	Colab: default	Colab: pydot	Colab: conda + pydot	Local: conda + pydot
xception	❌	❌	✅	✅
efficientnetb0	✅	✅	❌	✅
efficientnetb1	✅	✅	✅	✅
efficientnetb2	✅	✅	✅	✅
efficientnetb3	✅	✅	✅	✅
efficientnetb4	❌	❌	❌	❌
efficientnetb5	✅	✅	✅	✅
efficientnetb6	❌	❌	❌	❌
efficientnetb7	❌	❌	✅	✅
efficientnetv2-b0	❌	✅	✅	✅
efficientnetv2-b1	❌	✅	✅	✅
efficientnetv2-b2	❌	✅	✅	✅
efficientnetv2-b3	❌	❌	❌	❌
efficientnetv2-s	❌	❌	❌	❌
efficientnetv2-m	❌	❌	✅	❌
efficientnetv2-l	❌	❌	✅	✅
convnext_tiny	✅	✅	✅	✅
convnext_small	✅	✅	✅	✅
convnext_base	✅	✅	✅	✅
convnext_large	✅	✅	✅	✅
convnext_xlarge	✅	✅	✅	✅
vgg16	✅	✅	✅	✅
vgg19	✅	✅	✅	✅
resnet50	✅	✅	❌	❌
resnet101	✅	✅	✅	✅
resnet152	✅	✅	✅	✅
resnet50v2	✅	✅	✅	✅
resnet101v2	✅	✅	✅	✅
resnet152v2	✅	✅	✅	✅
mobilenet_1.00_224	❌	✅	✅	✅
mobilenetv2_1.00_224	❌	✅	✅	✅
MobilenetV3small	❌	❌	✅	✅
MobilenetV3large	✅	✅	✅	✅
densenet121	✅	✅	✅	✅
densenet169	✅	✅	❌	✅
densenet201	✅	✅	❌	✅
NASNet	✅	✅	✅	✅
NASNet	✅	✅	✅	✅
inception_v3	✅	✅	✅	✅
inception_resnet_v2	❌	❌	✅	✅

The colab lines for the first three columns are as follows:

You can see that using pydot as the default package for plotting the model solves part of the problem. The graphviz installed by conda (2.50.0 (0), versus 2.43.0 (0) by default) also solves some of the failures, but not all of them. I even got different results in my local test environment (Ubuntu 22.04 docker image).
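
Since the results depend on the graphviz version, it helps to record which `dot` binary a given environment actually uses. The following is a small sketch with hypothetical helper names (`parse_dot_version`, `installed_dot_version`); it assumes the banner printed by `dot -V` has the form `dot - graphviz version 2.43.0 (0)`, matching the version strings quoted above.

```python
import re
import shutil
import subprocess


def parse_dot_version(banner):
    """Extract (major, minor, patch) from the banner printed by `dot -V`."""
    match = re.search(r"graphviz version (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_dot_version():
    """Return the version of the `dot` found on PATH, or None if unavailable."""
    if shutil.which("dot") is None:
        return None
    proc = subprocess.run(["dot", "-V"], capture_output=True, text=True)
    # The banner is normally written to stderr; stdout is checked as a fallback.
    return parse_dot_version(proc.stderr or proc.stdout)


print(installed_dot_version())
```

Running this in each of the environments compared in the table makes it clear whether the system graphviz or the conda one is being picked up.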

The remaining errors are:

dot: maze.c:313: chkSgraph: Assertion `np->cells[1]' failed. (in Colab: pydot)
newtrap: Trapezoid-table overflow 8261 (in Colab: conda + pydot)

Here are the already-closed issues related to these errors:

[DOT] failed at node 0[1] dot: maze.c:315: chkSgraph: Assertion `np->cells[1]' failed.

The problem is with floating point comparison. -- from Costa Shulyupin

Generating a graph with ortho splines fails with Trapezoid-table overflow

The ortho library estimated the number of trapezoid structures it would need upfront based on the number of segments it was operating on. This estimation was wrong. Some inputs could exceed the estimation, at which point Graphviz would abort with an error message. -- from the commit 8f605841

However, I could still reproduce these errors in my local environment. So I think the errors shown in Colab when calling plot_model are due to graphviz bugs, the same ones described in the closed issues.

keras-team / keras

`plot_model` does not work for all models in `keras.applications` #19717