datajoint / datajoint-python

Relational data pipelines for the science lab
https://datajoint.com/docs
GNU Lesser General Public License v2.1

dj.Diagram raises ValueError if it was installed from Conda (but works if it was installed from PyPI) #1065

Open felixsmli opened 1 year ago

felixsmli commented 1 year ago

Bug Report

Description

When trying to plot the schema diagram with dj.Diagram(schema), it raises an exception:

ValueError: Node names and attributes should not contain ":" unless they are quoted with "". For example the string 'attribute:data1' should be written as '"attribute:data1"'. Please refer https://github.com/pydot/pydot/issues/258

Reproducibility

Include:

Use it to run a Jupyter Notebook and try to plot the diagram of a schema with dj.Diagram(schema): it raises the ValueError. I know the diagram function depends on Graphviz, which was installed automatically while installing DataJoint:

conda list | grep graphviz
graphviz                  6.0.1                h5abf519_0    conda-forge

Also I have tried:

  1. Installing Graphviz system-wide with APT (on Ubuntu 22.04 this gets graphviz/now 2.42.2-6 amd64) before running conda install. Conda still installs its own Graphviz.
  2. Installing Graphviz 2.42.3 with conda, then installing DataJoint.

The same issue persists.

File /opt/conda/lib/python3.10/site-packages/datajoint/diagram.py:440, in Diagram._repr_svg_(self)
    439 def _repr_svg_(self):
--> 440     return self.make_svg()._repr_svg_()

File /opt/conda/lib/python3.10/site-packages/datajoint/diagram.py:428, in Diagram.make_svg(self)
    425 def make_svg(self):
    426     from IPython.display import SVG
--> 428     return SVG(self.make_dot().create_svg())

File /opt/conda/lib/python3.10/site-packages/datajoint/diagram.py:373, in Diagram.make_dot(self)
    310 label_props = {  # http://matplotlib.org/examples/color/named_colors.html
    311     None: dict(
    312         shape="circle",
   (...)
    366     ),
    367 }
    368 node_props = {
    369     node: label_props[d["node_type"]]
    370     for node, d in dict(graph.nodes(data=True)).items()
    371 }
--> 373 dot = nx.drawing.nx_pydot.to_pydot(graph)
    374 for node in dot.get_nodes():
    375     node.set_shape("circle")

File /opt/conda/lib/python3.10/site-packages/networkx/drawing/nx_pydot.py:309, in to_pydot(N)
    298 raise_error = (
    299     _check_colon_quotes(u)
    300     or _check_colon_quotes(v)
   (...)
    306     )
    307 )
    308 if raise_error:
--> 309     raise ValueError(
    310         f'Node names and attributes should not contain ":" unless they are quoted with "".\
    311         For example the string \'attribute:data1\' should be written as \'"attribute:data1"\'.\
    312         Please refer https://github.com/pydot/pydot/issues/258'
    313     )
    314 edge = pydot.Edge(u, v, **str_edgedata)
    315 P.add_edge(edge)

ValueError: Node names and attributes should not contain ":" unless they are quoted with "". For example the string 'attribute:data1' should be written as '"attribute:data1"'. Please refer https://github.com/pydot/pydot/issues/258


### Expected Behavior

The schema diagram can be plotted correctly. 

### Additional Research and Context

If you install DataJoint with `pip install` and use the system Graphviz from APT, the diagram function works properly. To verify this, I created two Docker images: the one above installs DataJoint with `conda`, and the following one installs it with `pip`. The pip Dockerfile works as expected, but the Conda one does not.

```Dockerfile
FROM jupyter/scipy-notebook:ubuntu-22.04
USER root
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -yq \
    graphviz \
    && rm -rf /var/lib/apt/lists/*
USER jovyan
RUN python -m pip install --no-cache-dir \
    datajoint
```

Another difference I noticed is that conda gets DataJoint 0.13.7 while pip gets DataJoint 0.13.8. So I also tried installing DataJoint 0.13.7 with pip, and the diagram still works.
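When comparing a conda environment against a pip environment like this, it can help to print the versions that actually ended up installed. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_versions` is made up for illustration, and the package list is an assumption about which distributions are relevant here.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


# Assumed list of packages worth comparing between the two environments.
for name, version in installed_versions(["datajoint", "networkx", "pydot"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in both images and diffing the output would show whether the environments differ in more than the DataJoint version.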

JensBlack commented 1 year ago

I can confirm the error; however, it only appeared for me when I created a table with a dependency. Dropping the related table resolved the issue.

datajoint                 0.13.7             pyhd8ed1ab_0    conda-forge
graphviz                  2.50.0               hdb8b0d4_0

@schema
class Mouse(dj.Manual):
    definition = """
    # Experimental animals
    mouse_id             : int                          # Unique animal ID
    ---
    dob=null             : date                         # date of birth
    sex="unknown"        : enum('M','F','unknown')      # sex
    """

@schema
class Experimenter(dj.Manual):
    definition = """
    # Experimenter
    experimenter_id      : int                          # Unique experimenter ID
    ---
    name=null            : varchar(255)                 # name of experimenter
    sex="unknown"        : enum('M','F','unknown')      # sex
    """

@schema
class Model(dj.Manual):
    definition = """
    # Model info
    model_id      : int                        # Unique model ID
    ---
    name           : varchar(255)              # name of model
    type           : varchar(255)              # model architecture
    training_date  : date                      # date the model was trained
    description    : varchar(255)              # description of the model (optional)
    """

Works as intended.

But adding another table (with dependency) results in the same error.

@schema
class Session(dj.Manual):
    definition = """
    #Session
    -> Mouse
    session_id           : int                         # id of experiment
    ---
    session_time         : time                         # time of experiment #todo: change to datetime
    experimenter_id      : int              # id of experimenter, linking to experimenter table
    video_path           : varchar(255)                 # path to video file
    pose_path            : varchar(255)                 # path to pose file
    pose_origin          : varchar(255)              # origin of pose estimation (e.g. SLEAP)
    annotation_path      : varchar(255)                 # path to annotation file
    annotation_origin    : varchar(255)        # origin of annotation files (e.g. BORIS)
    """

ValueError                                Traceback (most recent call last)
File ~\anaconda3\envs\datajoint_test\lib\site-packages\IPython\core\formatters.py:344, in BaseFormatter.__call__(self, obj)
    342     method = get_real_method(obj, self.print_method)
    343     if method is not None:
--> 344         return method()
    345     return None
    346 else:

File ~\anaconda3\envs\datajoint_test\lib\site-packages\datajoint\diagram.py:440, in Diagram._repr_svg_(self)
    439 def _repr_svg_(self):
--> 440     return self.make_svg()._repr_svg_()

File ~\anaconda3\envs\datajoint_test\lib\site-packages\datajoint\diagram.py:428, in Diagram.make_svg(self)
    425 def make_svg(self):
    426     from IPython.display import SVG
--> 428     return SVG(self.make_dot().create_svg())

File ~\anaconda3\envs\datajoint_test\lib\site-packages\datajoint\diagram.py:373, in Diagram.make_dot(self)
    310 label_props = {  # http://matplotlib.org/examples/color/named_colors.html
    311     None: dict(
    312         shape="circle",
   (...)
    366     ),
    367 }
    368 node_props = {
    369     node: label_props[d["node_type"]]
    370     for node, d in dict(graph.nodes(data=True)).items()
    371 }
--> 373 dot = nx.drawing.nx_pydot.to_pydot(graph)
    374 for node in dot.get_nodes():
    375     node.set_shape("circle")

File ~\anaconda3\envs\datajoint_test\lib\site-packages\networkx\drawing\nx_pydot.py:309, in to_pydot(N)
    298 raise_error = (
    299     _check_colon_quotes(u)
    300     or _check_colon_quotes(v)
   (...)
    306     )
    307 )
    308 if raise_error:
--> 309     raise ValueError(
    310         f'Node names and attributes should not contain ":" unless they are quoted with "".\
    311         For example the string \'attribute:data1\' should be written as \'"attribute:data1"\'.\
    312         Please refer https://github.com/pydot/pydot/issues/258'
    313     )
    314 edge = pydot.Edge(u, v, **str_edgedata)
    315 P.add_edge(edge)

ValueError: Node names and attributes should not contain ":" unless they are quoted with "".                    For example the string 'attribute:data1' should be written as '"attribute:data1"'.                    Please refer https://github.com/pydot/pydot/issues/258

The tables work as intended as far as I was able to check.
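Since the error showed up only after a dependency was added, one way to narrow it down is to check which edge endpoints or edge attributes contain a colon. The sketch below assumes the rule is exactly what the error message states: a string containing ":" must be wrapped in double quotes. Everything in it is hypothetical and for illustration only: the helper names, the plain-tuple edge format, and the `mapping` attribute are not part of networkx, pydot, or DataJoint.

```python
def has_unquoted_colon(value):
    """True if str(value) contains ':' and is not wrapped in double quotes."""
    text = str(value)
    quoted = len(text) >= 2 and text[0] == '"' and text[-1] == '"'
    return ":" in text and not quoted


def find_colon_offenders(edges):
    """Collect (u, v, field, value) for every endpoint or attribute that breaks the rule."""
    offenders = []
    for u, v, attrs in edges:
        for field, value in (("source", u), ("target", v), *attrs.items()):
            if has_unquoted_colon(value):
                offenders.append((u, v, field, value))
    return offenders


# Made-up edges: the second one carries a dict attribute, whose string form has a colon.
edges = [
    ("mouse", "session", {"primary": True}),
    ("mouse", "session", {"mapping": {"mouse_id": "mouse_id"}}),
]
for offender in find_colon_offenders(edges):
    print(offender)  # ('mouse', 'session', 'mapping', {'mouse_id': 'mouse_id'})
```

With a real graph, the same loop could be fed from `graph.edges(data=True)`, which yields `(u, v, attrs)` triples in networkx.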


Installing datajoint again with pip install datajoint fixes the issue.

(datajoint_test) C:\Users\JSchw\PycharmProjects\Datajoint_test>pip install datajoint
Requirement already satisfied: datajoint in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (0.13.7)
Requirement already satisfied: tqdm in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (4.65.0)
Requirement already satisfied: matplotlib in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (3.4.3)
Requirement already satisfied: otumat in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (0.3.1)
Requirement already satisfied: cryptography in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (3.4.8)
Collecting networkx<=2.6.3
  Downloading networkx-2.6.3-py3-none-any.whl (1.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 6.8 MB/s eta 0:00:00
Requirement already satisfied: pyparsing in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (3.0.9)
Requirement already satisfied: pydot in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.4.2)
Requirement already satisfied: pandas in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.5.3)
Requirement already satisfied: numpy in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.22.3)
Requirement already satisfied: minio>=7.0.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (7.1.14)
Requirement already satisfied: ipython in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (8.12.0)
Requirement already satisfied: urllib3 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.26.15)
Requirement already satisfied: pymysql>=0.7.2 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.0.3)
Requirement already satisfied: certifi in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from minio>=7.0.0->datajoint) (2022.12.7)
Requirement already satisfied: cffi>=1.12 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from cryptography->datajoint) (1.15.1)
Requirement already satisfied: stack-data in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.6.2)
Requirement already satisfied: typing-extensions in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (4.5.0)
Requirement already satisfied: matplotlib-inline in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.1.6)
Requirement already satisfied: prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (3.0.38)
Requirement already satisfied: traitlets>=5 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (5.9.0)
Requirement already satisfied: pickleshare in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.7.5)
Requirement already satisfied: backcall in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.2.0)
Requirement already satisfied: decorator in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (5.1.1)
Requirement already satisfied: colorama in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.4.6)
Requirement already satisfied: pygments>=2.4.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (2.15.0)
Requirement already satisfied: jedi>=0.16 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.18.2)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from matplotlib->datajoint) (2.8.2)
Requirement already satisfied: cycler>=0.10 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from matplotlib->datajoint) (0.11.0)
Requirement already satisfied: pillow>=6.2.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from matplotlib->datajoint) (9.4.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from matplotlib->datajoint) (1.4.4)
Requirement already satisfied: flask in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from otumat->datajoint) (2.2.3)
Requirement already satisfied: watchdog in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from otumat->datajoint) (3.0.0)
Requirement already satisfied: appdirs in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from otumat->datajoint) (1.4.4)
Requirement already satisfied: pytz>=2020.1 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from pandas->datajoint) (2023.3)
Requirement already satisfied: pycparser in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from cffi>=1.12->cryptography->datajoint) (2.21)
Requirement already satisfied: parso<0.9.0,>=0.8.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from jedi>=0.16->ipython->datajoint) (0.8.3)
Requirement already satisfied: wcwidth in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython->datajoint) (0.2.6)
Requirement already satisfied: six>=1.5 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from python-dateutil>=2.7->matplotlib->datajoint) (1.16.0)   
Requirement already satisfied: itsdangerous>=2.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (2.1.2)
Requirement already satisfied: click>=8.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (8.1.3)
Requirement already satisfied: Werkzeug>=2.2.2 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (2.2.3)
Requirement already satisfied: Jinja2>=3.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (3.1.2)
Requirement already satisfied: importlib-metadata>=3.6.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (6.4.1)      
Requirement already satisfied: asttokens>=2.1.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from stack-data->ipython->datajoint) (2.2.1)
Requirement already satisfied: executing>=1.2.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from stack-data->ipython->datajoint) (1.2.0)
Requirement already satisfied: pure-eval in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from stack-data->ipython->datajoint) (0.2.2)
Requirement already satisfied: zipp>=0.5 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from importlib-metadata>=3.6.0->flask->otumat->datajoint) (3.15.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from Jinja2>=3.0->flask->otumat->datajoint) (2.1.1)
Installing collected packages: networkx
  Attempting uninstall: networkx
    Found existing installation: networkx 3.1
    Uninstalling networkx-3.1:
      Successfully uninstalled networkx-3.1
Successfully installed networkx-2.6.3

Minor Edit:

I did a clean install, this time with only pip install datajoint, and now I am running into issue #1033.

troselab-setup commented 1 year ago

Jens, I am running into the same issue after upgrading to 0.14.1 with pip3 install --upgrade datajoint.

dimitri-yatsenko commented 1 year ago

I ran into the same error. It's related to a breaking change in networkx. On it.

troselab-setup commented 1 year ago

Great! thx!

noahpettit commented 10 months ago

Hi, have there been any solutions for this issue? I just did a fresh install of datajoint on macOS 13.6, and I am trying out the shapes schema example and getting this same error. Note this is with a conda install, not pip. Should I reinstall with pip? Is there an alternative way of plotting the ERD?

dimitri-yatsenko commented 10 months ago

I will prioritize this to fix before our Harvard workshop. See you there. The current fix is to downgrade networkx.

noahpettit commented 10 months ago

Thanks Dimitri. I was able to fix it by upgrading datajoint to 0.14.1 with pip, instead of using the conda-forge version, which looks to be 0.13.7 (thanks to Tobias Rose for the tip). Tested with Python 3.9.18. I wasn't able to test on python>=3.10 because of a separate, unrelated issue. Looking forward to the workshop, see you there!

simon-ball commented 10 months ago

> I will prioritize this to fix before our Harvard workshop. See you there. The current fix is to downgrade networkx.

For reference, networkx is already a somewhat sticky transitive dependency due to incompatible version constraints with even fairly old versions of scikit-image. datajoint, as of 0.14.1, still requires networkx<=2.6.3, while scikit-image>=0.20 requires networkx>=2.8.
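To make the conflict concrete, here is a small sketch that tests a few networkx versions against both constraints. It uses the `networkx<=2.6.3` bound that appears in the pip output earlier in this thread and the `networkx>=2.8` bound stated above. The helper names and the candidate list are made up for illustration, and the parser handles only plain dotted integers, not full PEP 440 version strings.

```python
def parse_version(text):
    """Turn '2.6.3' into (2, 6, 3); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def allowed_by_datajoint(version):
    # Upper bound as shown in the pip log: networkx<=2.6.3
    return parse_version(version) <= (2, 6, 3)


def allowed_by_scikit_image(version):
    # Lower bound as stated in the comment above: networkx>=2.8
    return parse_version(version) >= (2, 8)


# Illustrative candidates, including the versions mentioned in this thread.
candidates = ["2.6.2", "2.6.3", "2.8", "3.1"]
compatible = [
    v for v in candidates
    if allowed_by_datajoint(v) and allowed_by_scikit_image(v)
]
print(compatible)  # []
```

No version can satisfy both bounds at once, which is why the two packages cannot share an environment until one of the constraints is relaxed.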

CBroz1 commented 5 months ago

To be explicit, downgrading networkx to 2.6.2 worked in my case.

dimitri-yatsenko commented 5 months ago

Yes, we understand this backward incompatibility in networkx. A fix is coming.