kedro-org / kedro-viz

Visualise your Kedro data and machine-learning pipelines and track your experiments.
https://demo.kedro.org
Apache License 2.0

Kedro-viz displaying empty pipeline visualization #754

Closed scarvajalg closed 2 years ago

scarvajalg commented 2 years ago

Description

When I run kedro-viz from the command line, I get an error. I work with PySpark and Python.

Context

Can't visualize kedro pipelines and datasets graph

Steps to Reproduce

I tried two different Kedro-Viz versions, 4.0.0 and 4.3.1.

Expected Result

Visualize the pipelines graph in the browser

Actual Result

I get the following errors

importlib; see the module's documentation for alternative uses
  from imp import load_source
Traceback (most recent call last):
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3250, in _wrap_pool_connect
    return fn()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 310, in connect
    return _ConnectionFairy._checkout(self)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 868, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 476, in checkout
    rec = pool._do_get()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 256, in _do_get
    return self._create_connection()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 256, in _create_connection
    return _ConnectionRecord(self)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 371, in __init__
    self.__connect()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 666, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
    self.dbapi_connection = connection = pool._invoke_creator(self)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/create.py", line 590, in connect
    return dialect.connect(*cargs, **cparams)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 597, in connect
    return self.dbapi.connect(*cargs, **cparams)
sqlite3.OperationalError: unable to open database file
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/kedro_viz/launchers/cli.py", line 129, in viz
    run_server(**run_server_kwargs)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/kedro_viz/server.py", line 110, in run_server
    populate_data(data_access_manager, catalog, pipelines, session_store_location)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/kedro_viz/server.py", line 65, in populate_data
    Base.metadata.create_all(bind=database_engine)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/sql/schema.py", line 4785, in create_all
    bind._run_ddl_visitor(
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3116, in _run_ddl_visitor
    with self.begin() as conn:
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3032, in begin
    conn = self.connect(close_with_result=close_with_result)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3204, in connect
    return self._connection_cls(self, close_with_result=close_with_result)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 96, in __init__
    else engine.raw_connection()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3283, in raw_connection
    return self._wrap_pool_connect(self.pool.connect, _connection)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3253, in _wrap_pool_connect
    Connection._handle_dbapi_exception_noconnection(
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2100, in _handle_dbapi_exception_noconnection
    util.raise_(
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3250, in _wrap_pool_connect
    return fn()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 310, in connect
    return _ConnectionFairy._checkout(self)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 868, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 476, in checkout
    rec = pool._do_get()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 256, in _do_get
    return self._create_connection()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 256, in _create_connection
    return _ConnectionRecord(self)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 371, in __init__
    self.__connect()
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 666, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
    self.dbapi_connection = connection = pool._invoke_creator(self)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/create.py", line 590, in connect
    return dialect.connect(*cargs, **cparams)
  File "/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 597, in connect
    return self.dbapi.connect(*cargs, **cparams)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
(Background on this error at: https://sqlalche.me/e/14/e3q8)
kedro.framework.cli.utils.KedroCliError: (sqlite3.OperationalError) unable to open database file
(Background on this error at: https://sqlalche.me/e/14/e3q8)
Run with --verbose to see the full exception
Error: (sqlite3.OperationalError) unable to open database file
(Background on this error at: https://sqlalche.me/e/14/e3q8)

tynandebold commented 2 years ago

Hey @scarvajalg, thank you for reporting this.

Taking a quick look at the error messages, there seems to be something amiss with sqlite, and therefore it could be something to do with experiment tracking. Are you trying to enable that feature in Kedro-Viz?

If so, have you followed the usage instructions in the readme in order to enable it?
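The thread never states what caused the `unable to open database file` error. One common cause, offered here only as an assumption, is that the directory meant to hold the SQLite file does not exist: SQLite creates the database file but not its parent directories. A minimal Python sketch of that check, using a hypothetical `data/session_store.db` path:

```python
import sqlite3
import tempfile
from pathlib import Path


def can_open_sqlite(db_path: Path) -> bool:
    """Return True if SQLite can open (or create) the file at db_path."""
    try:
        conn = sqlite3.connect(str(db_path))
    except sqlite3.OperationalError:
        # Raised with "unable to open database file" when, for example,
        # the parent directory is missing or not writable.
        return False
    conn.close()
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical location; the real path depends on the project settings.
    db_path = Path(tmp) / "data" / "session_store.db"
    missing_dir_ok = can_open_sqlite(db_path)   # parent directory absent
    db_path.parent.mkdir(parents=True)
    existing_dir_ok = can_open_sqlite(db_path)  # parent directory present
```

If the first check fails and the second succeeds in a real project, creating the missing directory may be enough to get past the error.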

scarvajalg commented 2 years ago

Hi!

I fixed the SQLite error. Now kedro-viz runs, but I cannot visualize the pipelines: [screenshot: kedro-viz]

Kedro-viz is empty. Why can't kedro-viz recognize the datasets and pipelines?

Here is the command-line output:

2022-02-23 09:59:39,487 - numexpr.utils - INFO - NumExpr defaulting to 4 threads.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/02/23 09:59:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2022-02-23 10:00:05,973 - kedro_viz.integrations.pypi - INFO - Checking for update...
2022-02-23 10:00:06,580 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/kedro/framework/context/context.py:32: DeprecationWarning: Accessing package_name via the context will be deprecated in Kedro 0.18.0.
  warn(
/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/kedro/framework/context/context.py:488: UserWarning: Credentials not found in your Kedro project config.
No files found in ['/Users/scarvajalg/PycharmProjects/data_engineering/data-pipelines/data_modeling/conf/base', '/Users/scarvajalg/PycharmProjects/data_engineering/data-pipelines/data_modeling/conf/local'] matching the glob pattern(s): ['credentials*', 'credentials*/**', '**/credentials*']
  warn(f"Credentials not found in your Kedro project config.\n{str(exc)}")
/Users/scarvajalg/PycharmProjects/data_engineering/venv/lib/python3.8/site-packages/hdfs/config.py:15: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  from imp import load_source
/Users/scarvajalg/PycharmProjects/data_engineering/data-pipelines/data_modeling/src/data_modeling/pipelines/analytics/mod_prop_veh/premaster/nodes.py:229: DeprecationWarning: invalid escape sequence \]
  (F.split(F.split("rango_monto", "[<\]]")[1], ",")[0]).alias("lower_limit"),
/Users/scarvajalg/PycharmProjects/data_engineering/data-pipelines/data_modeling/src/data_modeling/pipelines/analytics/mod_prop_veh/premaster/nodes.py:230: DeprecationWarning: invalid escape sequence \]
  (F.split(F.split("rango_monto", "[<\]]")[1], ",")[1]).alias("high_limit"),
/Users/scarvajalg/PycharmProjects/data_engineering/data-pipelines/data_modeling/src/data_modeling/pipelines/analytics/mod_prop_veh/premaster/nodes.py:262: DeprecationWarning: invalid escape sequence \]
  (F.split(F.split("rango_monto", "[<\]]")[1], ",")[0]).alias("lower_limit"),
/Users/scarvajalg/PycharmProjects/data_engineering/data-pipelines/data_modeling/src/data_modeling/pipelines/analytics/mod_prop_veh/premaster/nodes.py:263: DeprecationWarning: invalid escape sequence \]
  (F.split(F.split("rango_monto", "[<\]]")[1], ",")[1]).alias("high_limit"),
INFO:     Started server process [4621]
2022-02-23 10:00:16,828 - uvicorn.error - INFO - Started server process [4621]
INFO:     Waiting for application startup.
2022-02-23 10:00:16,832 - uvicorn.error - INFO - Waiting for application startup.
INFO:     Application startup complete.
2022-02-23 10:00:16,836 - uvicorn.error - INFO - Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:4141 (Press CTRL+C to quit)
2022-02-23 10:00:16,843 - uvicorn.error - INFO - Uvicorn running on http://127.0.0.1:4141 (Press CTRL+C to quit)
INFO:     127.0.0.1:49466 - "GET / HTTP/1.1" 200 OK
INFO:     127.0.0.1:49466 - "GET /static/css/2.8509df91.chunk.css HTTP/1.1" 200 OK
INFO:     127.0.0.1:49468 - "GET /static/js/2.610eb610.chunk.js HTTP/1.1" 200 OK
INFO:     127.0.0.1:49467 - "GET /static/css/main.680a8f3c.chunk.css HTTP/1.1" 200 OK
INFO:     127.0.0.1:49469 - "GET /static/js/main.bcd1329b.chunk.js HTTP/1.1" 200 OK
INFO:     ('127.0.0.1', 49470) - "WebSocket /graphql" [accepted]
2022-02-23 10:00:19,083 - uvicorn.error - INFO - ('127.0.0.1', 49470) - "WebSocket /graphql" [accepted]
INFO:     127.0.0.1:49468 - "GET /api/main HTTP/1.1" 200 OK
INFO:     127.0.0.1:49471 - "GET /manifest.json HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:49468 - "GET /favicon.ico HTTP/1.1" 404 Not Found

rashidakanchwala commented 2 years ago

Hi @scarvajalg, what happens when you open http://localhost:4141/api/main? Do you see JSON with nodes and pipelines? Please share a screenshot.
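As an alternative to a screenshot, the same check can be scripted. This is a rough Python sketch, not part of Kedro-Viz: it counts the items in each list-valued top-level field of the JSON returned by the endpoint. The helper names and the stand-in payload are hypothetical, and the field names are assumed from the ones mentioned in this comment.

```python
import json
from typing import Any, Dict
from urllib.request import urlopen


def summarize_payload(payload: Dict[str, Any]) -> Dict[str, int]:
    """Count the items in every list-valued top-level field of the payload."""
    return {key: len(value) for key, value in payload.items() if isinstance(value, list)}


def fetch_summary(url: str = "http://localhost:4141/api/main") -> Dict[str, int]:
    """Download the JSON served by a running Kedro-Viz instance and summarize it."""
    with urlopen(url) as response:
        return summarize_payload(json.load(response))


# Stand-in payload for illustration only; a real response has more fields.
example = {"nodes": [], "pipelines": [{"id": "__default__"}]}
summary = summarize_payload(example)
```

With the server running, `fetch_summary()` would report a count of zero for `nodes` if the visualization has nothing to draw.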

scarvajalg commented 2 years ago

Hi! This is the result: [Screenshot 2022-02-28 at 9.08.20 a.m.]

tynandebold commented 2 years ago

Hi @scarvajalg, thank you for the screenshot!

I observe that you don't seem to have any nodes, edges, or layers in your screenshot. @AntonyMilneQB do you have any thoughts on why these arrays would be empty?

antonymilne commented 2 years ago

@tynandebold I wonder if this is the same issue we fixed in https://github.com/kedro-org/kedro-viz/pull/729 (but isn't released yet).

@scarvajalg what command exactly are you running to start kedro viz? If you're using the --pipeline flag, does it work without that, i.e. just a pure kedro viz?

scarvajalg commented 2 years ago

I ran both kedro viz and kedro viz --pipeline and got the same result: an empty visualization.

antonymilne commented 2 years ago

Hmm, very weird. What is the output of kedro registry list? And what does your pipeline_registry.py look like?

scarvajalg commented 2 years ago

This is the output of kedro registry list: [Screenshot 2022-03-09 at 9.30.08 a.m.]

pipeline_registry.py:

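from typing import Dict

from kedro.pipeline import Pipeline

# The imports of the project's own pipeline factories (premaster_pipeline, etc.)
# were not included in the original post.


def register_pipelines() -> Dict[str, Pipeline]: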
    """Register the project's pipelines.

    Returns:
        A mapping from a pipeline name to a ``Pipeline`` object.
    """
    ddv_master = premaster_pipeline() + sunarp_pipeline() + master_pipeline()
    udv_pacifico = (
        bienes_generales_pipeline() +
        persona_pipeline() +
        poliza_pipeline() +
        producto_pipeline() +
        referencia_pipeline() +
        siniestro_pipeline()
    )
    udv_bcp = data_bcp_pipeline()
    return {
        "__default__": Pipeline([]),
        "bienes_generales": bienes_generales_pipeline(),
        "great_expectations": great_expectations_pipeline(),
        "data_bcp": data_bcp_pipeline(),
        "ddv_master": ddv_master,
        "dom_referencia": dom_referencia_pipeline(),
        "master": master_pipeline(),
        "dom_persona": dom_persona_pipeline(),
        "dom_poliza": dom_poliza_pipeline(),
        "persona": persona_pipeline(),
        "poliza": poliza_pipeline(),
        "producto": producto_pipeline(),
        "premaster": premaster_pipeline(),
        "referencia": referencia_pipeline(),
        "siniestro": siniestro_pipeline(),
        "sunarp": sunarp_pipeline(),
        "udv_pacifico": udv_pacifico,
        "udv_bcp": udv_bcp,
        "um_score_buro": um_score_buro_pipeline(),
        "universal": udv_pacifico + udv_bcp,
    }

antonymilne commented 2 years ago

Ah ok, I think the problem is going to be that you have an empty __default__ pipeline. I suspect that this will be fixed when https://github.com/kedro-org/kedro-viz/pull/729 is released.

For the time being, the easiest fix is probably to populate __default__ with something, e.g. "__default__": bienes_generales_pipeline(). Even a completely fake pipeline like "__default__": Pipeline([node(lambda: None, None, "x")]) should probably fix it.
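A quick way to spot this situation is to list the registered pipelines that have no nodes. The Python sketch below is hypothetical and not part of Kedro; it only assumes that each registry value exposes a `nodes` collection, as Kedro's `Pipeline` does, and uses stand-in objects so it runs without Kedro installed.

```python
from types import SimpleNamespace
from typing import Any, List, Mapping


def empty_pipelines(registry: Mapping[str, Any]) -> List[str]:
    """Return the names of registered pipelines that contain no nodes."""
    return sorted(name for name, pipeline in registry.items() if not pipeline.nodes)


# Stand-in objects for illustration; real values would be Kedro Pipeline objects.
registry = {
    "__default__": SimpleNamespace(nodes=[]),
    "persona": SimpleNamespace(nodes=["some_node"]),
}
empty = empty_pipelines(registry)
```

In a real project, the dictionary returned by `register_pipelines()` could be passed in directly; an entry for `__default__` in the result would match the diagnosis above.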

scarvajalg commented 2 years ago

As you suggested, I populated __default__ with a fake pipeline. Now, when I run kedro viz, I can see just one pipeline (the persona pipeline), but I can't visualize the other pipelines, or all of them together. I tried kedro viz --pipeline to visualize another pipeline, and it's empty. It seems that kedro viz now only recognizes the persona pipeline.

What should I do to visualize all the pipelines?

antonymilne commented 2 years ago

Hmm, this is weird. Please could you try running the following to install a version of kedro-viz that includes the fix https://github.com/kedro-org/kedro-viz/pull/729.

pip uninstall kedro-viz
pip install https://github.com/kedro-org/kedro-viz/raw/test/main-package/package/dist/kedro_viz-4.3.1-py3-none-any.whl

scarvajalg commented 2 years ago

I tried the kedro viz version you sent, but the result is the same. I can only visualize the persona pipeline.

scarvajalg commented 2 years ago

I tried kedro viz --pipeline and I can visualize the pipeline I want. But I can't visualize all the pipelines of the project. My project is big. Is there a limit on the number of pipelines or items that kedro viz can render in the visualization? Maybe that's why I can't visualize all the pipelines.

tynandebold commented 2 years ago

If possible, could you please create a repo and share the link with us? That'll help us find a solution faster.

tynandebold commented 2 years ago

Hi @scarvajalg. I'm wondering if you're still having issues with this?

scarvajalg commented 2 years ago

Hi, I still have the problem, but it is not possible to share the project repo with the team. I can visualize each pipeline individually, so for now it works for the team I'm working with.

tynandebold commented 2 years ago

Got it, thank you for the update. Would it be possible for you to create a separate, simplified repository that demos the same problem you're facing?

jw-cpnet commented 2 years ago

I have got the same issue. All my pipelines are empty.

kedro viz --pipeline history
2022-03-22 10:49:34,616 - kedro_viz.integrations.pypi - INFO - Checking for update...
2022-03-22 10:49:35,309 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
2022-03-22 10:49:35,334 - kedro.config.config - INFO - Config from path `/home/jj/Git/etl-projects/conf/local` will override the following existing top-level config keys: machine_input
2022-03-22 10:49:36,191 - kedro.config.config - INFO - Config from path `/home/jj/Git/etl-projects/conf/local` will override the following existing top-level config keys: store_machine
INFO:     Started server process [114880]
2022-03-22 10:49:37,435 - uvicorn.error - INFO - Started server process [114880]
INFO:     Waiting for application startup.
2022-03-22 10:49:37,436 - uvicorn.error - INFO - Waiting for application startup.
INFO:     Application startup complete.
2022-03-22 10:49:37,437 - uvicorn.error - INFO - Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:4141 (Press CTRL+C to quit)
2022-03-22 10:49:37,437 - uvicorn.error - INFO - Uvicorn running on http://127.0.0.1:4141 (Press CTRL+C to quit)

(google-chrome:115008): Gtk-WARNING **: 10:49:37.672: Theme parsing error: gtk.css:5822:26: '-shadow' is not a valid color name

(google-chrome:115008): Gtk-WARNING **: 10:49:37.673: Theme parsing error: gtk.css:5825:14: not a number

(google-chrome:115008): Gtk-WARNING **: 10:49:37.673: Theme parsing error: gtk.css:5826:13: not a number

(google-chrome:115008): Gtk-WARNING **: 10:49:37.673: Theme parsing error: gtk.css:5827:11: Expected a length
Opening in existing browser session.
INFO:     127.0.0.1:33666 - "GET / HTTP/1.1" 200 OK
[115049:115049:0100/000000.879141:ERROR:sandbox_linux.cc(377)] InitializeSandbox() called with multiple threads in process gpu-process.
INFO:     ('127.0.0.1', 33670) - "WebSocket /graphql" [accepted]
2022-03-22 10:49:38,084 - uvicorn.error - INFO - ('127.0.0.1', 33670) - "WebSocket /graphql" [accepted]
INFO:     127.0.0.1:33666 - "GET /api/main HTTP/1.1" 200 OK
INFO:     127.0.0.1:33666 - "GET /favicon.ico HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:33672 - "GET /manifest.json HTTP/1.1" 404 Not Found
INFO:     ('127.0.0.1', 33674) - "WebSocket /graphql" [accepted]
2022-03-22 10:49:39,240 - uvicorn.error - INFO - ('127.0.0.1', 33674) - "WebSocket /graphql" [accepted]
INFO:     ('127.0.0.1', 33676) - "WebSocket /graphql" [accepted]
2022-03-22 10:49:39,511 - uvicorn.error - INFO - ('127.0.0.1', 33676) - "WebSocket /graphql" [accepted]
INFO:     ('127.0.0.1', 33678) - "WebSocket /graphql" [accepted]
2022-03-22 10:49:40,252 - uvicorn.error - INFO - ('127.0.0.1', 33678) - "WebSocket /graphql" [accepted]

pip list | grep kedro
kedro                0.17.7
kedro-viz            4.3.1

kedro registry list
- __default__
- api
- erp
- history
- machine
- report

antonymilne commented 2 years ago

Hi @jw-cpnet, please could you try running the following and see if that fixes it? Also, what happens if you run kedro viz with no pipeline argument?

pip uninstall kedro-viz
pip install https://github.com/kedro-org/kedro-viz/raw/test/main-package/package/dist/kedro_viz-4.3.1-py3-none-any.whl

jw-cpnet commented 2 years ago

Thank you! @AntonyMilneQB

The fix works perfectly! And yes, kedro viz (before fixing) without pipeline argument works too.

antonymilne commented 2 years ago

Excellent, thanks very much for letting us know. This fix will be included in the next release of kedro-viz, so when that is released you should just be able to pip install kedro-viz==4.3.2 and all will be good.
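To confirm that an environment has picked up the fixed release, the installed version can be compared against 4.3.2. The following is a small Python sketch under that assumption; the helper names are hypothetical and the comparison deliberately ignores pre-release suffixes.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric components, e.g. '4.3.2' -> (4, 3, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def includes_fix(installed: str, fixed_in: str = "4.3.2") -> bool:
    """Return True if the installed version is at least the fixed release."""
    return parse_version(installed) >= parse_version(fixed_in)
```

On Python 3.8 or later, the installed version string can be read with `importlib.metadata.version("kedro-viz")` and passed to `includes_fix`.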

yetudada commented 2 years ago

It looks like this issue is resolved! Thanks for raising it @jw-cpnet and have an awesome day! 🚀