Closed pascalwhoop closed 2 weeks ago
Thanks, please do share telemetry for it.
@ravi-kumar-pilla
@rashidakanchwala can you point me at a doc how to do that?
Hi @pascalwhoop, thanks for raising this. Could you please let us know which of these you are using: kedro viz, kedro viz run, or any other command? kedro viz -a? %run_viz in a Jupyter notebook? Thank you.
From @pascalwhoop's top comment, the command seems to be kedro viz --autoreload:
\-+= 71535 pascalwhoop /opt/homebrew/Cellar/python@3.11/3.11.9_1/Frameworks/Python.framework/Versions/3.11/Resources/Python.app/Contents/MacOS/Python /Users/pascalwhoop/Code/everycure/matrix/pipelines/matrix/.venv/bin/kedro viz --aut
@rashidakanchwala can you point me at a doc how to do that?
Thanks, I just realised telemetry doesn't share that information. For Kedro-Viz, it only reports how you interact with the UI. We have done some refactoring around autoreload recently. Once we release that, we might observe some changes here: https://github.com/kedro-org/kedro-viz/pull/2134
Is it normal that multiprocessing.spawn.spawn_main is called twice, @rashidakanchwala?
I re-set up my mailing notifications and will hopefully be more responsive for OSS contributions going forward. If not, ping me on slack :)
kedro viz --autoreload was indeed used to kick it off.
This is not kedro-viz code, but the watchgod library that we call when we do --autoreload.
@jitu5 and @ravi-kumar-pilla, can we test whether replacing watchgod helps with this?
Then, @pascalwhoop, any chance you can check whether #2134 fixes the issue?
For context on what we do for --autoreload: we use multiprocessing to run the uvicorn server in its own process (recommended). For --autoreload we use watchgod, which starts a subprocess for file watching. So in a way we have 2 processes running. We might need to change this approach.
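To make the file-watching half of that setup concrete, here is a minimal stdlib-only sketch of a polling watcher step. It is an illustration under assumptions, not watchgod's or Kedro-Viz's actual code: the names `snapshot` and `changed_paths` are hypothetical, and the suffix tuple simply mirrors the extensions mentioned in this thread. The point is that each pass has to visit every matching file, so the cost of one pass grows with the number of files under the watched folder.

```python
import os


def snapshot(root, suffixes=(".yml", ".yaml", ".py", ".json")):
    """Map every matching file under root to its modification time (ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                state[path] = os.stat(path).st_mtime_ns
    return state


def changed_paths(old, new):
    """Return paths added, removed, or modified between two snapshots."""
    added_or_removed = old.keys() ^ new.keys()
    modified = {p for p in old.keys() & new.keys() if old[p] != new[p]}
    return added_or_removed | modified
```

A watcher loop would call `snapshot` repeatedly and trigger a reload whenever `changed_paths` is non-empty.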
We do have a lot of files in the folder. Are you sensibly excluding files? E.g. only watching those that are managed by git (vs. all of .venv and data)?
Good question. You are right: we watch all files where file_path.endswith((".yml", ".yaml", ".py", ".json")). Maybe we need to add better logic for this.
Agree. We should also ignore hidden files and files not tracked by version control. @jitu5
Sure, we can add these changes in https://github.com/kedro-org/kedro-viz/pull/2134
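As a rough idea of what a stricter filter could look like, here is a sketch that keeps the existing suffix check but also skips hidden files and directories (which covers .venv and .git) and a small set of excluded directory names. Everything here is hypothetical: `should_watch` and `EXCLUDED_DIRS` are illustrative names, not what the PR implements, and checking whether a file is actually tracked by git is left out.

```python
from pathlib import PurePath

WATCHED_SUFFIXES = (".yml", ".yaml", ".py", ".json")
# Hypothetical defaults for illustration only; a real project would
# probably want this to be configurable.
EXCLUDED_DIRS = frozenset({"data", "node_modules", "__pycache__"})


def should_watch(file_path):
    """Return True if a change to file_path should trigger a reload."""
    parts = PurePath(file_path).parts
    # Skip hidden files and anything inside a hidden directory (.venv, .git).
    if any(part.startswith(".") and part not in (".", "..") for part in parts):
        return False
    # Skip files that live under an explicitly excluded directory.
    if any(part in EXCLUDED_DIRS for part in parts[:-1]):
        return False
    return str(file_path).endswith(WATCHED_SUFFIXES)
```

With this predicate, a path such as data/cache/abc.json or .venv/lib/site.py is ignored, while src/pipeline/nodes.py is still watched.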
Right now we are using multiprocessing.Process, but the watchfiles documentation suggests using multiprocessing.get_context('spawn').Process to avoid forking and improve code reload/import. I will add this change as well in https://github.com/kedro-org/kedro-viz/pull/2134
Cool... we did that for the notebook launcher while resolving an issue: https://github.com/kedro-org/kedro-viz/blob/main/package/kedro_viz/launchers/jupyter.py#L149
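For readers following along, the switch discussed above can be sketched roughly as below. This is a minimal illustration, not the code in the PR: `run_server` and `make_server_process` are hypothetical names, and the host and port are example values only. The only real API relied on is `multiprocessing.get_context("spawn")`, whose `Process` class starts a fresh interpreter instead of forking the parent.

```python
import multiprocessing


def run_server(host, port):
    """Placeholder for the blocking call that would serve the app."""
    print(f"serving on {host}:{port}")


def make_server_process(target, **kwargs):
    """Create (but do not start) a daemon process using the 'spawn' method."""
    spawn_ctx = multiprocessing.get_context("spawn")
    return spawn_ctx.Process(target=target, kwargs=kwargs, daemon=True)


# Usage (inside an `if __name__ == "__main__":` guard, which spawn requires
# because the child process re-imports the main module):
#     process = make_server_process(run_server, host="127.0.0.1", port=4141)
#     process.start()
#     process.join()
```

Because a spawned child imports the code afresh, it does not inherit the parent's already-imported modules, which is the reload/import benefit mentioned above.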
Appreciate the active work, folks. That should do the trick.
$ find ./ | egrep "(yml|yaml|py|json)\$" | wc -l
119393
(Note: we use joblib heavily to cache API calls, which leads to such a large number of JSON files.)
The fix for this issue has been merged. As a follow-up, we’ll remind @pascalwhoop to check if it’s resolved in the next release of Kedro-Viz. Closing this for now.
Description
Regularly observing one Python process pinned at 100% CPU utilization. Turns out it's kedro viz.
Steps to Reproduce
It may be related to our Kedro pipeline, which is dynamic and has 180+ nodes. Happy to share some telemetry if you guide me on how. Unfortunately, I can't share the code (yet).