kedro-org / kedro-viz

Visualise your Kedro data and machine-learning pipelines and track your experiments.
https://demo.kedro.org
Apache License 2.0
682 stars 113 forks source link

Replace `watchgod` library with `watchfiles` #2134

Closed jitu5 closed 3 weeks ago

jitu5 commented 1 month ago

Description

Resolves https://github.com/kedro-org/kedro-viz/issues/1589 and https://github.com/kedro-org/kedro-viz/issues/2135

This pull request includes

Replacing the watchgod library with watchfiles.

watchfiles library updates its API with main method run_process and some other changes as well:

Please refer migration guild for more details.

Custom File Filter:

Development notes

Library Replacement and File Watching Mechanism Updates:

Dependency Updates:

Checklist

ravi-kumar-pilla commented 1 month ago

Hi @jitu5 ,

This may not be related to watchgod -> watchfiles migration. But I am seeing more error logs when I edit files i.e.,

  1. Start in autoreload mode - kedro viz -a
  2. Start editing .py or any file that is being watched
  3. Save the changes and below error (similar errors) appear before rebuilding the app (this was raised earlier as well)
  4. Even without editing the file, if you just save the file, i.e., you are on .py file and do cmd+s -> raises error in the terminal

If possible, create a new ticket to have debounce for watching files and rebuilding the app or handle these errors someway for a cleaner watch experience. Since these errors have become frequent now, I would suggest we think of a solution before releasing this migration. Thank you

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/Ravi_Kumar_Pilla/opt/anaconda3/envs/kedro-viz-py39/lib/python3.9/multiprocessing/__init__.py", line 16, in <module>
    from . import context
  File "/Users/Ravi_Kumar_Pilla/opt/anaconda3/envs/kedro-viz-py39/lib/python3.9/multiprocessing/context.py", line 6, in <module>
    from . import reduction
  File "/Users/Ravi_Kumar_Pilla/opt/anaconda3/envs/kedro-viz-py39/lib/python3.9/multiprocessing/reduction.py", line 15, in <module>
    import pickle
  File "/Users/Ravi_Kumar_Pilla/opt/anaconda3/envs/kedro-viz-py39/lib/python3.9/pickle.py", line 43, in <module>
    from _pickle import PickleBuffer
KeyboardInterrupt
jitu5 commented 1 month ago

Hi @jitu5 ,

This may not be related to watchgod -> watchfiles migration. But I am seeing more error logs when I edit files i.e.,

  1. Start in autoreload mode - kedro viz -a
  2. Start editing .py or any file that is being watched
  3. Save the changes and below error (similar errors) appear before rebuilding the app (this was raised earlier as well)
  4. Even without editing the file, if you just save the file, i.e., you are on .py file and do cmd+s -> raises error in the terminal

If possible, create a new ticket to have debounce for watching files and rebuilding the app or handle these errors someway for a cleaner watch experience. Since these errors have become frequent now, I would suggest we think of a solution before releasing this migration. Thank you

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/Ravi_Kumar_Pilla/opt/anaconda3/envs/kedro-viz-py39/lib/python3.9/multiprocessing/__init__.py", line 16, in <module>
    from . import context
  File "/Users/Ravi_Kumar_Pilla/opt/anaconda3/envs/kedro-viz-py39/lib/python3.9/multiprocessing/context.py", line 6, in <module>
    from . import reduction
  File "/Users/Ravi_Kumar_Pilla/opt/anaconda3/envs/kedro-viz-py39/lib/python3.9/multiprocessing/reduction.py", line 15, in <module>
    import pickle
  File "/Users/Ravi_Kumar_Pilla/opt/anaconda3/envs/kedro-viz-py39/lib/python3.9/pickle.py", line 43, in <module>
    from _pickle import PickleBuffer
KeyboardInterrupt

@ravi-kumar-pilla I created a new virtual environment and followed the above steps to test but I didn't get any errors in logs. Could you try after crating new virtual environment. Also about Debouncing, run_process has debounce/step parameter, which we can use to introduce debounce for watcher.

Huongg commented 4 weeks ago

Hey @jitu5 after testing your PR, i got a similar error as @ravi-kumar-pilla has too

Screenshot 2024-10-28 at 10 21 37
jitu5 commented 3 weeks ago

Hey @jitu5 after testing your PR, i got a similar error as @ravi-kumar-pilla has too

Screenshot 2024-10-28 at 10 21 37

@Huongg The errors you shared are not the same as Ravi got, though they may be related to multiprocessing. Could you please share steps to reproduce the same error? Also have you created a new virtual environment to test ?

Huongg commented 3 weeks ago

hey @jitu5 I tried a different environment and the error is clear now, it could have the previous environment. But I know have a different error but i might not be related to your changes numpy.core.multiarray failed to import

rashidakanchwala commented 3 weeks ago

hey @jitu5 I tried a different environment and the error is clear now, it could have the previous environment. But I know have a different error but i might not be related to your changes numpy.core.multiarray failed to import

not sure maybe you do pip install numpy again

ravi-kumar-pilla commented 3 weeks ago

Hi @jitu5 , The filter looks good but I see there are many reload requests. I saw we had this even before but is there a way to debounce these calls in watchfiles or a better way to handle file watching ?

I tested adding a change_test.json file under data/ folder which is ignored by git. Based on what I understood, the filter class should ignore the file change, instead it reloads Viz. Could you please confirm the behavior. Thank you

jitu5 commented 3 weeks ago

Hi @jitu5 , The filter looks good but I see there are many reload requests. I saw we had this even before but is there a way to debounce these calls in watchfiles or a better way to handle file watching ?

I tested adding a change_test.json file under data/ folder which is ignored by git. Based on what I understood, the filter class should ignore the file change, instead it reloads Viz. Could you please confirm the behavior. Thank you

@ravi-kumar-pilla Yes change_test.json should be ignored. I tested the same scenario and it was reloading viz. But when I checked the .gitignore file of demo-project it has two line related to data folder.

# ignore everything in the following folders
data/**
# except their sub-folders
!data/**/

Because of the second line with except condition, pathspec is not interpreting .gitignore the same way as git does. But pathspec library has specialised class GitIgnoreSpec. Now I moved to GitIgnoreSpec and its ignoring change_test.json file under data/ folder correctly.