project-codeflare / codeflare

Simplifying the definition and execution, scaling and deployment of pipelines on the cloud.
https://codeflare.dev
Apache License 2.0

Error running notebook "RaySystemError: System error: buffer source array is read-only" #39

Closed KastanDay closed 2 years ago

KastanDay commented 2 years ago

Describe the bug

I'm trying to run the example notebooks (in codeflare/notebooks) and came across this error. The error persisted through attempts to restart my kernel, restart my entire machine, and re-clone the repo. Any help, or an explanation of the root cause, is much appreciated!

To Reproduce

Steps to reproduce the behavior:

  1. Go to notebooks/plot_nca_classification.ipynb
  2. Run the 2nd code block. It uses Ray and CodeFlare.
  3. This line produces the error: knn_pipeline = rt.select_pipeline(pipeline_fitted, pipeline_fitted.get_xyrefs(node_knn)[0])
  4. See error: RaySystemError: System error: buffer source array is read-only

Full stack trace:

RaySystemError: System error: buffer source array is read-only
traceback: Traceback (most recent call last):
  File "/home/kastan/.pyenv/versions/3.8.6/lib/python3.8/site-packages/ray/serialization.py", line 268, in deserialize_objects
    obj = self._deserialize_object(data, metadata, object_ref)
  File "/home/kastan/.pyenv/versions/3.8.6/lib/python3.8/site-packages/ray/serialization.py", line 191, in _deserialize_object
    return self._deserialize_msgpack_data(data, metadata_fields)
  File "/home/kastan/.pyenv/versions/3.8.6/lib/python3.8/site-packages/ray/serialization.py", line 169, in _deserialize_msgpack_data
    python_objects = self._deserialize_pickle5_data(pickle5_data)
  File "/home/kastan/.pyenv/versions/3.8.6/lib/python3.8/site-packages/ray/serialization.py", line 157, in _deserialize_pickle5_data
    obj = pickle.loads(in_band, buffers=buffers)
  File "sklearn/neighbors/_dist_metrics.pyx", line 223, in sklearn.neighbors._dist_metrics.DistanceMetric.__setstate__
  File "stringsource", line 658, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 349, in View.MemoryView.memoryview.__cinit__
ValueError: buffer source array is read-only

---------------------------------------------------------------------------
RaySystemError                            Traceback (most recent call last)
/tmp/ipykernel_1251/3313313255.py in <module>
      9 test_input.add_xy_arg(node_scalar, dm.Xy(X_test, y_test))
     10 
---> 11 knn_pipeline = rt.select_pipeline(pipeline_fitted, pipeline_fitted.get_xyrefs(node_knn)[0])
     12 knn_score = ray.get(rt.execute_pipeline(knn_pipeline, ExecutionType.SCORE, test_input)
     13                     .get_xyrefs(node_knn)[0].get_yref())

~/.pyenv/versions/3.8.6/lib/python3.8/site-packages/codeflare/pipelines/Runtime.py in select_pipeline(pipeline_output, chosen_xyref)
    381         curr_xyref = xyref_queue.get()
    382         curr_node_state_ptr = curr_xyref.get_curr_node_state_ref()
--> 383         curr_node = ray.get(curr_node_state_ptr)
    384         prev_xyrefs = curr_xyref.get_prev_xyrefs()
    385 

~/.pyenv/versions/3.8.6/lib/python3.8/site-packages/ray/_private/client_mode_hook.py in wrapper(*args, **kwargs)
     87             if func.__name__ != "init" or is_client_mode_enabled_by_default:
     88                 return getattr(ray, func.__name__)(*args, **kwargs)
---> 89         return func(*args, **kwargs)
     90 
     91     return wrapper

~/.pyenv/versions/3.8.6/lib/python3.8/site-packages/ray/worker.py in get(object_refs, timeout)
   1621                     raise value.as_instanceof_cause()
   1622                 else:
-> 1623                     raise value
   1624 
   1625         if is_individual_id:

Expected behavior

The pipeline should be selected and its score evaluated via a 'SCORE' pipeline.

Desktop

Thank you for any help! I am a University of Illinois at Urbana-Champaign grad student trying to make the most of your work!

raghukiran1224 commented 2 years ago

@yuanchi2807 @klwuibm - any ideas? Have you seen this before?

yuanchi2807 commented 2 years ago

Hi @KastanDay, thanks for checking out CodeFlare. The error appears to originate from sklearn and a bug in Cython. What version of sklearn are you using? Can you try again with 0.24.+? Thanks.
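For readers following along, here is a minimal sketch of how an array can come back read-only from the `pickle.loads(in_band, buffers=buffers)` call seen in the trace. It uses only `pickle` and NumPy, not Ray or sklearn. The helper name `roundtrip_out_of_band` is hypothetical, and handing back immutable `bytes` buffers is an assumption standing in for whatever read-only memory the deserializer actually supplies. A Cython typed memoryview that is not declared `const` refuses such a buffer, which matches the `ValueError` at the bottom of the trace.

```python
import pickle

import numpy as np


def roundtrip_out_of_band(arr):
    """Pickle `arr` with protocol 5 and restore it from read-only buffers.

    Hypothetical helper for illustration only: the immutable `bytes`
    objects stand in for read-only memory handed to `pickle.loads`.
    """
    buffers = []
    # buffer_callback returns None, so array data is sent out-of-band.
    in_band = pickle.dumps(arr, protocol=5, buffer_callback=buffers.append)
    # bytes objects are immutable, so the restored array cannot be written.
    readonly_buffers = [bytes(buf.raw()) for buf in buffers]
    return pickle.loads(in_band, buffers=readonly_buffers)


original = np.arange(6, dtype=np.float64)
restored = roundtrip_out_of_band(original)

print(restored.flags.writeable)         # the restored array is read-only
print(restored.copy().flags.writeable)  # an explicit copy is writable again
```

Writing to `restored` raises a `ValueError`, while `restored.copy()` gives an independent writable array. Whether copying is a practical workaround here is not established in this thread, because the failing array is created inside sklearn's own `__setstate__`.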

KastanDay commented 2 years ago

I'm using the latest stable version of sklearn: sklearn.__version__ is 1.0. The CodeFlare version is codeflare-0.1.2.dev0, installed with pip3 install .

Still unsuccessful after a few more attempts:

Same error :/

I wouldn't mind except I get the same error when trying to build my own more sophisticated pipeline as part of my research!

Edit: Of course, the docker container works no problem. Yet, I'd ideally like to contribute to your project, so I need it running locally :)

Thanks, would be happy to work with you to find a solution. Best, Kastan

yuanchi2807 commented 2 years ago

Hi @KastanDay, thank you for confirming that your sklearn version is 1.0. I also get "ValueError: buffer source array is read-only" in my environment after upgrading to 1.0. However, after downgrading scikit-learn to 0.24.1, the notebook ran as expected with no exception thrown. It appears to me that sklearn 1.0 broke Ray. I am searching the Ray forum to see whether others have reported this issue. For reference, I am running Python 3.8.8 and ray 2.0.0.dev0.

sklearn 1.0 => read-only exception
sklearn 0.24.1 => pass

CC: @raghukiran1224 @klwuibm
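A small sketch for checking a local environment against the two data points above. The function names are hypothetical, and the `>= 1.0` threshold is an assumption drawn only from this thread (1.0 fails, 0.24.1 passes); other versions were not confirmed either way.

```python
from importlib import metadata


def parse_release(version):
    """Return the leading numeric components, e.g. '0.24.1' -> (0, 24, 1)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric segment such as 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def is_affected(sklearn_version):
    """Assumed rule from this thread: scikit-learn 1.0 and later hit the error."""
    return parse_release(sklearn_version) >= (1, 0)


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(names=("scikit-learn", "ray", "pandas")):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in names}
```

Calling `environment_report()` in the notebook kernel shows which versions are actually loaded there, and `is_affected(...)` on the scikit-learn entry indicates whether the downgrade to 0.24.1 described above is worth trying.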

yuanchi2807 commented 2 years ago

Hi @KastanDay, it turns out this is a recurring bug in pandas. It was reported and investigated by the Ray developers; see this comment. There are still multiple bug reports about its regression in later pandas releases.

I have tried multiple combinations of pandas and scikit-learn but the only version that works is sklearn 0.24.1. Do you have a strong dependency on sklearn 1.0?

KastanDay commented 2 years ago

Hi @yuanchi2807, thank you for the detailed investigation and the interesting results. I have no dependency on sklearn 1.0, so I will downgrade as you suggest.