Open yuanchi2807 opened 9 months ago
Cross posting from https://github.com/opendatahub-io/data-science-pipelines/issues/179
A prototype following the above solution design can be found at this link.
https://github.com/yuanchi2807/dsp_codeflare_int_testing
Ray application image can be pulled from quay.io/yuanchichang_ibm/integration_testing/dsp_codeflare_int_testing:0.1
The pipeline definition yet_another_ray_integration_test.py is adapted from https://github.com/diegolovison/ods-ci/blob/ray_integration/ods_ci/tests/Resources/Files/pipeline-samples/ray_integration.py. It points to the custom image and invokes docker_clustering_driver.py through the Ray Jobs API.
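For readers unfamiliar with the Ray Jobs API, here is a minimal sketch of what such a submission might look like using Ray's `JobSubmissionClient`. It is not taken from the prototype: the helper names `build_entrypoint` and `submit_driver`, the `python <script>` entrypoint, and the dashboard address are assumptions made for illustration.

```python
def build_entrypoint(script: str = "docker_clustering_driver.py") -> str:
    """Return the shell command Ray should run on the cluster.

    Assumption: the driver script sits in the working directory of the
    Ray application image and is launched with plain `python`.
    """
    return f"python {script}"


def submit_driver(dashboard_url: str,
                  script: str = "docker_clustering_driver.py") -> str:
    """Submit the driver script as a Ray job and return the job ID."""
    # Imported lazily so this module can be loaded without Ray installed.
    from ray.job_submission import JobSubmissionClient

    # dashboard_url is the Ray dashboard address of the target cluster,
    # for example "http://127.0.0.1:8265" (hypothetical value).
    client = JobSubmissionClient(dashboard_url)
    return client.submit_job(entrypoint=build_entrypoint(script))
```

Calling `submit_driver("http://127.0.0.1:8265")` against a reachable Ray cluster would return a job ID that can then be polled with `JobSubmissionClient.get_job_status`.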
Please feel free to comment.
fyi @sutaakar
At first glance it looks fine to me. I will try to run it this week. I am waiting for feedback from Diego, as he has more experience with Pipelines.
My prototype is meant to test the waters; it can be extended with additional pipeline stages.
Name of Feature or Improvement
Create an integration test case to validate the DSP, CodeFlare, and KubeRay implementation.
Describe the Solution You Would Like to See
Test environment assumptions:
Proposed test case: the "Clustering text documents using k-means" example from the scikit-learn documentation.
https://scikit-learn.org/stable/auto_examples/text/plot_document_clustering.html
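To give a sense of the workload, here is a minimal sketch of TF-IDF plus k-means clustering with scikit-learn. It is a much smaller stand-in for the linked example, not a copy of it: the toy corpus, the helper name `cluster_documents`, and the parameter choices are assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_documents(docs, n_clusters, seed=0):
    """Vectorize documents with TF-IDF and cluster them with k-means."""
    # Drop common English stop words so topical terms dominate the vectors.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(docs)
    # Several initializations with a fixed seed keep the run repeatable.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(vectors).tolist()


# Toy corpus (hypothetical): three documents on spaceflight, three on baking.
DOCS = [
    "rocket launched into orbit around the moon",
    "astronauts aboard the rocket reached orbit",
    "the moon rocket returned from orbit",
    "bake the bread dough in a hot oven",
    "knead the bread dough before the oven heats",
    "fresh bread dough rises beside the oven",
]

labels = cluster_documents(DOCS, n_clusters=2)
print(labels)
```

The cluster numbering itself is arbitrary, so an integration test would more reasonably check that documents on the same topic share a label than compare against fixed label values.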
Data Science Pipeline stages:
Expected test assets: