Closed eash11 closed 4 months ago
You can't pass around DataFrames like this. Have a look at how to handle artifacts https://www.kubeflow.org/docs/components/pipelines/v2/data-types/artifacts/
/close
As @geier commented, this is not the correct way to pass pandas DataFrames in kfp.
@rimolive: Closing this issue.
Python 3.9.13
Steps to reproduce
I have created the following functions and components to run a simple pretrained model for text summarization. The input is a CSV file with rows of text, and I want to generate a summary for each row.
Step 1 :
Step 2 :
I still have the following steps to create the functions for
I get the following error when running the code for the preprocess_text() function in Step 2.
Following is the error log:
TypeError                                 Traceback (most recent call last)
c:\Users\eashwar_n\kubeflow_experiment.ipynb Cell 3 line 1
      4 from pandas import DataFrame
      6 model = "meta-llama/Llama-2-7b-chat-hf"
      9 @kfp.dsl.component
---> 10 def preprocess_text(df: DataFrame):
     11     # Tokenize the text data using AutoTokenizer
     12     tokenizer = AutoTokenizer.from_pretrained(model)
     13     df['encoded_text'] = df['text'].apply(lambda text: tokenizer.encode(text, max_length=512, truncation=True))

File d:\myenv\pythonProject1\venv\lib\site-packages\kfp\dsl\component_decorator.py:119, in component(func, base_image, target_image, packages_to_install, pip_index_urls, output_component_file, install_kfp_package, kfp_package_path)
    108 if func is None:
    109     return functools.partial(
    110         component,
    111         base_image=base_image,
   (...)
    116         install_kfp_package=install_kfp_package,
    117         kfp_package_path=kfp_package_path)
--> 119 return component_factory.create_component_from_func(
    120     func,
    121     base_image=base_image,
    122     target_image=target_image,
    123     packages_to_install=packages_to_install,
    124     pip_index_urls=pip_index_urls,
    125     output_component_file=output_component_file,
    126     install_kfp_package=install_kfp_package,
    127     kfp_package_path=kfp_package_path)

File d:\myenv\pythonProject1\venv\lib\site-packages\kfp\dsl\component_factory.py:556, in create_component_from_func(func, base_image, target_image, packages_to_install, pip_index_urls, output_component_file, install_kfp_package, kfp_package_path)
    552 else:
    553     command, args = _get_command_and_args_for_lightweight_component(
    554         func=func)
--> 556 component_spec = extract_component_interface(func)
    557 component_spec.implementation = structures.Implementation(
    558     container=structures.ContainerSpecImplementation(
    559         image=component_image,
    560         command=packages_to_install_command + command,
    561         args=args,
    562     ))
    564 module_path = pathlib.Path(inspect.getsourcefile(func))

File d:\myenv\pythonProject1\venv\lib\site-packages\kfp\dsl\component_factory.py:422, in extract_component_interface(func, containerized, description, name)
    419     return None
    421 signature = inspect.signature(func)
--> 422 name_to_input_spec, name_to_output_spec = get_name_to_specs(
    423     signature, containerized)
    424 original_docstring = inspect.getdoc(func)
    425 parsed_docstring = docstring_parser.parse(original_docstring)

File d:\myenv\pythonProject1\venv\lib\site-packages\kfp\dsl\component_factory.py:271, in get_name_to_specs(signature, containerized)
    265     name_to_output_specs[maybe_make_unique(
    266         name,
    267         list(name_to_output_specs))] = make_output_spec(annotation)
    269 # parameter type
    270 else:
--> 271     type_string = type_utils._annotation_to_type_struct(annotation)
    272     name_to_input_specs[maybe_make_unique(
    273         name, list(name_to_input_specs))] = make_input_spec(
    274             type_string, func_param)
    276 ### handle return annotations ###

File d:\myenv\pythonProject1\venv\lib\site-packages\kfp\dsl\types\type_utils.py:556, in _annotation_to_type_struct(annotation)
    554     return None
    555 if hasattr(annotation, 'to_dict'):
--> 556     annotation = annotation.to_dict()
    557 if isinstance(annotation, dict):
    558     return annotation

TypeError: to_dict() missing 1 required positional argument: 'self'
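The last frame of the traceback suggests why the decorator fails before the component ever runs: the annotation is the `DataFrame` class itself, the `hasattr(annotation, 'to_dict')` check succeeds because the class defines that method, and calling it on the class rather than on an instance leaves `self` unbound. The following stdlib-only sketch reproduces that mechanism. `FakeFrame`, `annotation_to_type_struct`, and `try_annotation` are hypothetical stand-ins written for illustration; the middle function only paraphrases the three lines shown at `type_utils.py:555-558` and is not the real KFP implementation.

```python
class FakeFrame:
    """Hypothetical stand-in for pandas.DataFrame: defines an instance method to_dict()."""

    def __init__(self, data):
        self._data = data

    def to_dict(self):
        return dict(self._data)


def annotation_to_type_struct(annotation):
    # Paraphrase of the check visible in the traceback (type_utils.py:555-558).
    if hasattr(annotation, "to_dict"):
        # When `annotation` is a class, this calls an unbound method without `self`.
        annotation = annotation.to_dict()
    if isinstance(annotation, dict):
        return annotation
    return None


def try_annotation(annotation):
    """Return the converted annotation, or the TypeError message if conversion fails."""
    try:
        return annotation_to_type_struct(annotation)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Passing the class (as a type annotation would) triggers the same kind of error.
result = try_annotation(FakeFrame)
print(result)

# Passing an instance works, which is why the hasattr() check alone is misleading here.
print(try_annotation(FakeFrame({"text": "hello"})))
```

The exact wording of the message varies slightly between Python versions, but it ends with the same "missing 1 required positional argument: 'self'" seen in the log above.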
Expected result
The final CSV should contain a new column holding the summarized text for each text record in the input CSV.