georgia-tech-db / evadb

Database system for AI-powered apps
https://evadb.ai/docs
Apache License 2.0

Improve UDF Experience: Clearer Error Messages, New Pandas Column Definitions, and Enforce Input Column Arguments #1426

Open Jjx003 opened 12 months ago

Jjx003 commented 12 months ago

NewPandasDataFrame & PandasColumn

I created a new PandasDataFrame class aimed at replacing the original one. The new class folds the separate column definitions (shape, type, name, etc.) into a single object per column instead of passing them as multiple arguments:

        output_signatures=[
            NewPandasDataFrame(
                columns=[
                    PandasColumn('class', type=NdArrayType.STR, shape=(None,), is_nullable=False),
                    PandasColumn('predicted', type=NdArrayType.STR, shape=(None,), is_nullable=False),
                ]
            )
        ]

See: https://github.com/georgia-tech-db/evadb/compare/staging...Jjx003:functions?expand=1#diff-f096e708655c43345f3edfb4fdea7cf27e27ab8f29fe3d09d8223537247f85a6R96
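To illustrate the idea of one object per column, here is a minimal sketch using plain dataclasses. It is not the code from the branch: `NdArrayTypeSketch`, `ColumnSketch`, and `DataFrameSketch` are hypothetical stand-ins, and the default values for `shape` and `is_nullable` are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Tuple


class NdArrayTypeSketch(Enum):
    """Stand-in for EvaDB's NdArrayType; only the member used here."""
    STR = auto()


@dataclass(frozen=True)
class ColumnSketch:
    """Hypothetical column definition: name, type, shape, and nullability together."""
    name: str
    type: NdArrayTypeSketch
    shape: Tuple[Optional[int], ...] = (None,)  # assumed default for the sketch
    is_nullable: bool = False                   # assumed default for the sketch


@dataclass
class DataFrameSketch:
    """Hypothetical signature object that holds a list of column objects."""
    columns: List[ColumnSketch]

    def column_names(self) -> List[str]:
        # Collect names in declaration order.
        return [col.name for col in self.columns]


signature = DataFrameSketch(
    columns=[
        ColumnSketch("class", type=NdArrayTypeSketch.STR),
        ColumnSketch("predicted", type=NdArrayTypeSketch.STR),
    ]
)
```

Keeping each column's attributes in one object means a name can never be paired with the wrong type or shape, which is easy to get wrong when they are passed as parallel lists.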

New Error Messages for improper decorators:

See: https://github.com/georgia-tech-db/evadb/compare/staging...Jjx003:functions?expand=1#diff-1e9fbe4cbda64173cb3dd304d324d51f0245dd5d38c4fade971cb7eff8d7d862R49

Enforce PandasColumn Input_Signature

I also made it so that, before evaluating a function, EVA first checks that every column declared in the input_signature is present in the input data. If any column is missing, an error is raised. See: https://github.com/georgia-tech-db/evadb/compare/staging...Jjx003:functions?expand=1#diff-72d3ebebc4aa466ca53857a7827fb01f786d3725161c7826ad4df100966e9c69R134
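The check described above could look roughly like the sketch below. This is an assumption about the shape of the logic, not the code in the linked diff: `check_input_columns` and `MissingInputColumnsError` are hypothetical names, and the function works on plain column-name iterables so it does not depend on EvaDB or pandas.

```python
from typing import Iterable, List


class MissingInputColumnsError(ValueError):
    """Hypothetical error type for this sketch."""


def check_input_columns(required: Iterable[str], available: Iterable[str]) -> None:
    """Raise if any required column name is absent from the available ones."""
    available_set = set(available)
    # Preserve the declared order so the message matches the signature.
    missing: List[str] = [name for name in required if name not in available_set]
    if missing:
        raise MissingInputColumnsError(
            f"Input is missing required column(s): {missing}; "
            f"available columns: {sorted(available_set)}"
        )
```

With a pandas frame, the second argument would simply be `df.columns`. Listing both the missing and the available columns in the message is a design choice for the sketch, intended to make typos in column names easy to spot.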