StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

Support Python UDF #46248

Open stdpain opened 6 months ago

stdpain commented 6 months ago

Feature request

Support Python UDFs in StarRocks.

Grammar: extend CREATE FUNCTION so that the function body can be defined inline.

CREATE FUNCTION echo(int)
RETURNS int
properties(
"symbol" = "echo",
"type" = "Python",
"file" = "inline",
"input" = "arrow"
)
AS
$$
def echo(a):
    return a
$$;
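
To illustrate how the "symbol" property relates to the inline body, here is a minimal Python sketch of how a worker process might load the source between the $$ delimiters and resolve the entry point. It is an assumption for illustration only, not the actual StarRocks loader; load_inline_udf is a hypothetical helper name.

```python
from typing import Callable


def load_inline_udf(source: str, symbol: str) -> Callable:
    """Compile inline UDF source and return the function named by `symbol`.

    Hypothetical helper: it mimics what a UDF worker could do with the
    text between the $$ delimiters. Not the StarRocks implementation.
    """
    namespace: dict = {}
    # Run the inline body in its own namespace so its definitions
    # do not leak into the caller's globals.
    exec(compile(source, "<inline-udf>", "exec"), namespace)
    func = namespace.get(symbol)
    if not callable(func):
        raise LookupError(f"symbol {symbol!r} is not a function in the inline body")
    return func


# Same body as the CREATE FUNCTION example in this issue.
INLINE_BODY = """
def echo(a):
    return a
"""

echo = load_inline_udf(INLINE_BODY, "echo")
print(echo(42))
```

In this sketch, a name that the body does not define raises LookupError, so the value of "symbol" has to match a def in the inline body.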

UDF call: the BE uses gRPC to call the Python UDF process.

Python environment requirements: Python 3.8+, Arrow 16.0.0+ (pyarrow), and gRPC (grpcio).

pip install pyarrow
pip install grpcio

Make sure all packages are installed in PYTHONHOME (configured in the BE).
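
The requirements above could be checked with a short script run by the interpreter under PYTHONHOME. This is a minimal sketch using only the standard library; the helper names are hypothetical, and it only reports installed versions rather than enforcing the Arrow 16.0.0+ minimum.

```python
import sys
from importlib import metadata
from typing import Dict, Optional, Sequence

MIN_PYTHON = (3, 8)  # minimum interpreter version listed in this issue
REQUIRED = ("pyarrow", "grpcio")  # pip distribution names listed in this issue


def python_ok(minimum: Sequence[int] = MIN_PYTHON) -> bool:
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= tuple(minimum)


def installed_versions(dists: Sequence[str] = REQUIRED) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in dists:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print("python >= 3.8:", python_ok())
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Running it with the interpreter that the BE is configured to use shows whether that environment, rather than the system Python, has the packages.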

FE config: enable_udf=true

BE config:

python_envs=/root/python38

Examples:

See test/sql/test_udf/test_python_udf

Additional context

PR Links

dirtysalt commented 6 months ago

linked to this issue

Python UDF support · Issue #45843 · StarRocks/starrocks https://github.com/StarRocks/starrocks/issues/45843

WencongLiu commented 2 months ago

@stdpain Hi, I have recently been researching how to implement Python UDAFs and UDTFs. Could we connect on WeChat? Also, do you have any design documentation for your Python UDF? My WeChat is Liuwenclever.😆

stdpain commented 2 months ago

We could open a public discussion in Slack: https://starrocks.slack.com/archives/C02FAD0JSSD

zhangm365 commented 1 month ago

@stdpain Hi, I have heard that Python UDFs will be supported in version 3.4. When is that release expected? Thanks a lot.