uncleguanghui / pyflink_learn

基于 PyFlink 的学习文档,通过一个个小实践,便于大家快速入手 PyFlink
269 stars 97 forks source link

pyflink 关于任务执行的问题 #22

Closed wuyazibest closed 2 years ago

wuyazibest commented 2 years ago

pyflink任务的py脚本启动执行方式很少有文档提及,自己摸索感觉到不太对,想请大佬们帮忙看看,在此谢过

环境: flink==1.15 python==3.7 apache-flink==1.15

问题1:使用flink run执行py脚本时无法指定python环境 官网执行py任务命令:./bin/flink run --python examples/python/table/word_count.py 但是python项目都是有自己的virtualenv,看官方文档没有相关指定python环境的参数,是pyflink不支持吗?

问题2:直接使用python来执行py脚本可以成功,并看到输出,但是这种方式可行吗? 直接使用虚拟环境的python来执行py脚本可以成功执行,但是在这个过程中没有任何地方去指定flink,在flink中也没有看到任务的信息,感觉是这个py任务脱离了flink

问题3:windows环境执行py脚本,在创建环境时就报错,但是没有看到相关讨论的文章 步骤:直接使用python来运行py脚本

Traceback (most recent call last):
  File "F:/PyDev/exercise/DB/pyflinker.py", line 14, in <module>
    stream_execution_environment=StreamExecutionEnvironment.get_execution_environment(),
  File "D:\.env\py37_data\lib\site-packages\pyflink\datastream\stream_execution_environment.py", line 802, in get_execution_environment
    gateway = get_gateway()
  File "D:\.env\py37_data\lib\site-packages\pyflink\java_gateway.py", line 62, in get_gateway
    _gateway = launch_gateway()
  File "D:\.env\py37_data\lib\site-packages\pyflink\java_gateway.py", line 106, in launch_gateway
    p = launch_gateway_server_process(env, args)
  File "D:\.env\py37_data\lib\site-packages\pyflink\pyflink_gateway_server.py", line 326, in launch_gateway_server_process
    stdin=PIPE, preexec_fn=preexec_fn, env=env)
  File "C:\Program Files\Python37\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Program Files\Python37\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] 系统找不到指定的文件。
wuyazibest commented 2 years ago

指定python环境的参数找到了 ( ̄▽ ̄*)

## 指定python环境
# ./bin/flink  run  --python examples/python/table/word_count.py  -pyexec ./venv/bin/python3.7

#  Python Table API  指定python环境
table_env.get_config().set_python_executable("./venv/bin/python3.7")

# #  Python DataStream API   指定python环境
# stream_execution_environment.set_python_executable("./venv/bin/python3.7")