j3-signalroom / apache_flink-kickstarter

Examples of Apache Flink® applications showcasing the DataStream API, Table API in Java and Python, and Flink SQL, featuring AWS, GitHub, Terraform, Streamlit, and Apache Iceberg.
https://linkedin.com/in/jeffreyjonathanjennings
MIT License
1 stars 0 forks source link

`TypeError: cannot pickle '_thread.lock' object`. #181

Closed j3-signalroom closed 2 months ago

j3-signalroom commented 2 months ago
Traceback (most recent call last):
  File "/opt/flink/python_apps/kickstarter/data_generator_app.py", line 193, in <module>
    DataGeneratorApp.main(sys.argv)
  File "/opt/flink/python_apps/kickstarter/data_generator_app.py", line 42, in main
    .map(KafkaClientPropertiesLookup(False, common_functions.get_app_options(args)))
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/flink/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 312, in map
  File "/opt/flink/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 656, in process
  File "/opt/flink/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 2782, in _get_one_input_stream_operator
  File "/opt/flink/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 2899, in _create_j_data_stream_python_function_info
  File "/opt/flink/opt/python/cloudpickle-2.2.0-src.zip/cloudpickle/cloudpickle_fast.py", line 73, in dumps
  File "/opt/flink/opt/python/cloudpickle-2.2.0-src.zip/cloudpickle/cloudpickle_fast.py", line 632, in dump
TypeError: cannot pickle '_thread.lock' object
org.apache.flink.client.program.ProgramAbortException: java.lang.RuntimeException: Python process exits with code: 1
        at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:134)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:356)
        at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:223)
        at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:113)
        at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:1026)
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:247)
        at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1270)
        at org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$10(CliFrontend.java:1367)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
        at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1367)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1335)
Caused by: java.lang.RuntimeException: Python process exits with code: 1
        at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:124)
        ... 17 more
j3-signalroom commented 2 months ago

The error TypeError: cannot pickle '_thread.lock' object indicates that there is an attempt to serialize an object that contains a thread lock, which is not serializable by default. This is a common issue when using multiprocessing or threading in Python, especially in distributed computing frameworks like Apache Flink.

To resolve this issue, you need to ensure that the objects being passed to the map function do not contain thread locks or other non-serializable objects.