pentaho-labs / pentaho-cpython-plugin

This is a PDI plugin that allows execution of Python code.
Apache License 2.0

CPython step is just not working #26

Open manuhortet opened 5 years ago

manuhortet commented 5 years ago

I'm facing what seems to be a common problem. It has been asked about repeatedly on the Pentaho forums and Stack Overflow, and there doesn't seem to be any known solution.

I get an error like the following whenever a transformation contains a CPython Script Executor step:

2018/09/27 17:19:13 - csr_matrix.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Unexpected error
2018/09/27 17:19:13 - csr_matrix.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : java.lang.NullPointerException
2018/09/27 17:19:13 - csr_matrix.0 -    at org.pentaho.python.PythonSession.rowsToPythonDataFrame(PythonSession.java:389)
2018/09/27 17:19:13 - csr_matrix.0 -    at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.rowsToPyDataFrame(CPythonScriptExecutor.java:458)
2018/09/27 17:19:13 - csr_matrix.0 -    at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.processBatch(CPythonScriptExecutor.java:276)
2018/09/27 17:19:13 - csr_matrix.0 -    at org.pentaho.di.trans.steps.cpythonscriptexecutor.CPythonScriptExecutor.processRow(CPythonScriptExecutor.java:243)
2018/09/27 17:19:13 - csr_matrix.0 -    at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2018/09/27 17:19:13 - csr_matrix.0 -    at java.lang.Thread.run(Thread.java:748)

Some examples of other people having this very same problem:
https://stackoverflow.com/questions/38574457/pentaho-pdi-metadata-related-null-pointer-exception-in-scripting-task
https://forums.pentaho.com/archive/index.php/t-227211.html

I've tried it in different environments (Python 2.7, Python 3, Python 3.6; installing the dependencies manually, using Anaconda) and the error persists.

tombarti commented 5 years ago

I'm running into a similar error. It seems to happen when the CPython step does not receive any rows from the previous step.

You could avoid this by using the Detect empty stream step, as shown in the example on that link. However, this is quite inelegant and cumbersome.
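If the Detect empty stream workaround is used, the script inside the CPython step will presumably receive a single placeholder row whose fields are all null whenever the upstream step produced nothing, so the script has to recognize and skip that row. Below is a minimal sketch of such a guard. It assumes the placeholder row arrives with every field set to None, and it uses plain lists of dicts instead of the pandas DataFrame the step actually hands to the script, to stay dependency-free. The `amount` field and the `process` function are hypothetical names made up for illustration.

```python
def is_placeholder_row(row):
    """Return True when every field in the row is None (assumed placeholder shape)."""
    return all(value is None for value in row.values())


def drop_placeholder_rows(rows):
    """Filter out all-null rows, such as the one emitted for an empty stream."""
    return [row for row in rows if not is_placeholder_row(row)]


def process(rows):
    """Hypothetical processing: sum the 'amount' field, or return None if no real rows remain."""
    real_rows = drop_placeholder_rows(rows)
    if not real_rows:
        # Nothing but the placeholder (or nothing at all) arrived: skip the work.
        return None
    return sum(row["amount"] for row in real_rows)
```

With a pandas DataFrame, calling `dropna(how="all")` on the incoming frame and then checking whether it is empty should serve the same purpose. Note that this only handles the placeholder row on the Python side; it does not fix the NullPointerException itself, which is raised in the Java code before the script runs.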