Unable to run examples

doraboughzela commented 3 years ago

Hello,

I am new to Spline. I was trying to run the example mentioned in the README file but I got this error:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.8:run (default) on project examples_2.11: An Ant BuildException has occured: The following error occurred while executing this line:
[ERROR] C:\Users\Dorra\Documents\spline-spark-agent\examples\run-example.xml:37: The following error occurred while executing this line:
[ERROR] C:\Users\Dorra\Documents\spline-spark-agent\examples\run-example.xml:98: Java returned: 1
[ERROR] around Ant part ...<ant antfile="C:\Users\Dorra\Documents\spline-spark-agent\examples/run-example.xml">... @ 5:87 in C:\Users\Dorra\Documents\spline-spark-agent\examples\target\antrun\build-main.xml

Any help please :) ! DB

wajda commented 3 years ago

There have to be more error messages in the logs. Can you please post them here?

doraboughzela commented 3 years ago

Here is my log file. It was too long for a comment 👍 log_spline.txt

wajda commented 3 years ago

Seems like it's a Hadoop misconfiguration issue. As you can see, there are a bunch of errors like this preceding the final error:

[java] Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 3, localhost, executor driver): java.io.IOException: (null) entry in command string: null chmod 0644 C:\Users\Dorra\Documents\spline-spark-agent\examples\data\output\batchWithDependencies\beerConsCtl\_temporary\0\_temporary\attempt_20210325140404_0003_m_000000_3\part-00000-b1e8c9db-3492-42ec-b773-f3e1b1c9d1cc-c000.snappy.parquet

A quick googling gave me these links that might be helpful:
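For context, the `(null) entry in command string: null chmod` failure on Windows is commonly associated with `HADOOP_HOME` not being set or `winutils.exe` missing from its `bin` folder. The following is a minimal pre-flight check under that assumption; the function name `hadoop_home_problems` is made up for illustration, and passing the check does not prove that Hadoop is configured correctly.

```python
import os
from pathlib import Path


def hadoop_home_problems(env):
    """Return a list of human-readable problems with a Windows Hadoop setup.

    `env` is any mapping of environment variables, e.g. os.environ.
    An empty list only means these basic checks passed.
    """
    problems = []
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        problems.append("HADOOP_HOME is not set")
        return problems

    # Assumption: winutils.exe is expected under %HADOOP_HOME%\bin.
    winutils = Path(hadoop_home) / "bin" / "winutils.exe"
    if not winutils.is_file():
        problems.append(f"winutils.exe not found at {winutils}")
    return problems


if __name__ == "__main__":
    for problem in hadoop_home_problems(os.environ):
        print("WARNING:", problem)
```

Running it before `mvn test -P examples` prints a warning for each check that fails and nothing otherwise.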

doraboughzela commented 3 years ago

Thank you for your reply. Indeed, it was a Hadoop misconfiguration issue.

Now it works fine; however, I am not able to capture the lineage. Nothing appears in the Spline UI.

Actually, I used the Docker deployment. Do I need to specify something when running this command?

mvn test -P examples -D exampleClass=za.co.absa.spline.example.batch.Example1Job

wajda commented 3 years ago

When there are any problems during lineage capture, they should all appear in the logs. Please examine them more closely. If you run the examples using mvn test -P examples without any additional parameters, then by default the Spline agent is set to BEST_EFFORT mode, which literally means "try to capture lineage if you can, otherwise print a warning and carry on". It also assumes that the Spline REST Gateway server is running on http://localhost:8080. If the connection isn't established then, again, there has to be a message. You can change the Spline mode or the Spline producer endpoint using the corresponding properties. It's described in the README.
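Since BEST_EFFORT mode carries on when the gateway cannot be reached, it can help to confirm separately that something is listening on the expected host and port. A minimal sketch, assuming the producer endpoint is http://localhost:8080/producer (the `/producer` path is an assumption here; adjust the URL to your deployment). It only tests TCP connectivity and does not verify that the listener is a Spline server.

```python
import socket
from urllib.parse import urlsplit


def endpoint_is_reachable(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL has none.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    url = "http://localhost:8080/producer"  # assumed endpoint, not verified
    print(url, "reachable:", endpoint_is_reachable(url))
```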

doraboughzela commented 3 years ago

Indeed, I changed spline.mode to REQUIRED and I was able to capture the lineage. In the Spline UI, when I open the detailed lineage, I am not able to get information about operations/transformations. Actually, I get this error when I click on an operation icon:

Server returned code: 500. Http failure response for http://localhost:8080/consumer/operations/f35713f6-2d8b-4b90-aff7-dfa229b9ca1a:1

Thank you for your help.

doraboughzela commented 3 years ago

One more question, please: along with the examples, I wanted to run PySpark code. I followed the instructions and started it like this:

  pyspark --packages za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:0.5.6 --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" --conf "spark.spline.producer.url=http://localhost:9090/producer"

This is the code that I was trying to run with pyspark, similar to the one in the Python example:

>>> from pyspark import SparkContext
>>> sc = SparkContext
>>> sc._jvm.za.co.absa.spline.harvester.SparkLineageInitializer.enableLineageTracking(spark._jsparkSession)
21/03/26 01:40:39 WARN QueryExecutionEventHandlerFactory: Spline lineage tracking is also configured for codeless initialization. It wont be initialized by this code call to enableLineageTracking now.
JavaObject id=o35
>>> sc.setSystemProperty('spline.mode','REQUIRED')
>>> sc.setSystemProperty('spline.producer.url','http://localhost:9090/producer')
>>> sc.setSystemProperty('spline.arangodb.url','http://localhost:8529/spline')
>>> sc.setSystemProperty('spline.arangodb.name','spline')
>>> spark.read.option("header", "true").option("inferschema", "true").csv("data/input/wikidata.csv").write.mode('overwrite').csv("data/results/python-sample.csv")

The code executes with no error. However, no lineage is captured.

Any idea why?

wajda commented 3 years ago

Regarding the UI error, please take a look at the server response body. When a 500 happens, the server should respond with a unique error ID. Use that ID to find the corresponding server-side error (in the Spline Server logs). There you should be able to find more details.
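One way to see the response body outside the browser is to request the failing URL directly. A minimal sketch using only the Python standard library; the helper name `fetch_status_and_body` is made up for illustration, and the exact shape of the error body (and where the error ID sits in it) is not assumed here.

```python
import urllib.error
import urllib.request


def fetch_status_and_body(url, timeout=5.0):
    """Return (status_code, body_text), also for HTTP error responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the error body is readable.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling it with the operation URL reported by the UI returns the status code together with whatever body the server sent, which can then be matched against the server logs.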

wajda commented 3 years ago

Regarding why the lineage is not captured: as I already explained above, there could be many reasons, and 99% of them should be reflected in the logs. Use the DEBUG or even TRACE level to find more details, but generally the INFO level should give you a sufficient amount of information.
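When the output is long, filtering it for Spline-related warnings and errors can make the relevant lines easier to spot. A rough sketch, assuming the console log format seen in the pyspark session above (`date time LEVEL Logger: message`); the function name `spline_warnings` is made up for illustration, and lines in other formats, such as the `[java]`-prefixed Maven output, are simply skipped.

```python
import re

# Matches lines such as:
# 21/03/26 01:40:39 WARN QueryExecutionEventHandlerFactory: Spline lineage ...
LOG_LINE = re.compile(
    r"^\S+ \S+ (?P<level>[A-Z]+) (?P<logger>[^:]+): (?P<message>.*)$"
)


def spline_warnings(lines):
    """Return (level, logger, message) for WARN/ERROR lines mentioning Spline."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # not in the assumed format
        if match["level"] not in ("WARN", "ERROR"):
            continue
        text = (match["logger"] + " " + match["message"]).lower()
        if "spline" in text:
            found.append((match["level"], match["logger"], match["message"]))
    return found
```

For example, `spline_warnings(open("driver.log", encoding="utf-8"))` would scan a saved log, where `driver.log` is a placeholder file name.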

wajda commented 3 years ago

>>> sc.setSystemProperty('spline.arangodb.url','http://localhost:8529/spline')
>>> sc.setSystemProperty('spline.arangodb.name','spline')

That is complete nonsense. Where did you get it from?

doraboughzela commented 3 years ago

Hello again,

I was trying to find users' examples on the web. I think with older versions of spline, some examples used to have this. I can see now that this is not the case. So far I have managed to capture lineage and learn from my mistakes.

Thank you for the work you are doing. DB

wajda commented 3 years ago

I think with older versions of spline, some examples used to have this

Those properties never existed; it's purely a result of internet users' creativity :)

AbsaOSS / spline-spark-agent

Unable to run examples #190