chmandrade closed this issue 7 years ago
@chmandrade this is pretty old and looks like it's fallen off the radar a bit. Do you still have this issue? If so, can you try with DataCleaner CE 5.3.0?
Let me try this on the new version. Please close the ticket, and if I have any problems I will open a new one. Thanks for your support.
I am running DataCleaner on Spark, but when I run with a local file as the result I get empty result files in the result directory. When I run with S3 as the output I get the stack trace below:
Running command:

    /Users/henriqueandrade/Documents/App/spark/spark-1.6.2-bin-hadoop2.6/bin/spark-submit \
      --class org.datacleaner.spark.Main \
      --master local[1] \
      DataCleaner-env-spark-5.1.3-SNAPSHOT-jar-with-dependencies.jar \
      conf_local.xml \
      vanilla-job.analysis.xml \
      jobAbsolutePath.properties

This is the stack trace:
    Exception in thread "main" org.apache.metamodel.MetaModelException: java.io.IOException: Mkdirs failed to create /results (exists=false, cwd=file:/Users/henriqueandrade/Documents/App/spark/DataCleaner/engine/env/spark/target)
        at org.apache.metamodel.util.HdfsResource.wrapException(HdfsResource.java:278)
        at org.apache.metamodel.util.HdfsResource.write(HdfsResource.java:237)
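For context, the "Mkdirs failed" message originates in Hadoop's filesystem layer, which MetaModel's HdfsResource wraps. When a result path has no scheme, Hadoop resolves it against the default filesystem (here the local file:// filesystem), so /results points at the root of the local disk, where an ordinary user typically cannot create directories. Below is a minimal sketch that reproduces the same check; the MkdirsCheck class name is hypothetical and not part of DataCleaner:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MkdirsCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Without fs.defaultFS set, a scheme-less path resolves to the
            // local file:// filesystem, so "/results" is the disk root.
            Path results = new Path("/results");
            FileSystem fs = results.getFileSystem(conf);
            // FileSystem.mkdirs returns false when the directory cannot be
            // created, which surfaces as "Mkdirs failed to create ...".
            if (!fs.mkdirs(results)) {
                System.err.println("mkdirs failed for " + results
                        + " on " + fs.getUri());
            }
        }
    }

If that is indeed the cause here, writing the result to a fully qualified URI (for example file:///some/writable/dir, or an hdfs:// or s3 URI) should make the target filesystem and location unambiguous.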
Sample Files: https://gist.github.com/chmandrade/bf249c82a9ccf64bf4be61abd9bd2395