Closed data-storyteller closed 2 years ago
moving to https://issues.apache.org/jira/browse/HUDI-3262 for work tracking
@data-storyteller : I tested the integ test bundle on the latest master and it is all good. I have attached logs in the linked JIRA. Is your env Spark 2 or Spark 3?
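One quick way to answer that question (a sketch, assuming the same `adhoc-2` container used in the command in this issue):

```shell
# Prints the Spark version banner; works for both Spark 2.x and 3.x
docker exec -i adhoc-2 spark-submit --version
```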
@data-storyteller @nsivabalan I see the param `--packages org.apache.spark:spark-avro_2.11:2.4.0`, while the Spark version mentioned is 2.4.7. @data-storyteller, can you align these versions? Both should be 2.4.7.
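For the record, the aligned coordinate would look like this (assuming the Scala 2.11 build of spark-avro is the right one for this setup, which matches the `_2.11` suffix already in the command):

```shell
# The spark-avro artifact version should match the Spark version in use (2.4.7 here)
--packages org.apache.spark:spark-avro_2.11:2.4.7
```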
I was able to reproduce this with the latest master. Will investigate further.
Thanks @nsivabalan for the fix. This issue is resolved now.
Describe the problem you faced
Running the integ test suite on the Docker setup fails; see Additional context below for the full command and stacktrace.
To Reproduce
Steps to reproduce the behavior:
1. Bring up the Hudi Docker demo setup.
2. Run the spark-submit command listed under Additional context.
Expected behavior
The integ test suite job should complete successfully.
Environment Description
Hudi version : latest (master)
Spark version : 2.4.7
Hive version :
Hadoop version :
Storage (HDFS/S3/GCS..) :
Running on Docker? (yes/no) : Yes
Additional context

Running the integ test on the Docker setup. The tests are failing with the following stacktrace. Command:
```
docker exec -i adhoc-2 /bin/bash spark-submit \
  --packages org.apache.spark:spark-avro_2.11:2.4.0 \
  --conf spark.task.cpus=1 \
  --conf spark.executor.cores=1 \
  --conf spark.task.maxFailures=100 \
  --conf spark.memory.fraction=0.4 \
  --conf spark.rdd.compress=true \
  --conf spark.kryoserializer.buffer.max=2000m \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.memory.storageFraction=0.1 \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.sql.hive.convertMetastoreParquet=false \
  --conf spark.driver.maxResultSize=12g \
  --conf spark.executor.heartbeatInterval=120s \
  --conf spark.network.timeout=600s \
  --conf spark.yarn.max.executor.failures=10 \
  --conf spark.sql.catalogImplementation=hive \
  --conf spark.driver.extraClassPath=/var/demo/jars/* \
  --conf spark.executor.extraClassPath=/var/demo/jars/* \
  --class org.apache.hudi.integ.testsuite.HoodieTestSuiteJob /opt/$HUDI_JAR_NAME \
  --source-ordering-field test_suite_source_ordering_field \
  --target-base-path /user/hive/warehouse/hudi-integ-test-suite/output \
  --input-base-path /user/hive/warehouse/hudi-integ-test-suite/input \
  --target-table table1 \
  --props file:/var/hoodie/ws/docker/demo/config/test-suite/$PROP_FILE \
  --schemaprovider-class org.apache.hudi.integ.testsuite.schema.TestSuiteFileBasedSchemaProvider \
  --source-class org.apache.hudi.utilities.sources.AvroDFSSource \
  --input-file-size 125829120 \
  --workload-yaml-path file:/var/hoodie/ws/docker/demo/config/test-suite/$YAML_NAME \
  --workload-generator-classname org.apache.hudi.integ.testsuite.dag.WorkflowDagGenerator \
  --table-type $TABLE_TYPE \
  --compact-scheduling-minshare 1 $EXTRA_SPARK_ARGS \
  --clean-input --clean-output
```
Stacktrace