-
Hi,
We are managing offsets externally in HBase. We haven't enabled Spark checkpointing.
So every time we start the job, we provide the offsets, which are correct when cross-checked with Kafk…
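
To make the cross-check concrete, here is a minimal sketch of validating externally stored offsets against the broker's valid range before a job starts. The report is truncated, so the actual failure is unknown. The function names, the `(topic, partition)` key layout, and the sample numbers are all hypothetical, and the JSON shape is the one accepted by the `startingOffsets` option of Spark's Structured Streaming Kafka source, which may differ from the API the author uses.

```python
import json


def clamp_offsets(stored, earliest, latest):
    """Clamp each stored offset into [earliest, latest] for its partition.

    All three arguments map (topic, partition) tuples to integer offsets.
    In a real job, `stored` would come from HBase and the bounds from Kafka.
    """
    clamped = {}
    for tp, offset in stored.items():
        clamped[tp] = min(max(offset, earliest[tp]), latest[tp])
    return clamped


def to_starting_offsets_json(offsets):
    """Serialize offsets as {"topic": {"partition": offset}} JSON."""
    by_topic = {}
    for (topic, partition), offset in offsets.items():
        by_topic.setdefault(topic, {})[str(partition)] = offset
    return json.dumps(by_topic, sort_keys=True)


# Hypothetical sample data: partition 0 has aged out, partition 1 is ahead.
stored = {("events", 0): 5, ("events", 1): 120}
earliest = {("events", 0): 10, ("events", 1): 100}
latest = {("events", 0): 50, ("events", 1): 110}

safe = clamp_offsets(stored, earliest, latest)
starting_offsets = to_starting_offsets_json(safe)
```

Clamping silently skips or replays data, so a production job would more likely log or fail when a stored offset falls outside the range; the sketch only shows where the comparison happens.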
-
I would like to propose implementing our project with the following technologies:
Akka,
Kafka,
Spark,
Spark Streaming,
Cassandra
This stack, which seems to have become very popular recently, was presented…
-
## data-flow
source system → source connector → Kafka → sink connector → target system
![image](https://user-images.githubusercontent.com/21363169/87251934-463b5780-c4aa-11ea-8103-fd30df69a19f.png)
…
-
I am using Spark to read from Kafka and write a Hudi table to MinIO. The code runs normally. I then tried to add concurrency, following the demo, to improve the speed, but I get a ClassNotFound error, and the program is …
-
**Describe the problem you faced**
We have a Hudi job that runs nightly and processes around 10 million per run. It usually runs fine for a few days, but then we suddenly encounter timeout issues around 5th…
-
Hudi 0.14 is used for storage in a Ceph cluster with an S3 connection and TLS authentication.
For Spark authentication on S3, the following configuration parameters are used: "spark.hadoop.fs.s3a.endpoi…
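
The parameter list in this report is cut off, so which keys the author actually set is unknown. As a hedged illustration only, the sketch below collects a few standard Hadoop S3A properties that are commonly set for an S3-compatible store. The helper name `s3a_spark_conf` and every value are placeholders and are not taken from the report.

```python
def s3a_spark_conf(endpoint, access_key, secret_key, use_tls=True):
    """Return Spark settings for an S3-compatible endpoint (assumed keys).

    The "spark.hadoop." prefix forwards each property to the Hadoop
    configuration used by the S3A filesystem connector.
    """
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Path-style access is typically needed for non-AWS stores such as Ceph.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": "true" if use_tls else "false",
    }


# Placeholder values; real credentials should come from a secret store.
conf = s3a_spark_conf("https://ceph.example.internal", "ACCESS", "SECRET")

# With PySpark installed, the settings could be applied like this:
#   builder = SparkSession.builder
#   for key, value in conf.items():
#       builder = builder.config(key, value)
```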
-
Hudi 0.14 is used for storage in a Ceph cluster with an S3 connection and TLS authentication.
For Spark authentication on S3, the following configuration parameters are used: "spark.hadoop.fs.s3a.endpoint…
-
Hudi 0.14 is used for storage in a Ceph cluster with an S3 connection and TLS authentication.
For Spark authentication on S3, the following configuration parameters are used: "spark.hadoop.fs.s3a.endpoint…
-
# ENV:
kylin 3.1.3
hadoop 2.5.0
hive 2.3.9
# ECHO INFO:
```
# /***/apache-kylin/bin/kylin.sh start
Using hadoop conf cached dependency...
...................................................[PASS…
-
It works well for me when I use it in pyspark:
```
Python 2.7.5 (default, Apr 11 2018, 07:36:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux2
Type "help", "copyright", "credits" or "license" fo…