-
I'm trying to run a simple example using a pre-trained pipeline from the Spark NLP library. I get an error when I'm downloading the pipeline:
### code
```
import os
from pyspark.sql import Spa…
-
Received the following error using the default installation of Hudi in EMR 5.29.0 (Hudi version 0.5.0):
`RetryInvocationHandler: Exception while invoking ConsistencyCheckerS3FileSystem.open over null…
-
* [x] master
* [x] 4-2-stable
-----
I have noticed with Amazon S3 that every so often on an upload the HEAD request will return a not found HTTP response. If something is done in iRODS on th…
-
```
Py4JJavaErrorTraceback (most recent call last)
/usr/local/share/jupyter/kernel-launchers/python/scripts/launch_ipykernel.py in ()
----> 1 df = spark.read.format("csv").option("header", "true").…
-
Hello.
I have a Presto cluster on EMR 5.29.0.
I was able to create a Hive schema like this:
`CREATE SCHEMA hive.bigdata WITH (location = 's3://vogo-big-data-lake/');`
and was also able to create one table…
-
**Describe the problem you faced**
Given a savepoint, rollback fails with no error messages.
**To Reproduce**
Environment: EMR 6.4.0, S3
Steps to reproduce the behavior:
1. create Hudi …
-
Delta Lake is failing with the following error. EMR retries the step when it fails, so we are seeing duplicate records in the target. Will Delta Lake roll back on this type of error?
```
T…
kkr78 updated
2 years ago
-
**Describe the problem you faced**
My MOR Hudi Table is on S3.
I need to query the table frequently (say, once every minute) from Spark SQL.
I use https://hudi.apache.org/docs/querying_dat…
-
S3 provides multiple options to encrypt data at rest:
Server-Side Encryption (SSE): requests that the S3 servers encrypt objects before saving them to disk and decrypt them when the objects are downloaded.
Cli…
-
Hello,
My organization runs an EMR cluster in one Amazon account, but we save data to an S3 bucket (through EMRFS) in another account. In order to accomplish this, we configure the SparkSession as …
ghost updated
3 years ago