-
**Describe the problem you faced**
I am new to Hudi and Parquet, and I have a table design question. I have structured my table as follows:
```
{
"name": "foobar",
"metadata" : {
"ke…
-
Create a vocabulary of data-related terminology - a glossary - for use by end users of IDN systems & methods and partners.
The glossary may need to be presented in various physical/digital forms, may…
-
### Feature Request
Allow backup/restore or migration of event logs and session recordings between different storage backends.
### Motivation
Today, it's possible to migrate configuration (us…
-
Create a narrow/big table:
```
CREATE TABLE event_records AS (
SELECT
timestamp_sequence(to_timestamp('2023-05-15T22:00:00.123456Z', 'yyyy-MM-ddTHH:mm:ss.SSSUUUz'), 10000L) ts,
rnd_…
-
It's currently not possible to set properties like `casesensitivity` or `normalization` on a dataset after its creation; to set these, you should create a new dataset with these properties changed, sync…
-
Introduction
============
Datacube now supports network resources, particularly Cloud Optimized GeoTIFFs residing on S3 or other HTTP based storage systems. Network resources might experience inte…
-
### What happened + What you expected to happen
I tried to test Ray Data for parallel processing of binary data that I store in my S3 bucket. There are lots of files I want to process (pdfs, images…
-
- Follow-up to #7860
Spans currently live forever on the scheduler, and with them the TaskGroups they aggregate.
This meta-data outlasts the Tasks, which are instead dereferenced as soon as they…
-
It is possible to define the same field multiple times for the same data_stream; the system package contains many such duplications,
example: https://github.com/elastic/integrations/pull/6118#discussio…
-
We follow the tutorial at https://ondata.blog/articles/getting-started-apache-spark-pyspark-and-jupyter-in-a-docker-container/. It offers a short guide up to the first query.
It relies on a conta…