petabyte Search Results

1000+ results
for petabyte


numberlabs-developers/hudi #228

[SUPPORT] New User Table Design Query

**Describe the problem you faced** I am a new user to Hudi and Parquet, and I have a table design question. I have structured my table in the following way: ```{ "name": "foobar", "metadata" : { "ke…

torvalds-dev-testbot[bot] updated 2 months ago
6
idn-au/vocab-data #26

Create a data terminology glossary vocab

Create a vocabulary of data-related terminology - a glossary - for use by end users of IDN systems & methods and partners. The glossary may need to be presented in various physical/digital forms, may…

Metaduck updated 1 week ago
4
gravitational/teleport #3715

Backup/restore or migration of event logs and session record…

### Feature Request Allow backup/restore or migration of event logs and session recordings between different storage backends. ### Motivation Today, it's possible to migrate configuration (us…

awly updated 1 year ago
2
questdb/questdb #3674

Optimise `select distinct col from table` so that it uses on…

Create a narrow/big table: ``` CREATE TABLE event_records AS ( SELECT timestamp_sequence(to_timestamp('2023-05-15T22:00:00.123456Z', 'yyyy-MM-ddTHH:mm:ss.SSSUUUz'), 10000L) ts, rnd_…

marregui updated 1 year ago
8
openzfs/zfs #13080

Allow overriding some init-only properties if the dataset do…

It's currently not possible to set properties like `casesensitivity` or `normalization` on a dataset after its creation; to set these, you should create a new dataset w/ these properties changed, sync…

mqudsi updated 2 years ago
3
opendatacube/datacube-core #802

Support retries when reading from network resources

Introduction ============ Datacube now supports network resources, particularly Cloud Optimized GeoTIFFs residing on S3 or other HTTP based storage systems. Network resources might experience inte…

Kirill888 updated 1 week ago
4
ray-project/ray #44215

[Ray Data] read_binary_files does not load data from S3 in p…

### What happened + What you expected to happen I tried to test Ray Data for parallel processing of binary data that I store in my S3 bucket. There are lots of files I want to process (pdfs, images…

DevKretov updated 6 months ago
2
dask/distributed #7977

Spans lifetime management

- Follow-up to #7860 Spans currently live forever on the scheduler, and with them the TaskGroups they aggregate. This meta-data outlasts the Tasks, which are instead dereferenced as soon as they…

crusaderky updated 1 year ago
1
elastic/integrations #6412

[System] remove duplicated fields

It is possible to define the same field multiple times for the same data_stream, in system package there are lots of duplications, example: https://github.com/elastic/integrations/pull/6118#discussio…

tetianakravchenko updated 1 month ago
5
siradam/DataMining_Project #17

Optimisation: use Big Data architecture to replace relationa…

We follow this tutorial: https://ondata.blog/articles/getting-started-apache-spark-pyspark-and-jupyter-in-a-docker-container/. It offers a short guide up to the first query. It relies on a conta…

lorenzznerol updated 3 years ago
4

