-
### Apache Iceberg version
1.5.0
### Query engine
Spark
### Please describe the bug 🐞
We're running Iceberg with Spark, using Spark Structured Streaming to read from a Kafka topic and write to an…
-
We need to develop a robust and scalable data ingest/ETL (Extract, Transform, Load) pipeline to facilitate the reading of eQTL (expression Quantitative Trait Loci) data from FTP sources, indexing it i…
-
We can start a java-style-go project to provide useful APIs from the Java community that Go lacks, e.g. a stream API, advanced data structures, and concurrent utilities based on shared memory.
The name 'java-st…
-
### Feature Request Template
**Feature Title**
*Provide a concise title that summarizes the feature.*
---
### Description
**Provide a clear and concise description of the feature, includi…
-
Many users need concurrent data structures but not necessarily lock-free or wait-free ones (which are already available in EMB²). For example, if an application is not timing-critical, blocking data s…
-
### Description
Spinoff from the exciting discussion on https://github.com/apache/lucene/pull/13472:
Lucene has made great gains recently on intra-query concurrency: using multiple threads (with a…
-
## Development Task
The current PITR is pretty slow because there are too many small files, and putting files via the Raft protocol makes us suffer from write amplification.
If we can merge files as …
-
-----
# Purpose
Parallelize the refresh pipeline to efficiently handle the download and analysis of large projects concurrently.
# Part 1 (Proof of Concept)
## Process
For a proof of concept (POC…
-
In a long-running program, things should not be able to grow indefinitely.
It would be nice if those data structures had a bounded counterpart.
I.e. you create the data structure with a maximum siz…
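
As a rough illustration of the bounded-counterpart idea, here is a minimal sketch of a fixed-capacity FIFO queue in Go (1.18+ for generics). The names `BoundedQueue`, `NewBoundedQueue` and `ErrFull` are hypothetical and not taken from any existing package. The issue text is cut off before it says what should happen once the maximum size is reached, so rejecting the new element with an error is purely an assumption here; evicting the oldest element or blocking the caller would be equally plausible policies.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrFull is a hypothetical error returned when the queue is at capacity.
var ErrFull = errors.New("bounded queue is full")

// BoundedQueue is a hypothetical FIFO queue backed by a fixed-size ring
// buffer, so its memory use cannot grow past the size given at creation.
type BoundedQueue[T any] struct {
	mu    sync.Mutex
	buf   []T
	head  int // index of the oldest element
	count int // number of stored elements
}

// NewBoundedQueue allocates the whole buffer up front.
// maxSize must be non-negative, otherwise make panics.
func NewBoundedQueue[T any](maxSize int) *BoundedQueue[T] {
	return &BoundedQueue[T]{buf: make([]T, maxSize)}
}

// Enqueue appends v, or returns ErrFull when the queue is at capacity
// (assumed policy: reject rather than evict or block).
func (q *BoundedQueue[T]) Enqueue(v T) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count == len(q.buf) {
		return ErrFull
	}
	q.buf[(q.head+q.count)%len(q.buf)] = v
	q.count++
	return nil
}

// Dequeue removes and returns the oldest element; ok is false when empty.
func (q *BoundedQueue[T]) Dequeue() (v T, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count == 0 {
		return v, false
	}
	v = q.buf[q.head]
	var zero T
	q.buf[q.head] = zero // drop the reference so it can be collected
	q.head = (q.head + 1) % len(q.buf)
	q.count--
	return v, true
}

func main() {
	q := NewBoundedQueue[string](2)
	fmt.Println(q.Enqueue("a"), q.Enqueue("b"))
	// The third insert exceeds the maximum size and is rejected.
	fmt.Println(q.Enqueue("c"))
	v, ok := q.Dequeue()
	fmt.Println(v, ok)
}
```

Allocating the buffer once at construction keeps memory use fixed for the lifetime of the queue, which is the property a long-running program cares about.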
-
For example `concurrenthashset.go`, `concurrentpriorityqueue.go`?
- Those who want performance above everything or don't have a use case can very well use the existing data structures
- Those who…
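
To make the suggestion more concrete, a file such as `concurrenthashset.go` might contain something along these lines. This is only a sketch under assumptions: the type name `ConcurrentHashSet`, its constructor and its method set are hypothetical, it wraps a plain Go map with a `sync.RWMutex` rather than using any lock-free technique, and it needs Go 1.18+ for generics.

```go
package main

import (
	"fmt"
	"sync"
)

// ConcurrentHashSet is a hypothetical set that is safe for use by
// multiple goroutines. It guards a plain map with a read/write mutex.
type ConcurrentHashSet[T comparable] struct {
	mu    sync.RWMutex
	items map[T]struct{}
}

// NewConcurrentHashSet returns an empty set.
func NewConcurrentHashSet[T comparable]() *ConcurrentHashSet[T] {
	return &ConcurrentHashSet[T]{items: make(map[T]struct{})}
}

// Add inserts v and reports whether it was not already present.
func (s *ConcurrentHashSet[T]) Add(v T) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.items[v]; exists {
		return false
	}
	s.items[v] = struct{}{}
	return true
}

// Contains reports whether v is in the set.
func (s *ConcurrentHashSet[T]) Contains(v T) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, exists := s.items[v]
	return exists
}

// Remove deletes v and reports whether it was present.
func (s *ConcurrentHashSet[T]) Remove(v T) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.items[v]; !exists {
		return false
	}
	delete(s.items, v)
	return true
}

// Len returns the number of elements currently stored.
func (s *ConcurrentHashSet[T]) Len() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.items)
}

func main() {
	set := NewConcurrentHashSet[int]()
	var wg sync.WaitGroup
	// Eight goroutines insert the same 100 values concurrently.
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 100; i++ {
				set.Add(i)
			}
		}()
	}
	wg.Wait()
	fmt.Println(set.Len()) // 100: duplicates are ignored
}
```

A mutex-guarded wrapper like this trades some throughput for simplicity, which fits the point above that callers who want performance above everything can keep using the existing data structures.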