-
In order to get release 1.6 of parquet-mr out, we have to do some updates.
For Apache policy:
1. -Update the header in all Parquet source files (see PARQUET-72 for examples)-
2. -Add DISCLAIMER and K…
-
Hi there! First of all, thank you for the tooling, it's incredibly powerful. I have been using `json2parquet` to process some intricate `.jsonl` files. I have had a good time with small sizes but not f…
-
### Description
This issue starts a discussion about enabling:
- linter: [revive](https://revive.run/) - Fast, configurable, extensible, flexible, and beautiful linter for Go. Drop-in replacement …
-
Following the [instructions on launching Databricks](https://docs.rapids.ai/deployment/stable/platforms/databricks/) has started failing.
On step 5 of section 2 where it says "Create and launch yo…
-
How to reproduce:
```sh
git clone https://github.com/apache/spark.git && cd spark
git fetch origin pull/26804/head:PARQUET-1746
git checkout PARQUET-1746
build/sbt "sql/test-only *StreamSuite"…
```
-
# TLDR
* Keep `datafusion-cli` in the apache/datafusion repo
* Make a new repo with a new CLI called `dfdb` (or `datafusion-cli++` or `dfcli`) which is purposely designed for running queries against …
-
Hi, running the following command in cmd, `pip install git+https://github.nrel.gov/NREL/ditto.git@master`, returns the attached error.
Cloning https://github.nrel.gov/NREL/ditto.git (to master) to c:\use…
-
The problematic Avro and Thrift schemas are:
```
record AvroArrayOfArray {
array<array<int>> int_arrays_column;
}
```
and
```
struct ThriftListOfList {
1: list<list<i32>> intArraysColumn;
}
```
They are converted to the…
-
**Describe the bug**
cuDF DataFrames indexed by a Timestamp range can be accessed using `.loc[]` without any problem. However, if the cuDF DataFrame is indexed with a MultiIndex with timestamps as th…
-
I'm running into an error when trying to write the following test `GeoDataFrame` into a `.gpkg` file:
```python
import geopandas
db = geopandas.read_file('test.geojson')
db.to_file('test.gpkg', …
```