-
### Describe the bug
Hi. I'm facing an issue using the PySpark+JDBC driver when column names contain spaces, like "Test Column". E.g. when trying to filter a table using the pure JDBC driver **or** spark-clickhouse-co…
paf91 updated
4 months ago
-
Task is to perform the harmonisation on GWAS Catalog summary statistics synched from the EBI FTP.
The full size of the dataset (on 2024-10-14) was:
```
❯ gsutil ls 'gs://gwas_catalog_inputs/raw_summa…
-
When `attach_distributed_sequence_column` is not used: works as expected
```python
>>> psidx = ps.Index([1, 2, 3, 4, 5])
>>>
>>> scol_name = psidx._internal.index_spark_column_names[0]
>>>
>>> sdf = psidx._in…
-
### Is your feature request related to a problem or challenge?
We're working on running some used-to-be-Spark pipelines through DataFusion. One case we've noticed where DataFusion doesn't support som…
-
When running a comparison on dataframes with a single column, the following exception is thrown:
```
/opt/venv/lib/python3.8/site-packages/datacompy/spark.py:356: in rows_both_mismatch
self._…
-
### Describe the bug
According to the [Substrait specification](https://substrait.io/relations/logical_relations/#project-operation) project relations emit all of the input fields followed by the l…
-
**Describe the bug**
Error thrown for dbt_columns, dbt_models and _dbt_sources table creation during first dbt run after elementary is added to the dbt project
03:53:00 Completed with 3 errors an…
-
## Bug
Z-ordering on String columns does not seem to work. The error does not help much either.
### Describe the problem
In my dataset, I have 2 columns that I use for my predicate for querying data a…
-
I am using `dbplyr` via `sparklyr`. I would like to employ Spark's POSEXPLODE operator, but I can't identify a way to invoke it inside `mutate` or `summarize` because it returns two columns and dplyr'…
-
## Bug
#### Which Delta project/connector is this regarding?
- [x] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)
### Describe the problem
I'm using a Jup…
stvno updated
5 months ago