-
As a follow-up to #3155, I tried the following inside a `spark_apply` closure:
persisted_df
-
I'm writing Ammonite scripts that query a Postgres database, then do analysis on them using Spark DataFrames. I would like to use Quill queries for both databases, but I'm finding it difficult to figu…
-
I was trying to use one of the explain_document_ml pretrained models for text analytics. It worked fine on a smaller dataset, but when I tried it on a large dataset, an out-of-memory error started t…
-
## Environment data
- VS Code version: 1.57.1
- OS and version: macOS Big Sur version 11.4
- Python version (& distribution if applicable, e.g. Anaconda): Python 3.7.11
- Type of virtua…
-
I recently tried installing spark-alchemy with Spark 3.0 using the following:
`spark-shell --repositories https://dl.bintray.com/swoop-inc/maven/ --packages com.swoop:spark-alchemy_2.12:1.0.0`
…
-
When converting a DataFrame/Series of type `object` (i.e. Strings) with `np.nan` values to Koalas DataFrames and back, the former `np.nan` values are replaced with `None` as can be seen below:
```p…
-
**Is your feature request related to a problem? Please describe.**
Related to #216: SDL could create tables if they don't exist.
**Describe the solution you'd like**
SDL could generate tables on…
-
The result of `dplyr::n_distinct` applied to a Spark data frame differs from the result when it is applied locally. The `na.rm = TRUE` argument is ignored when applied to a Spark data…
-
Sort merge join can be faster than hash join when the Series are already sorted, and possibly even when they are not.
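
To make the comparison concrete, here is a minimal pure-Python sketch of the two join strategies on lists of `(key, value)` pairs. It is only an illustration of the general technique, not the library's implementation; the function names `hash_join` and `sort_merge_join` are hypothetical, and the merge variant assumes both inputs are already sorted by key.

```python
from collections import defaultdict


def hash_join(left, right):
    """Inner join: build a hash table on `right`, then probe it with `left`."""
    table = defaultdict(list)
    for key, value in right:
        table[key].append(value)
    # .get avoids inserting empty entries for keys that have no match
    return [(key, lv, rv) for key, lv in left for rv in table.get(key, [])]


def sort_merge_join(left, right):
    """Inner join of two lists that are ALREADY sorted by key (assumption)."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit the cross product of the two runs
            for a in range(i, i_end):
                for b in range(j, j_end):
                    out.append((lk, left[a][1], right[b][1]))
            i, j = i_end, j_end
    return out


# Hypothetical sample data, sorted by key
left = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
result = sort_merge_join(left, right)
```

When the inputs are sorted, the merge variant makes a single sequential pass over both sides and needs no hash table, which is where a speed-up could come from. For unsorted inputs, the cost of sorting first has to be weighed against the cost of building the hash table.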
-