-
**Describe the bug**
I have a data frame SRC that is backed by a Parquet file, and two data frames AGG1 and AGG2 that have the same set of values for a key column but different columns for the rema…
-
For example, with this dummy code:
``` R
library(sparklyr)
library(dplyr)
library(ggplot2)
library(survival)
library(sparkapi)
library(nycflights13)
devtools::load_all(".")
flights_path %
in…
```
-
Hi everyone,
I am trying to create a new column that is the sum of several other columns. With dplyr on an R data frame, this is possible using rowSums(tbl[,n1:n2]).
When I tried it on a spark data …
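For context, a row-wise sum in Spark is usually written as a column expression (`x1 + x2 + ...`) rather than with a matrix helper like `rowSums`. Below is a minimal Python sketch of that idea, not the approach from this issue: it only builds the Spark SQL expression string. The function name `row_sum_expr`, the alias `row_sum`, and the column names are hypothetical and chosen for illustration.

```python
from typing import Iterable


def row_sum_expr(columns: Iterable[str], alias: str = "row_sum") -> str:
    """Build a Spark SQL expression string that adds the given columns row by row.

    Hypothetical helper for illustration; it does not talk to Spark itself.
    """
    cols = list(columns)
    if not cols:
        raise ValueError("need at least one column to sum")
    # Backtick-quote each identifier (doubling any embedded backtick)
    # so that names containing spaces or dots still parse.
    quoted = ["`" + c.replace("`", "``") + "`" for c in cols]
    return " + ".join(quoted) + " AS `" + alias.replace("`", "``") + "`"


if __name__ == "__main__":
    # Assumed column names, for demonstration only.
    print(row_sum_expr(["x1", "x2", "x3"]))
```

With PySpark, a string like this could be passed to `df.selectExpr("*", row_sum_expr(cols))`. Note that a NULL in any of the summed columns makes the whole sum NULL, so this does not behave like `rowSums(..., na.rm = TRUE)`.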
-
It seems like a good idea to implement this as a convenience helper method as part of the sparklyr sdf interface. Thanks to @nathaneastwood for suggesting it!
We need to be mindful of the following thoug…
-
Under "Option 2: User-Defined Function", the code is copied from the example above ("Option 1: Explode and Collect"). I believe it's missing a UDF that uses `map`.
![IMG_5158](https://user-images.githubu…
-
Dear developers,
I think I have spotted a bug in how PyRDF partitions dataframes. Here is a simple example snippet that shows it:
config: (I use the most recent PyRDF version i…
-
Is there any reason why you did not go with the [Apache Arrow](https://arrow.apache.org/) format from the beginning?
It would at least be nice if you allowed `to_arrow_table` and `from_arrow_table` co…
-
Consider the scenario where we have the following schema
```
message Message {
string col_1 = 1;
string col_2 = 2;
}
```
Now, we create a spark DF with this schema and write to a File…
-
Good afternoon!
I figured out most of my issues, which chiefly came down to the required use of VectorAssembler for my tabular data.
However, in following this guide: https://databricks.com/note…