-
## Problem
When writing Spark DataFrames to data lake storage, the order of the columns in the DataFrame is important. For example, if a pipeline is appending Parquet files in the lake, if t…
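
A minimal sketch of one way to guard against column-order drift before an append, assuming PySpark-style DataFrames that expose `.columns` and `.select(...)`. The helper name `align_columns` and the `target_columns` argument are hypothetical and not part of the original issue.

```python
def align_columns(df, target_columns):
    """Return df with its columns reordered to match target_columns.

    Hypothetical helper: works with any object exposing `.columns` and
    `.select(*names)`, such as a PySpark DataFrame.
    """
    actual = set(df.columns)
    expected = set(target_columns)
    if actual != expected:
        # Fail early instead of silently writing a mismatched schema.
        missing = sorted(expected - actual)
        extra = sorted(actual - expected)
        raise ValueError(f"column mismatch: missing={missing}, extra={extra}")
    # Selecting by explicit name fixes the column order of the result.
    return df.select(*target_columns)
```

With a real Spark session this might be used as `align_columns(df, spark.read.parquet(path).columns).write.mode("append").parquet(path)`, where `path` stands for the location of the existing dataset.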
-
What might be useful: http://stackoverflow.com/questions/41427191/dataframe-into-dense-vector-spark
It uses concepts from the Coursera videos below.
### Notes
instantiate sparksession.build(). et…
-
subtask #47
-
The Named RDD feature is one of the reasons Spark Job Server is so great: sharing RDDs between queries is what makes interactive multiuser querying possible.
While it is possible to represent GraphX…
-
I am using this API to query a large MongoDB collection. Is there any way I can specify query filters to load only selected documents as a DataFrame, and not the whole collection? Probably some kind of …
-
As a Chapel programmer, I want to be able to use a DataFrame (similar in basic functionality to pandas for Python) in my Chapel application so that I can manipulate my data easily with Chapel.
Acce…
-
subtask #47 DatasetHolder.js
-
**Is your feature request related to a problem? Please describe.**
We could probably make finding answers to common questions a lot easier. Each should have a 2-sentence TL;DR with pointers to more d…
-
It would be nice to get some insight from the data ETL: DataFrames come with some built-in statistics, but MLlib provides more.
It would be good to access this through an ELK stack.
-
Hello,
Sorry for the silly question, but I can't seem to find a way to use the %%spark magic inside functions and/or loops. The use case would be like the following: I have a series of spark datafr…