-
**Is your feature request related to a problem? Please describe.**
Currently, Databricks is the only supported Spark adapter.
**Describe the solution you'd like**
Add support for the [Apache Spark…
-
@foxish
I was running a large-scale Spark TeraSort job against HDFS. The input size was 250 GB and the cluster had 9 nodes, each with 3 x 2 TB disks in addition to the 20 GB root disk.
HDFS tot…
-
I'm following the recent amplab tutorial using my own AWS account. Cluster launch finishes with an error "ERROR: Cluster health check failed for spark_ec2". I'd be grateful for pointers on how to so…
-
Flintrock's output is already much cleaner than spark-ec2's:
```
Launching 2 instances...
[52.91.67.xxx] SSH online.
[52.91.67.xxx] Installing Spark...
[52.91.213.xxx] SSH online.
[52.91.213.x…
-
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
### Search before asking
- [X] I have searched in the [issues](http…
-
With a Spark cluster launched and started with the ec2 script, is a Hadoop cluster ready to go, or is there just an easy command to start it? The reason I am asking is that I would like to enable log a…
-
The / partition is only 8G; after running the script, there is no free space.
```
[root@ip-10-143-137-174 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7.9G …
-
Right now it seems the latest Hadoop version in spark-ec2 is 2.4, but the Spark download page offers versions up to 2.7, and these are also available in AWS S3: http://s3.amazonaws.com/spark-related-package…
-
## Expected behavior
All links should be valid and working, and we should test for broken links.
## Actual behavior
Without an automated link check, broken links are sometimes added and others become…
-
I am getting the following error when I try to run the ETL script, in spite of adding the required libraries (PyGlue.zip) to my PYTHONPATH. Could you tell me how to resolve this?
Traceback (most re…