Fuenfgeld / 2022TeamADataEngineeringBC

This is a repository for a Data Engineering Tutorial
MIT License

Investigate Spark Cluster Mode #26

Closed by Thunfischpirat 2 years ago

Thunfischpirat commented 2 years ago

Apparently there is a company called Databricks that offers a free allowance of cluster usage for Spark applications. I will look into this as part of my investigation into whether Spark is a viable topic for our tutorial. #23

Thunfischpirat commented 2 years ago

I have set up an account, uploaded some data, and created a cluster, on which I then ran a small PySpark script that accesses the uploaded data. Since the data I uploaded wasn't very big, I couldn't see any noticeable speed improvement over our notebook script. #23
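For reference, a comparison like the one described above could be sketched roughly as follows. This is not the script that was actually run; it is a minimal sketch that assumes the uploaded data is a CSV file with a header row. The helper names `timed` and `count_rows_with_spark`, the app name, and the file path in the usage comment are all hypothetical placeholders.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def count_rows_with_spark(path: str) -> int:
    """Count the rows of a CSV file with PySpark (assumes a header row)."""
    # Imported lazily so the timing helper also works without PySpark installed.
    from pyspark.sql import SparkSession

    # getOrCreate() reuses an already running session if one exists.
    spark = SparkSession.builder.appName("cluster-mode-check").getOrCreate()
    df = spark.read.csv(path, header=True, inferSchema=True)
    # count() is an action, so it forces Spark to actually read the data.
    return df.count()


# Usage on the cluster (hypothetical path, replace with the uploaded file):
# rows, seconds = timed(lambda: count_rows_with_spark("/FileStore/tables/example.csv"))
# print(f"{rows} rows counted in {seconds:.2f} s")
```

Running the same `timed` wrapper around the notebook version of the job would give a like-for-like wall-clock comparison. With small inputs, the measured time likely reflects startup and scheduling overhead more than the actual work, so a larger dataset would probably be needed to see a difference.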