dfe-analytical-services / analysts-guide

A static website to hold guidance, helpful links and code examples for analysts working in DfE.
https://dfe-analytical-services.github.io/analysts-guide/
MIT License

Update page on Databricks personal clusters #79

Closed jen-machin closed 2 months ago

jen-machin commented 3 months ago

Is your feature request related to a problem? Please describe. Following on from issue #57, we also need to edit the existing guidance page about personal clusters, which currently lives here: https://dfe-analytical-services.github.io/analysts-guide/ADA/databricks_rstudio_personal_cluster.html The initial part of issue #57 will be resolved by PR #78, which deals with SQL warehouses. Most analysts with existing pipelines will probably use these, because they allow an almost like-for-like replacement of the code they currently use to connect to tables in SQL Server.

Describe the solution you'd like The content of the page needs to be reframed so it's not just about setup, it's also about

We also need to consider whether we want to discuss the use of sparklyr with a personal cluster. It's not required, and it means that the user needs to have Python installed locally to get it working. In the spirit of being the Stats Development Team, I think this is something we could/should cover, with caveats where required. It's currently not clear to me how many people would end up using this method regularly.

Additional context Worth reading through PR #78 to see the extensive discussion in the comments!