Pre-merge request checklist (to be completed by the one making the request):
[x] I have performed a full review of this code myself.
For Python code in PySpark-specific sections, all code should have been run in Jupyter notebooks.
For code in sections of the book containing both Python and R code, the page of the book should be constructed as described in the contributing guide and converted to a markdown file.
[x] I have formatted the outputs of code blocks correctly (to match other outputs in the book and in line with the style guide [coming soon])
[x] I have built the book as outlined in the contributing guide and confirmed that any additional/modified content is displaying as expected.
Details of this request:
Add a page of summary guidance/tips on working with big data to the front of the Spark Analysis section of the book
Things to note about this request:
The aim of the page is to be general enough to be useful to all users
The page will also be linked from the CDP billing pages so users can start thinking about how to keep compute costs as low as possible
Requirements for review:
Content:
[ ] Content makes sense and is accurate
[ ] Best practice learned from case studies has been incorporated
[ ] Content is clear
[ ] No important information is missing
[ ] No typos
Formatting:
[ ] Page is formatted properly and displays correctly when building the book
[ ] Formatting is consistent with the rest of the book
[ ] Links work correctly
[ ] Figures display correctly and are clear/accurate