Miscellaneous projects and scripts.

Author and contact: Luca.Canali@cern.ch

Spark and Performance Engineering

Folder	Description
Spark Dashboard	A tool for monitoring Apache Spark; use it to build a performance dashboard and troubleshoot Spark jobs.
Spark Notes	Miscellaneous tips and code snippets about Apache Spark.
Spark for Physics	Examples, with code and data, of how Apache Spark can be used for High Energy Physics data analysis.
Performance Testing	Code and examples, including: a tool to run TPCDS at scale with PySpark and collect execution metrics; tools for load-testing CPUs, written in Python and Rust; notes on how to use tooling for performance measurements.

Folder	Description
Deep Learning Notes	Notes and examples on Deep Learning tools and related data pipelines.
Pyspark_SQL_Magic_Jupyter	How to write Jupyter SQL magic functions for PySpark and Spark SQL.
Trino and Presto on Jupyter	Example of using Trino or Presto from a Jupyter notebook.
PostgreSQL and YugabyteDB on Jupyter	Example of using PostgreSQL or YugabyteDB from a Jupyter notebook.
Oracle_Jupyter	Examples of how to query Oracle using Jupyter/IPython notebooks.
Impala_SQL_Jupyter	Examples of how to run SQL on Apache Impala using Jupyter/IPython notebooks.
SQL_color_Mandelbrot	How to use SQL to compute and display the Mandelbrot set with colors. Examples for Oracle and PostgreSQL.
PLSQL_Neural_Network	An example of how to deploy a DL serving engine for Oracle using PL/SQL.