Author and contact: Luca.Canali@cern.ch
Folder | Description |
---|---|
Spark Dashboard | A tool for Apache Spark monitoring. Use it to build a performance dashboard and troubleshoot Spark jobs. |
Spark Notes | Miscellaneous tips and code snippets about Apache Spark. |
Spark for Physics | Examples, with code and data, of how Apache Spark can be used for High Energy Physics data analysis. |
Performance Testing | Code and examples, including: a tool to run TPCDS at scale with PySpark and collect execution metrics; tools for load-testing CPUs, written in Python and Rust; notes on how to use tooling for performance measurements. |
Folder | Description |
---|---|
Deep Learning Notes | Notes and examples on Deep Learning tools and related data pipelines. |
Pyspark_SQL_Magic_Jupyter | How to write Jupyter SQL magic functions for PySpark and Spark SQL. |
Trino and Presto on Jupyter | Example of using Trino or Presto in a Jupyter notebook. |
PostgreSQL and YugabyteDB on Jupyter | Example of using PostgreSQL or YugabyteDB in a Jupyter notebook. |
Oracle_Jupyter | Examples of how to query Oracle using Jupyter/IPython notebooks. |
Impala_SQL_Jupyter | Examples of how to run SQL on Apache Impala using Jupyter/IPython notebooks. |
SQL_color_Mandelbrot | How to use SQL to compute and display the Mandelbrot set with colors. Examples for Oracle and PostgreSQL. |
PLSQL_Neural_Network | An example of how to deploy a DL serving engine for Oracle using PL/SQL. |