Python Polars: The Definitive Guide
Welcome to the official repository of the book Python Polars: The Definitive Guide by Jeroen Janssens and Thijs Nieuwdorp.
The book is still being written and is scheduled to be published by O'Reilly in January 2025.
Description
Get ready to speed up your data analysis and start working with larger-than-memory datasets.
Polars offers a blazingly fast, multi-threaded, elegant API for data loading, manipulation, and processing.
Authors Jeroen Janssens and Thijs Nieuwdorp walk you through every aspect of Python Polars as they tackle practical use cases using real-world datasets.
You’ll not only learn the syntax, but also understand the underlying concepts.
You don’t need to have any experience with Pandas or Spark, but if you do, this book will help you make a smooth transition.
With this definitive guide at your side, you’ll be able to:
- Process larger-than-memory datasets at record speed
- Apply the eager, lazy, and streaming APIs of Polars and decide when to use which
- Transition smoothly from Pandas or Spark to Polars
- Integrate Polars into your existing codebase
- Work with Arrow and Parquet to efficiently read and write data
- Translate complex ETL tasks into efficient and elegant queries
Outline
Note that this outline is subject to change.
Front matter
- Foreword by Ritchie Vink, creator of Polars
- Acknowledgements
Part 1: Begin
- Introducing Polars
- First Steps
- Transitioning from Pandas to Polars
Part 2: Load
- Data Types and Data Structures
- Eager and Lazy APIs
- Reading and Writing Data
Part 3: Express
- Beginning Expressions
- Continuing Expressions
- Combining Expressions
Part 4: Transform
- Selecting and Creating Columns
- Filtering and Sorting Rows
- Working with Special Data Types
- Summarizing and Aggregating
- Joining and Concatenating
- Reshaping
Part 5: Advance
- Creating Visualizations
- Extending Polars
- Polars Internals