ToucanToco / fastexcel

A Python wrapper around calamine
http://fastexcel.toucantoco.dev/
MIT License
116 stars 6 forks source link
arrow pandas polars python rust

fastexcel

A fast excel file reader for Python, written in Rust.

Based on calamine and Apache Arrow.

Docs available here.

Dev setup

Prerequisites

Python>=3.8 and a recent Rust toolchain must be installed on your machine. cargo must be available in your PATH.

First setup

On the very first time you setup the project, you'll need to create a virtualenv and install the necessary tools:

python -m venv .venv
source .venv/bin/activate
(.venv) make dev-setup

This will also set up pre-commit.

Installing the project in dev mode

In order to install the project in dev mode (for local tests for example), use make dev-install. This will compile the wheel (in debug mode) and install it. It will then be available in your venv.

Installing the project in prod mode

This is required for profiling, as dev mode wheels are much slower. make prod-install will compile the project in release mode and install it in your local venv, overriding previous dev installs.

Linting and formatting

The Makefile provides the lint and format extras to ease this.

Running the tests

make test

Running the benchmarks

Speed benchmark

make benchmarks

Memory benchmark

mprof run -T 0.01 python python/tests/benchmarks/memory.py python/tests/benchmarks/fixtures/plain_data.xls

Building the docs

make doc

Creating a release

  1. Create a PR containing a commit that only updates the version in Cargo.toml.
  2. Once it is approved, squash and merge it into main.
  3. Tag the squashed commit, and push it.
  4. The release GitHub action will take care of the rest.

Dev tips