Open jmcnamara opened 10 months ago
@jmcnamara Would you consider having the next version just rely on the Rust version through pyo3 to realize performance benefits for the Python interface?
@max-muoto
I don't think that would be practical from a maintenance point of view or desirable from an end user point of view. At the moment the Python version has zero dependencies and more functionality than the Rust version.
However, I would see scope for a "lite" version of XlsxWriter + pyo3 with support for just writing data and formatting. Something that could be consumed by Pandas, for example, to speed up file writing. From rough initial benchmarks that could be about 8x faster than the pure Python version. I see that Pandas recently adopted a Rust-backed xlsx reader based on Calamine so they might be open to a similar writer. I'll keep it in mind.
Makes sense, thanks for the info!
I think a minimal version for compatibility with Polars/Pandas would be great. Polars also recently added support for Calamine as a reader, so I feel this is something they might be pretty open to as well.
That is good to know.
Polars could take the Rust version directly. I wrote polars_excel_writer as a prototype for that and there has been some initial engagement with the Polars folks here.
For Pandas I started a pyo3 wrapper called xlsxwriter_lite. However, that is currently very rudimentary.
We've been thinking about taking calamine as a direct Polars (Rust) dependency to squeeze every last possible drop of speed out of it; if/when we get around to that it might be time to revisit the writing side (though unless somebody suddenly gets a lot of unexpected free time this might take a while 😅)
Eventually, the Rust version may have equal or greater functionality than the Python version.
But I fully agree with @jmcnamara that having zero dependencies is desirable. In fact, it is a lifeline for those of us who want to use the full capabilities of XlsxWriter on systems with Python but no support for Rust. Even for systems that do support Rust, there will be some users who find the pure-Python XlsxWriter fast enough for their needs and would rather not introduce extra downloads or dependencies.
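As a rough illustration of how a consumer could keep the pure-Python path as the default while still using a compiled writer when one happens to be installed, here is a minimal sketch. It assumes the optional Rust-backed package would be importable as `xlsxwriter_lite`; that import name and the `pick_backend` helper are hypothetical and not part of XlsxWriter or Pandas.

```python
import importlib.util


def pick_backend(candidates=("xlsxwriter_lite", "xlsxwriter")):
    """Return the name of the first importable module in `candidates`.

    The names are tried in order, so an optional accelerated backend
    (hypothetical import name: xlsxwriter_lite) is preferred when present,
    and the zero-dependency pure-Python package is the fallback.
    Returns None if none of the candidates can be found.
    """
    for name in candidates:
        # find_spec() looks the module up without importing it and
        # returns None for a top-level module that is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    backend = pick_backend()
    print(f"Excel writer backend: {backend}")
```

With a check like this, systems without Rust support never need the extra download, and nothing changes for users who find the pure-Python version fast enough.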
Previous roadmap
XlsxWriter is almost 10 years old. The first version was released on February 17, 2013. According to pypinfo it has around 12 million monthly downloads, so it is probably fair to say that it has been useful.
Recently I have been porting/rewriting XlsxWriter in Rust and it has been an interesting experience. When I'm finished with the Rust port, sometime near the end of 2024, I'd like to revisit XlsxWriter and bring it up to date with modern Python and practice. Some ideas:
worksheet.insert_image_fit_to_cell() method easier to implement.
autofit().