Open leonkosak opened 8 months ago
This would take significant work as our integration with xlsxwriter is quite deep (and would require integration on the Rust side as rust_xlsxwriter does not provide a Python wrapper). So this would not be configurable; it would need to be a complete replacement, and would require sufficient feature parity to map from one to the other.
@jmcnamara: Do you have a rough idea about the current state of feature parity between the xlsxwriter Rust/Python versions? We have been considering taking calamine bindings directly inside Polars for even faster Excel reading, so it's not out of the question that we might also revisit taking a dependency on something lightweight for write bindings too 🤔
Well, as far as I know, the fastest way of writing xlsx files in Python is via PyExcelerate, which is not very performant either (and the library looks almost abandoned, as far as I can see). The only library that, in my opinion, has potential for creating xlsx files from Python is rust_xlsxwriter by @jmcnamara. 👍
Do you have a rough idea about the current state of feature parity between the xlsxwriter Rust/Python versions?
I plan to have full feature completeness by the end of the year. Based on the completed feature list of the rust_xlsxwriter Roadmap it is currently at 26/36 (~70%) complete. Based on ported tests it is ~1000/1600 (63%) complete.
However, in terms of the functionality of polars.DataFrame.write_excel(), rust_xlsxwriter is almost feature complete with XlsxWriter. The only feature missing is Sparklines.
To help this along I can do 1-2 things:
1. Update rust_xlsxwriter so that it is feature complete with the XlsxWriter functionality of polars.DataFrame.write_excel(). And/Or:
2. Update polars_excel_writer to make that more API compatible with polars.DataFrame.write_excel(). This would make it easier for you and the other Polars devs to drop it in as a replacement for the Python version.
I'll probably work on item 1 anyway but let me know what you think would be the best approach. I'm willing to put some time into making rust_xlsxwriter as compatible as possible with Polars.
Given that I'd have to replicate the write_excel API anyway, and it seems you have done most of the work already, I think it would be a great idea to take advantage of it. If you could get Sparklines in and we could actually port the internals wholesale to make use of polars_excel_writer instead, that sounds quite compelling to me (I can poll the other devs for their thoughts, but I like the sound of it :)
@alexander-beedie Sounds good. Let's stay in sync. I'll get the sparklines support ported by the weekend and I'll follow up then.
Great, though don't rush on my account; I'm swamped at the moment 😅
Hi, any updates on this topic? :)
I implemented sparklines in rust_xlsxwriter but didn't get a chance to port it to polars_excel_writer. I haven't had much open source development time recently so I had to park it for a while. I hope to get started again in May.
For what it is worth, here is the current feature completion list for polars_excel_writer: https://github.com/jmcnamara/polars_excel_writer/issues/1#issuecomment-1685299464
No problem. Thank you for your great work! 👍
Right now write_excel allows the user to pass in an open xlsxwriter.Workbook object. Any chance that this can support openpyxl workbooks too?
Afraid not; they are entirely incompatible. You can write to a fresh workbook or to an open xlsxwriter workbook (which allows you to write multiple frames to the same workbook, or enrich it with charts/etc), but you can't mix & match with different libraries.
Description
especially rust_xlsxwriter would bring some enormous performance benefits. What do you think?