Closed jmcnamara closed 1 year ago
Can I take a shot at this?
@moutikabdessabour At this stage I'm mainly asking if people think this would be useful. Do you think it would be useful?
I'm currently 50/50 on this, for the reasons below, but I'm willing to take input.
Pros:
Cons:
write()
method in the Python and Perl versions tend to generate bug reports when users try to write arbitrary sting or number like data. People also assume that writing something like "2003-01-01" should become a date in Excel. For the second con I think implementing a Formula/Url type would remove the overhead. I wrote an app using this library and while it is great the only issue I had is the verbosity of the function calls. I am a 100% for this feature
I wrote an app using this library and while it is great the only issue I had is the verbosity of the function calls.
Yes, that is my feeling too. There are currently 24 write_*()
methods. That is just too many.
It could be mapped down to 5 like this:
Current | New |
---|---|
write_number() |
write() |
write_string() |
write() |
write_url() |
write() |
write_boolean() |
write() |
write_formula() |
write() |
write_rich_string() |
write() |
write_blank() |
write() |
write_time() |
write() |
write_date() |
write() |
write_datetime() |
write() |
write_dynamic_formula() |
write() |
write_number_with_format() |
write_with_format() |
write_string_with_format() |
write_with_format() |
write_url_with_format() |
write_with_format() |
write_boolean_with_format() |
write_with_format() |
write_formula_with_format() |
write_with_format() |
write_rich_string_with_format() |
write_with_format() |
write_dynamic_formula_with_format() |
write_with_format() |
write_array_formula() |
write_array_formula() |
write_dynamic_array_formula() |
write_array_formula() |
write_array_formula_with_format() |
write_array_formula_with_format() |
write_dynamic_array_formula_with_format() |
write_array_formula_with_format() |
write_url_with_options() |
Same |
write_url_with_text() |
Same, or write_url_with_options() |
With some caveats/additions:
Url
type.Formula
type that can be set to array or dynamic array (if needed)Also a IntoExcelType
(or similar) trait coupled with a generic write()
would have the advantage of allowing mapping of other arbitrary user types in a clean way.
Now, that I've mapped it out like this it seems like the right thing to do (even if it is another major API break). To avoid breakage the type specific methods could stay but be undocumented/hidden. They would be needed for the trait mapping anyway.
I might add it after I finish the current work on charts.
Gonna take a jab at this later today. WIll probably have something ready tomorrow night.
Gonna take a jab at this later today. WIll probably have something ready tomorrow night.
There is no need. I will get to it myself after the current release.
BTW that isn't meant to deter you from doing a prototype. I just want to flag that it will needs a lot of doc and example refactoring and I'd prefer to do that myself.
i would love the feature as it would enable me for example to write one function to write json data in array format (structured differently) with having to write many functions for a specific struct to write json to excel. If there is any way i can archive this can please show me how to and also you dim this feature as unnecessary then can you link me a rs lib that enable me todo so if its available.
@GlennMm At this point I'm definitely going to add it. Either in the next release or a separate release just after.
@GlennMm At this point I'm definitely going to add it. Either in the next release or a separate release just after.
Thank you @jmcnamara that would awesome, looking forward to this feature
I've added an initial version of this feature to main. It needs some additional docs and examples before it is ready for release.
It allows 2 new important features. The first is a generic write()
method to simplify code:
use rust_xlsxwriter::{Workbook, XlsxError};
fn main() -> Result<(), XlsxError> {
// Create a new Excel file object.
let mut workbook = Workbook::new();
// Add a worksheet to the workbook.
let worksheet = workbook.add_worksheet();
// Write a string to cell (0, 0) = A1.
worksheet.write(0, 0, "Hello")?;
// Write a number to cell (1, 0) = A2.
worksheet.write(1, 0, 12345)?;
// Save the file to disk.
workbook.save("hello.xlsx")?;
Ok(())
}
Output:
The second feature is that you can now use the same mechanism to extend write()
for your own data types:
//! Example of how to extend the the rust_xlsxwriter write() method using the
//! IntoExcelData trait to handle arbitrary user data that can be mapped to one
//! of the main Excel data types.
use rust_xlsxwriter::*;
fn main() -> Result<(), XlsxError> {
// Create a new Excel file object.
let mut workbook = Workbook::new();
// Add a worksheet to the workbook.
let worksheet = workbook.add_worksheet();
// Make the first column wider for clarity.
worksheet.set_column_width(0, 12)?;
// Write user defined type instances that implement the IntoExcelData trait.
worksheet.write(0, 0, UnixTime::new(0))?;
worksheet.write(1, 0, UnixTime::new(946598400))?;
worksheet.write(2, 0, UnixTime::new(1672531200))?;
// Save the file to disk.
workbook.save("write_generic.xlsx")?;
Ok(())
}
// For this example we create a simple struct type to represent a Unix time.
// This is the number of elapsed seconds since the epoch of January 1970 (UTC).
// See https://en.wikipedia.org/wiki/Unix_time. This type isn't handled by
// default by rust_xlsxwriter.
pub struct UnixTime {
seconds: u64,
}
impl UnixTime {
pub fn new(seconds: u64) -> UnixTime {
UnixTime { seconds }
}
}
// Implement the IntoExcelData trait to map our new UnixTime struct into an
// Excel type.
//
// The relevant Excel type is f64 which is used to store dates and times (along
// with a number format). The Unix 1970 epoch equates to a date/number of
// 25569.0. For Unix times beyond that we divide by the number of seconds in the
// day (24 * 60 * 60) to get the Excel serial date.
//
// We must also supply a number format if one isn't specified.
//
impl IntoExcelData for UnixTime {
fn write(
self,
worksheet: &mut Worksheet,
row: RowNum,
col: ColNum,
) -> Result<&mut Worksheet, XlsxError> {
// Create a default date format.
let format = Format::new().set_num_format("yyyy-mm-dd");
// Convert the Unix time to an Excel datetime.
let datetime = 25569.0 + (self.seconds / (24 * 60 * 60)) as f64;
// Write the date as a number with a format.
worksheet.write_number_with_format(row, col, datetime, &format)
}
fn write_with_format<'a>(
self,
worksheet: &'a mut Worksheet,
row: RowNum,
col: ColNum,
format: &'a Format,
) -> Result<&'a mut Worksheet, XlsxError> {
// Convert the Unix time to an Excel datetime.
let datetime = 25569.0 + (self.seconds / (24 * 60 * 60)) as f64;
// Write the date with the user supplied format.
worksheet.write_number_with_format(row, col, datetime, format)
}
}
Output:
Comments/reviews welcome.
Also, if the Format was treated as an Option
it would be possible to have just one method that needs to be implemented for the trait:
impl IntoExcelData for UnixTime {
fn write<'a>(
self,
worksheet: &'a mut Worksheet,
row: RowNum,
col: ColNum,
format: Option<&'a Format>,
) -> Result<&'a mut Worksheet, XlsxError> {
// Convert the Unix time to an Excel datetime.
let datetime = 25569.0 + (self.seconds / (24 * 60 * 60)) as f64;
match format {
Some(format) => {
// Write the date with the user supplied format.
worksheet.write_number_with_format(row, col, datetime, format)
}
None => {
// Create a default date format.
let format = Format::new().set_num_format("yyyy-mm-dd");
worksheet.write_number_with_format(row, col, datetime, &format)
}
}
}
}
This may, or may not, be more confusing to the end user. Internally it works out a lot cleaner. This code is on the generic_with_option
branch for comparison.
@Fight-For-Food Any thoughts on the interface or implementation of this feature? You had some good input on other generic/non-generic code.
@Fight-For-Food Any thoughts on the interface or implementation of this feature? You had some good input on other generic/non-generic code.
I think that
fn write(self, worksheet: &mut Worksheet, row: RowNum, col: ColNum) -> Result<&mut Worksheet, XlsxError>{....}
is ok. It is convenient to use. And there is no need to make in more complex with extra-argument format: Option<&'a Format>
For some more comprehensive thoughts i need more time. Maybe i will have some at this week. But can't make a promise about it.
@Fight-For-Food
And there is no need to make in more complex with extra-argument
format: Option<&'a Format>
It may not be clear from the snippet above but the addition of the Option<>
means that there is only one trait funtion that needs to be implemented by the end user. However, it would probably be confusing and the version on main with IntoExcelData::write()
and IntoExcelData::write_with_format()
mimics the worksheet APIs may be simpler.
Maybe i will have some at this week. But can't make a promise about it.
Thanks. No problem if you can't get to it.
thank you for implementing this, it really help make my work much easier. looking to forward to be start contributing to this repo. so have started using this feature i haven't seen any problems with it so far
@jmcnamara Looks great, You could allow the users to define only one function but get access to both write
and write_with_format
, the trait definition will have three functions the write
, write_with_format
and an internal function that would actually implement the logic with the format arg passed in as an option
.
trait IntoExcelData {
fn internal_write<'a>(
self,
worksheet: &'a mut Worksheet,
row: RowNum,
col: ColNum,
format: Option<&'a Format>,
) -> Result<&'a mut Worksheet, XlsxError>;
fn write(
self,
worksheet: &mut Worksheet,
row: RowNum,
col: ColNum,
) -> Result<&mut Worksheet, XlsxError>{
self.internal_write(&mut worksheet, row, column, None)
}
fn write_with_format<'a>(
self,
worksheet: &'a mut Worksheet,
row: RowNum,
col: ColNum,
format: &'a Format,
) -> Result<&'a mut Worksheet, XlsxError>{
self.internal_write(&mut worksheet, row, column, Some(format))
}
}
Just a heads up that the initial version of this is upstream in crates.io in version 0.27.0: https://crates.io/crates/rust_xlsxwriter
I'll iterate and improve on this in the next few releases.
Having taken a small detour into charts I've come back to complete this.
Some tasks, not in order:
impl Into<String>
instead of &str in method parameters where possible/sensible.write_row()
and write_column()
methods.Url
and Formula
and add them to generic write()
.Implement generic write_row() and write_column() methods.
I've added a first pass at this to main. You can now write arrays of data types that implement IntoExcelData
like this:
use rust_xlsxwriter::{Workbook, XlsxError};
fn main() -> Result<(), XlsxError> {
// Create a new Excel file object.
let mut workbook = Workbook::new();
// Add a worksheet to the workbook.
let worksheet = workbook.add_worksheet();
// Some array data to write.
let numbers = [1, 2, 3, 4, 5];
let words = ["Hello"; 5];
let matrix = [
[10, 11, 12, 13, 14],
[20, 21, 22, 23, 24],
[30, 31, 32, 33, 34],
];
// Write the array data as columns.
worksheet.write_column(0, 0, numbers)?;
worksheet.write_column(0, 1, words)?;
// Write the array data as rows.
worksheet.write_row(0, 3, numbers)?;
worksheet.write_row(1, 3, words)?;
// Write the matrix data as an array or rows and as an array of columns.
worksheet.write_row_matrix(3, 3, matrix)?;
worksheet.write_column_matrix(6, 0, matrix)?;
// Save the file to disk.
workbook.save("arrays.xlsx")?;
Ok(())
}
Output:
It is just some syntactic sugar around a loop but it is somewhat popular in the Python version.
It currently has an additional deference/copy and I'm not sure if that can be avoided. Feedback welcome on that or anything else.
I gave some feedback on the last commit. I think that feedback applies on more parts of the codebase.
@adriandelgado
I think the Copy trait bound could be avoided by using the IntoIterator trait.
I initially tried, and failed, to get that working. :-( With my morning brain and 2 cups of coffee I managed to understand/fix it. You can have a look.
I gave some feedback on the last commit. I think that feedback applies on more parts of the codebase.
I think I understood those comments but maybe not so let's check. What I think you are saying is that the API/library should avoid taking any references to data that will then be cloned and should instead use interfaces like Into<String>
or IntoInterator
. Is that correct or am I missing something else?
I've also added write_row_matrix()
and write_column_matrix()
methods for 2D arrays/iterators of data. See the updated example above.
I think I understood those comments but maybe not so let's check. What I think you are saying is that the API/library should avoid taking any references to data that will then be cloned and should instead use interfaces like
Into<String>
orIntoInterator
. Is that correct or am I missing something else?
That's right. The caller of the function has the responsability to clone if needed. Sometimes the caller already has owned data and knows it wont need it anymore, no clone needed in that case.
I've implemented a Formula
type so that it can be used in the generic write()
:
worksheet.write(0, 0, Formula::new("=1+2+3"))?;
worksheet.write(1, 0, Formula::new("=SIN(PI())"))?;
However, I'm not sure how much of a help this is yet.
I had hoped to be able to use Trait Objects and the newer array writing methods to simulate something like writing a heterogeneous vec of vectors like a dataframe.
Something like this:
let data: Vec<Vec<Box<dyn IntoExcelData>>> = vec![
vec![
Box::new("East"),
Box::new("West"),
Box::new("North"),
Box::new("South"),
],
vec![
Box::new(40),
Box::new(40),
Box::new(20),
Box::new(30)],
vec![
Box::new(30),
Box::new(30),
Box::new(50),
Box::new(30)],
vec![
Box::new(Formula::new("=B1+C1")),
Box::new(Formula::new("=B2+C2")),
Box::new(Formula::new("=B3+C3")),
Box::new(Formula::new("=B4+C4")),
],
];
However, I've failed in every attempt so far. Examples like this with a simpler data structure:
use rust_xlsxwriter::{Formula, IntoExcelData, Workbook, XlsxError};
fn main() -> Result<(), XlsxError> {
// Create a new Excel file object.
let mut workbook = Workbook::new();
// Add a worksheet to the workbook.
let worksheet = workbook.add_worksheet();
let data: Vec<Box<dyn IntoExcelData>> = vec![
Box::new("East"),
Box::new(50),
Box::new(30),
Box::new(Formula::new("=B1+C1")),
];
/* The following doesn't work.
--> examples/app_trait_array.rs:39:9
|
40 | item.write(worksheet, row, 1);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the size of `dyn IntoExcelData`
cannot be statically determined
*/
let mut row = 1;
for item in data {
item.write(worksheet, row, 1);
row += 1;
}
// Save the file to disk.
workbook.save("arrays.xlsx")?;
Ok(())
}
Or this:
let mut row = 1;
for item in data {
worksheet.write(row, 0, item)?;
row += 1;
}
Which gives errors like this:
--> examples/app_trait_array3.rs:28:33
|
28 | worksheet.write(row, 0, item)?;
| ----- ^^^^ the trait `IntoExcelData` is not implemented
for `Box<dyn IntoExcelData>`
| |
| required by a bound introduced by this call
|
= help: the following other types implement trait `IntoExcelData`:
Anyone see a way that trait objects for the IntoExcelData
trait might be handled?
Update: I'm still interested in this if there is a solution (there may not be a clean solution) but I'm going to shift my focus to working with real Dataframes from Polars.
See also this example of writing a Polars dataframe to Excel using rust_xlsxwriter
: #39
These generic methods and types are now available in release v0.39.0.
It would be possible to have a generic
write()
method using a trait that could call the type specificwrite_*()
method such aswrite_number()
. This is a feature in the Python version of the library.I'll leave this here as a reminder to myself and to see if it gets an +1s for usefulness or desirability.