JanMarvin / openxlsx2

openxlsx2 - read, write and modify xlsx files
https://janmarvin.github.io/openxlsx2/
Other
121 stars 11 forks source link

Col, row and dims support #652

Closed Michiel91 closed 1 year ago

Michiel91 commented 1 year ago

When styling a range written to a workbook from R I often only have row and col indices available and no Excel dims like A1:A15. I noticed that not all functions currently support providing these indices (e.g. wb_add_cell_style).

Would it be possible to support cols, rows and dims in all relevant functions, like it is very nicely implemented for wb_add_conditional_formatting?

If this is not desired or takes a lot of work to implement, below is an alternative suggestion for a short but useful conversion function I wrote:

row_col_to_dims <- function(cols = NULL,
                rows = NULL) {

    # Validate input
    if(!checkmate::test_numeric(x = cols, min.len = 1)) {
        stop("provide a valid column index or range of indices.")
    }

    if(!checkmate::test_numeric(x = rows, min.len = 1)) {
        stop("provide a valid row index or range of indices.")
    }

    # Convert column indices to names
    cols <- openxlsx2::int2col(cols)

    # Duplicate col and/or row value in case one was provided
    if(length(cols) == 1) cols <- rep(cols, 2)
    if(length(rows) == 1) rows <- rep(rows, 2)

    # Parse dims
    dims <- glue::glue("{head(cols, 1)}{head(rows, 1)}:{tail(cols,1)}{tail(rows,1)}")

    # Return Excel range
    return(dims)
}

Feel free to implement or adapt this function in case you think it would fit in openxlsx2.

Thanks for taking these suggestions into consideration.

JanMarvin commented 1 year ago

Hi @Michiel91 , we have opnxlsx2:::rowcol_to_dims(). it's not exported, but you can still use it. Is that helpful to you?

Michiel91 commented 1 year ago

Hi @JanMarvin,

Thanks! I missed that one since it wasn't exported. This function does exactly what I was looking for (and the same as my helper function). It might be worth exposing this function since I know from colleagues using openxlsx2 users that more users struggle with the dims parameter.

JanMarvin commented 1 year ago

Well I guess it doesn't really matter if we have one more export or not... Just as a note, I was considering the removal of the cols and rows arguments we carry for various functions, because I feel that the dims argument is enough. But since that might cause trouble for people moving over from xlsx and openxlsx we will keep it as is for now.

Michiel91 commented 1 year ago

Apart from people moving over from openxlsx or xlsx I think there's another more important reason why the col and row arguments are very useful. Ranges such as A1:A5 used in Excel are not used anywhere in R. When working with cols or rows in your data, you always use the indices (or names) to refer to a certain range. Likewise, when exporting data from R to an Excel workbook you almost always want to apply a certain styling or function of openxlsx2 to a certain range of cols and/or rows directly available in R.

Having the converter function exporter already makes it less troublesome to convert a range of rows/cols to Excel dimensions, but in my opinion being able to directly feed a range of rows/cols to all functions of openxlsx2 would be much more user friendly.

JanMarvin commented 1 year ago

Thanks for sharing your opinion, but I do not intend to make changes in that regard.

Michiel91 commented 1 year ago

That's a pity. If at least the exporter function could be exported that would already help your users a lot. Update: I just see it has been exported already, thanks for the commit!

JanMarvin commented 1 year ago

It is simply that rows and cols arguments may appear useful at first glance, but in my opinion they are not. Often cols and rows rather lead to the fact that 1-off errors are committed, because e.g. the colnames are not included and it is then difficult or even impossible to understand where these errors come from. Therefore I think it is easier to specify the areas explicitly. This is even more true if the areas that you want to edit do not start at the top left of a worksheet in A1 but are scattered somewhere on the worksheet.

For consistency reasons, I would prefer a single argument, which could then be specified alternatively with

dims = "A1:D2"

or with

dims = list(cols = "A:D", rows = 1:2)

But it's a bit late in the release cycle for me to do that (basically no time to deprecate is a huge no-no), because I want to release a version 1.0 with a stable API soon (an API that remains basically unchanged for lets say at least a year) and do not want to break everything that existed on the way. That means I have to take into account that there are inconsistencies and accept that some functions take dims and cols/rows as an argument.

Michiel91 commented 1 year ago

Thanks for the explanation. That indeed fully makes sense when importing and editing data already coming from Excel. In that case mistakes in the range are easily made when working with column and row indices.

What I was referring to is writing data from R to a new Excel workbook and then apply styling to it. In that case you do not have an Excel range, but only col and row indices. Because those are obtained with functions like _match(c("col_1","col_2"),names(mydf)) or just _1:ncol(mydf) there is no chance on 1-off errors or incorrect ranges. It would be great if we could directly work with those ranges instead of having to always convert them first. That's where my suggestion is coming from (as well as from feedback of colleagues here also using openxlsx2).

JanMarvin commented 1 year ago

Hi @Michiel91, I guess what you want is more or less a custom rowcol_to_dims() function. I don't see why this should not be possible for you to create, but right now I do not see a necessity to add this to the package. And I guess that openxlsx2 provides everything that is required for you to create such a function. In addition to rowcol_to_dims() we have int2col(), col2int(), and dims_to_dataframe() and you might have a look at the wb_data() thing I use in wb_add_pivot_table(). I feel comfortable that you should be equipped with every tool required to customize your dims creation. And this is what I do myself in various scripts I have created for either work, our Discussions section, or StackOverflow. Give yourself a chance to get familiar with the package flow and we can always revisit this after some time. Personally I'm (obviously) very happy with having only dims (and after all I'm the main audience for this project :smile:).

Michiel91 commented 1 year ago

Hi @JanMarvin, I think you misunderstood my last comment. I am not looking for a custom rowcol_to_dims function. The current function works just fine. I tried to explain why it would be useful to have native row and col index support in addition to the dim argument. When writing data from R to Excel files, there's no chance on 1-off valyues, and dropping the col and row arguments comes at the expense of user friendliness and automatability.

But it's of course completely your call to only maintain dims as the package author (interesting remark though that you're the main audience for a public R package).

Anyway, thanks that you took the time to respond to this ticket, much appreciated!

JanMarvin commented 1 year ago

Okay, well in this case I think we are spinning in circles.

But, allow me to elaborate on the aspect you highlighted regarding the "main audience." For nearly two decades, I have been involved in the realm of open source. Actively contributing for about a decade, and hopefully my prompt response sand unwavering dedication to resolving issues in my public git repositories unequivocally demonstrate my high regard for every individual using my code. I often go to great lengths, even at the expense of my personal time, to support everyone.

However, I believe that it is very important to recognize that open source developers are not a public resource working for and fulfilling requests for the general public. They still work for those that compensate them for their time. Some developers may be employed by companies that invest in open source with the intention of selling a product, such as Red Hat or Rstudio. Others, like myself, engage in open source development independently. I do it because I require the tools, because I have a passion for learning, and because I believe in the principles of open source. Nonetheless, I choose to fund my open source projects personally, without soliciting donations or licensing fees. I have no desire to sell you anything or seek your financial support. I have a separate job that covers my financial needs, and in that context, my employer is my primary audience. However, in the realm of my open source projects, I am my main audience, fulfilling my own needs and sharing my code publicly so that you can use it, study it, and modify it to your needs.

Michiel91 commented 1 year ago

Thanks @JanMarvin, I get your point and it makes complete sense. I also fully agree that open source developers should not be seen as a resource obliged to fulfil all requests from the general public. A lot of all the great tools out there are open source developed by volunteers devoting a lot of their personal time to it. And all of the users of these great tools are very grateful for that (including me)! The remark that you were your main audience just sounded very strange to me, but I understand what you meant with it now.