mbleron / ExcelGen

ExcelGen is an Oracle PL/SQL utility to create MS Excel files (.xlsx, .xlsb)
MIT License
43 stars 9 forks source link

support ANYDATA type columns as long as subtype is one of date, numbe… #26

Closed lee-lindley closed 1 year ago

lee-lindley commented 2 years ago

Hi Marc, My use case is pulling data from a user provided spreadsheet where the user can put text in data and number columns. I need to postprocess SOME of that data, adding to it, and generate a new spreadsheet. For example they might put N/A in a date column and want that preserved. Annoying but fairly common. I use Anton Scheffer's as_read_xlsx to pull the data out of their spreadsheet (will be releasing a wrapper for it). I build a cursor that has ANYDATA columns so that any given cell can be number, date or varchar2 .

test_cases/test_anydata.sql has an example of using ANYDATA columns. Biggest drawback is must do a setColumnFormat on any column that is primarily date or Number. Otherwise dates come out as default numbers format. There may be a better way to handle that but I got lost in the code that creates xfId stuff. I don't see how I can handle it at the time I inspect the content of an individual cell and write it out.

I verified it is working as I expect for both XLSX and XLSB.

I changed desc_tab2 to desc_tab3 which was available in the earliest version we support (11). I can't see any downside. Other changes are isolated to handling the new input data type.

If you are willing to move forward with it, I am willing to add the other data types that could be inside an ANYDATA. I don't have a use case for them, but it won't be hard to do if you want that.

Please let me know if you have questions or concerns or simply don't want to do it.

mbleron commented 2 years ago

Hi Lee, Thanks for submitting this new PR. Unfortunately, I'm in the process of a major refactoring, so I won't be able to merge it as is.

I use Anton Scheffer's as_read_xlsx to pull the data out of their spreadsheet

Why not ExcelTable? (getRawCells)

I don't see how I can handle it at the time I inspect the content of an individual cell and write it out.

Indeed, in the current version, the data type of a given column is known before actually fetching data from the cursor, so the format can be tied to the column at that time. In the case of an ANYDATA column where the actual type may vary for each row, we have to duplicate what is done in createWorksheet with defaultFmt variable in order to assign the correct format.

In the upcoming version, I'll support a cell-wise API where format can be specified individually for each cell, or inherited from column settings. Handling ANYDATA is a good enhancement though, so I'll probably add it from my side.

lee-lindley commented 2 years ago

Acknowledged. Looking forward to your next version. Grab the anydata test case sql I built. It might save you a little time.

I'll look at ExcelTable.

Lee Lindley @.***

On Aug 14, 2022 at 10:16 AM, Marc Bleron @.***> wrote:

Hi Lee, Thanks for submitting this new PR. Unfortunately, I'm in the process of a major refactoring, so I won't be able to merge it as is.

I use Anton Scheffer's as_read_xlsx to pull the data out of their spreadsheet

Why not ExcelTable? (getRawCells)

I don't see how I can handle it at the time I inspect the content of an individual cell and write it out.

Indeed, in the current version, the data type of a given column is known before actually fetching data from the cursor, so the format can be tied to the column at that time. In the case of an ANYDATA column where the actual type may vary for each row, we have to duplicate what is done in createWorksheet with defaultFmt variable in order to assign the correct format.

In the upcoming version, I'll support a cell-wise API where format can be specified individually for each cell, or inherited from column settings. Handling ANYDATA is a good enhancement though, so I'll probably add it from my side.

— Reply to this email directly, view it on GitHub https://github.com/mbleron/ExcelGen/pull/26#issuecomment-1214387226, or unsubscribe https://github.com/notifications/unsubscribe-auth/AU4D5LBANGS3YV7DUQXSYTLVZD5VNANCNFSM56ORPHTQ . You are receiving this because you authored the thread.Message ID: @.***>

lee-lindley commented 2 years ago

“Why not ExcelTable? (getRawCells)” My use case is to get dirty spreadsheet data (strings in the middle of date columns, etc…) in and back out again to Excel (using the most excellent ExcelGen version 3 with new support for AnyData input!). Along the way I’ll use the more well behaved data I need from the input spreadsheet (by contract with the users) to add information to the resultset, but on output I need to preserve the spreadsheet content we receive. I also want to support the use case that the input spreadsheet can change. They may add or remove columns. As long as the user keeps the columns I need for processing intact, I need to preserve whatever they give me and add to it. I can get the column headers from the first row (which is simple enough with getRawCells), then construct a query that pivots the raw cell content of the data into rows using those column header names as identifiers. This is in-line with what getRows() does; however, I need

  1. to keep the anyData type for the cell value.
  2. Use the found column names to construct the resultset It feels like the right way to do this is to modify getRows() to support anyData type. An overload of the call can construct the p_cols structure and p_range from the column headers. Unfortunately, I get lost in the bowels of ExcelTable. I can handle the anyDataSet stuff – it is how you have internally built and use the column_list structures where I can’t quite get my head around it. I set out to add AnyData as an allowed type and am not sure where to even begin. I may go back to it one more time, but this morning I’m feeling like throwing in the towel. I can construct a standalone package/type implementing AnyDataSet . It can call getRawCells from dynamic sql to do everything I want. It would not be as efficient but is at least something I could do. Honestly, for the latter approach as opposed to overloading getRows(), ExcelTable does not offer much advantage over as_read_xlsx. ExcelTable has more potential, but perhaps not for my immediate use case. Thoughts?

-- Lee Lindley @.***

mbleron commented 2 years ago

Well, since getRawCells readily returns a collection of ANYDATA values, you can probably use it to read everything from the input spreadsheet and write it back to a new one using ExcelGen Cell API, without resorting to an intermediate set of rows.

mbleron commented 1 year ago

Implemented in v3.0. See Variant column data type example. Number and datetime instances will be formatted according to workbook or sheet-level settings.

lee-lindley commented 1 year ago

Outstanding!

-- Lee Lindley @.***

From: Marc Bleron @.> Sent: Wednesday, November 2, 2022 5:18 PM To: mbleron/ExcelGen @.> Cc: Lee Lindley @.>; Author @.> Subject: Re: [mbleron/ExcelGen] support ANYDATA type columns as long as subtype is one of date, numbe… (PR #26)

Implemented in v3.0. See Variant column data type https://github.com/mbleron/ExcelGen#variant-column-data-type example. Number and datetime instances will be formatted according to workbook or sheet-level settings. — Reply to this email directly, view it on GitHub https://github.com/mbleron/ExcelGen/pull/26#issuecomment-1301280977 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AU4D5LDLEKZEMCYQ7LJ3ZFDWGLK67ANCNFSM56ORPHTQ . You are receiving this because you authored the thread. https://github.com/notifications/beacon/AU4D5LEL4L33NYJ4EGSTMBDWGLK67A5CNFSM56ORPHT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOJWH7RUI.gif Message ID: @. @.> >