jandrew / Spreadsheet-XLSX-Reader-LibXML

Read spreadsheet files with xlsx extentions
Other
4 stars 2 forks source link

Merge issues #52

Closed Tux closed 9 years ago

Tux commented 9 years ago

$oWkS->get_merged_areas is missing. I used

    *WorksheetInstance::get_merged_areas = sub {
        my $mm = shift->_merge_map or return;
        # [ undef,
        #   [ undef,
        #     undef,
        #     'B1:C2',
        #     'B1:C2'
        #     ],
        #   [ undef,
        #     'A2:A3',
        #     'B1:C2',
        #     'B1:C2'
        #     ],
        #   [ undef,
        #     'A2:A3'
        #     ]
        #   ]
        # ->
        # [ [ 1, 0,     # A2:
        #     2, 0,     #  A3
        #     ],
        #   [ 0, 1,     # B1:
        #     1, 2,     #  C2
        #     ]
        #   ]
        my %r;
        for (@$mm) { $_ && $r{$_}++ for @$_ };
        keys %r or return;
        my @r;
        foreach my $ma (keys %r) {
            my ($ul, $br) = split m/:/ => $ma or return;
            push @r, [ reverse map { $_ - 1 } map { cell2cr ($_) } $br, $ul ];
            }
        return \@r;
        };

To get the same behavior as Spreadsheet::ParseExcel and Spreadsheet::ParseXLSX

Given that that information returns the correct merge information for a sheet, the values returned by the $cell->values for non-home cells of merged areas is undefined. Given

    A   B   C
  +---+-------+
1 |   |       |
  +---+       |
2 |   |       |
  |   +---+---+
3 |   |   |   |
  +---+---+---+

will have areas B1:C2 and A2:A3 as merged areas. The values returned for C1, B2, C2, and A3 are undef. The other two spreadsheet readers return the home-cell value (B1 and A2 resp).

jandrew commented 9 years ago

This is definitely one of the pending format method matching pieces i have not completed. It's definitly on my list but a little lower priority. I can bump it up but I don't think I will get to it this week.

jandrew commented 9 years ago

I would add that in my initial testing that Microsoft Excel returns the merge range acknowledgement for a call to cell C1 but does not return the value of cell B1 for the result of the formula "=C1". I will have to do some additional testing to verify that but I'm inclined to default to Microsoft behavior over output matching ParseExcel. It may be that an attribute would allow the user to select either behavior but that increases the complexity of this issue. I'm definitely open to feedback on this one but I think that new users are at risk when departing from say a vbscript equivalent output or even a Win32::OLE equivalent output. I think that ~::ParseExcel remains the gold standard I just think that it isn't always clear for people making the transition from Excel application based parsing what is going to come out.

jandrew commented 9 years ago

This is similar to the issue where ~::ParseExcel returns empty strings '' for undef values in cells where Microsoft Excel using vb-script will tend to return undef. Currently this is handled in my package with an attribute and it may make sense to roll all these differences up into a 'use' flag for ~::LibXML ':ParseExcel_like' (or some such).

jandrew commented 9 years ago

So based on my testing neither vbscript or Spreadsheet::ParseExcel return the home values as the contents of merged cells for non-home positions in the merge range. As a consequence I will no-fix the second part of the request to have non-home cells return the value of home cells. I think it's do-able but would need to be called out with an attribute so that people knew it was unique. Additionally any request for this will be treated as a feature request and goes on the priority list behind any bugs and additional features already on the TODO list.

Tux commented 9 years ago

That is an acceptable stance. I'll see how hard it would be for Spreadsheet::Read to mimic that required behavior

jandrew commented 9 years ago

Thanks for understanding. My initial thought was to cache the shared value along with the merge range but in the case where a non-home cell is called first the solution is non-trivial. I actually have the method add request for this one fixed on my development box but I'm currently researching issue #53 to asses how much work it is and if I think I can finish this weekend. If so I'll wait and push both to github as v0.38.12 if I think it will be done by monday.

Tux commented 9 years ago

Turned out I already did. FWIW I uploaded a new Spreadsheet::Read release to CPAN today, supporting most of your parser (still needs work on colors and formats) $ prove -vwb t/63* to see what is still missing

jandrew commented 9 years ago

Thanks,

jandrew commented 9 years ago

As far as I can tell this issue is closed by pull request #66.

Tux commented 9 years ago

Sorry to re-open, but I do get merge info for files/attr.xlsx, but none for files/merged.xlsx Am I missing something?

jandrew commented 9 years ago

Let me look into this. I'm cleaning up the test suite today in preparation for the next github /CPAN release and I will check into this.

jandrew commented 9 years ago

OK, this was definitly a bug and I have added testing of merged.xlsx to the test file 52-merge_function_alignment. to catch it.