ArpitarRoy / php-excel-reader

Automatically exported from code.google.com/p/php-excel-reader
0 stars 0 forks source link

Last character of strings getting cut off [resolved?] #115

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I had a very strange issue.  I'm parsing a spreadsheet to import into mysql.  I 
have no control over the format or creation of the spreadsheet, it is being 
generated by a web application.  Digging through excel_reader2.php, it looks 
like the problem was occurring in the _parsesheet() function when the $code var 
matched case "SPREADSHEET_EXCEL_READER_TYPE_LABEL".  The existing code was this:

$this->addcell($row, $column, substr($data, $spos + 8, ord($data[$spos + 6]) | 
ord($data[$spos + 7])<<8));

I was able to get the missing character to show up by changing it to this:

$this->addcell($row, $column, substr($data, $spos + 8, ord($data[$spos + 6]) + 
1));

Honestly I don't understand the details of what these lines are doing, and I'm 
curious if anyone can tell me what was going on.  I suspect that these cells 
are improperly formatted in the spreadsheet I am receiving.  Curiously, if I 
open the spreadsheet in excel and resave it, everything works fine without any 
modification to the library.

Original issue reported on code.google.com by wright.a...@gmail.com on 22 Apr 2011 at 4:57

GoogleCodeExporter commented 9 years ago
I've been experiencing a similar issue, where the excel file is given to me by 
an SSRS server. Applying your patch does fix the problem but I think the issue 
is deeper than this.

A gut feeling is that the formatting/encoding of the excel file is 
wrong/incompatible. Although my text strings are no longer truncated, things 
like percentages are not getting converted. Its as if SSRS has set every field 
to type LABEL?

Original comment by arron.wo...@gmail.com on 13 May 2011 at 1:22

GoogleCodeExporter commented 9 years ago
Yes, this is exactly what is happening, in my debugging I've seen that the 
problem fields are all set as 'Labels'.  I've started working on a different 
spreadsheet that is giving me the same problem but my "fix" is not working on 
this one.  If I figure it out later this morning I will post an update.

Original comment by wright.a...@gmail.com on 13 May 2011 at 2:25

GoogleCodeExporter commented 9 years ago
This is a wild guess, I think the problem is that the ascii control characters 
are not correct.  Again, sorry if I use the wrong terminology this is all new 
to me.  When looking at the raw data in a cell I'm seeing this :

0x01 wanted text 0x04 x002

99% of the time the data in a cell is formatted this way, so I'll be able to 
just modify the library to pull a substr starting at that point and ending with 
0x04.  Sometimes though the string ends in 0x03 instead of 0x04 so I'll have to 
end my substr in whichever comes first, 0x04 or 0x03.  

It's a messy work around and there's probably a much better way to do this, but 
I can't spend much more time on this.

Original comment by wright.a...@gmail.com on 13 May 2011 at 2:46

GoogleCodeExporter commented 9 years ago
My workaround is now working, again here's the original problem piece of code:

$this->addcell($row, $column, substr($data, $spos + 8, ord($data[$spos + 6]) | 
ord($data[$spos + 7])<<8));

And here is my workaround:

$startpos = $spos + 9;
$endpos_for_sot = strpos($data, chr(2), $startpos); // 0x02
$endpos_for_eot = strpos($data, chr(4), $startpos); // 0x03
$endpos_for_etx = strpos($data, chr(3), $startpos); // 0x04

$len = $endpos_for_etx - $startpos;
if($endpos_for_eot - $startpos < $len)
     $len = $endpos_for_eot - $startpos;
if($endpos_for_sot - $startpos < $len)
     $len = $endpos_for_sot - $startpos;

$test_substr =  substr($data, $startpos, $len);

Original comment by wright.a...@gmail.com on 13 May 2011 at 2:58

GoogleCodeExporter commented 9 years ago
After analyzing a bit I figured out the issue: While the length is correctly 
reported with ($spos + 6 | $spos + 7 << 8), the string itself starts in 
$spos+9, not $spos+8 as it's shown in the code.

So changing the line
$this->addcell($row, $column, substr($this->data, $spos + 8, 
ord($this->data[$spos + 6]) | ord($this->data[$spos + 7])<<8));
for
$this->addcell($row, $column, substr($this->data, $spos + 9, 
ord($this->data[$spos + 6]) | ord($this->data[$spos + 7])<<8));
does the trick.

Original comment by Mauro.Sa...@gmail.com on 27 Jan 2012 at 8:28