Closed sergio-rivas closed 3 years ago
what software created the original file?
The file was created with Windows version of Microsoft Excel, it's a .xls file too. I think it might have to do with the fact that the file has many use of merged cells.
I tried two simple tests to create an easy replication:
1) open the file and use spreadsheet to write to new file -> no problem
2) open the file and use spreadsheet to edit only one cell -> issue replicated
Would I be able to privately send the XLS file so you can confirm the issue? I don't mind helping out by sponsoring the bugfix too. Let me know how much.
Please try to open and save the file with LibreOffice and then try to edit it again using spreadsheet. Does that work?
After saving w/ LibreOffice, now spreadsheet gem cannot read the excel.
Error:
Spreadsheet::Errors::UnknownCodepage: Unknown Codepage 0xac8d
from .../ruby/gems/2.6.0/gems/spreadsheet-1.2.7/lib/spreadsheet/excel/reader.rb:75:in `block in encoding'
Did you save the file as UTF-8? Which version of LibreOffice and which OS are you using?
LibreOffice 7.1.0.3, Mac OSX.
There is no option to choose UTF-8 when saving as xls. The resulting file can be opened in LibreOffice, but is not readable by spreadsheet gem. Can I email you the original file and you try to open/save with your libre office version, then read it in spreadsheet gem? This way we can eliminate the possibility of environmental factors?
Please share the Ruby script with which you want to open the file. Also look at this commit here: https://github.com/zdavatz/spreadsheet/blob/master/History.md#111--03012016 where we had a similar error. Maybe you can send a Pull Request. This is the commit: e59b3374eefcf1fa8c7e75e31a3b36365f2078a8
I tried digging into the code myself, but was unable to get anywhere. Is there a way to know exactly what encoding LibreOffice is using when saving?
I tried setting the unknown codepage to MACROMAN, but would get invalid byte sequence errors, I tried UTF-8, and didn't get invalid byte sequence errors, but in BIFF8, it kept encountering a lot of errors due to nil values here and there.
For me the question is which language/character uses that "strange" characterset in your XLS?
It's all alphanumeric characters. Language is set to English (US), I changed all items to font "Times New Roman"
I even tried running a macros to enforce all alphanumeric characters only.
REM ***** BASIC *****
Function AlphaNumericOnly(strSource As String) As String
Dim i As Integer
Dim strResult As String
For i = 1 To Len(strSource)
Select Case Asc(Mid(strSource, i, 1))
Case 32 To 32, 48 To 57, 65 To 90, 97 To 122: 'include 32 if you want to include space
strResult = strResult & Mid(strSource, i, 1)
End Select
Next
AlphaNumericOnly = strResult
End Function
Function CleanSheet(sheetName As String) As Boolean
Dim sh As Object, z As Object, c As Object
Dim qCells As Object, enuCells As Object
sh = ThisComponent.Sheets.getByName(sheetName)
z = sh.getCellRangeByName("A1:U300")
qCells = z.queryContentCells(-1)
enuCells = qCells.Cells.createEnumeration
Do While enuCells.hasMoreElements
c = enuCells.nextElement
If Left(c.Formula, 1) = "=" Then
Else
c.string = AlphaNumericOnly(c.String)
End If
Loop
CleanSheet = true
End Function
Sub CleanAll()
CleanSheet("GRID")
CleanSheet("RESULTS")
CleanSheet("PICTURES")
End Sub
Which Windows Version and which MS Excel Version created the orginial file? Can these people try to create the same file using LibreOffice on Windows?
After some messing around with different settings, the following combination fixed my issue:
Ok, good to know, thank you for reporting!
When parsing a complex structure excel file (i.e. lots of merges, etc), then modifying a few specific cells, the output file changes the cell locations of many unrelated cells around the sheet (both row/column position are being shifted for other cells).
Is there anything I should provide to help you be able to replicate the issue?