Error reading .xlsx Workbook has no worksheets

Matpli commented 2 years ago

I am trying to read a xlsx file but i have a error:

Error in read.xlsx.default(xlsxFile = xlsxFile, sheet = sheet, startRow = startRow, : Workbook has no worksheets

If I open and save before the file i can read the file, but I need to read it without open it before, it's possible ?

I am using the version '4.2.5’

JanMarvin commented 2 years ago

Hi @Matpli , without knowledge of the file it's impossible to say. It literally can be anything. Unless you can share it, I'm afraid there is nothing we can do.

Matpli commented 2 years ago

Hello, Thanks for your answer, the file is an output from a reader. You can find the file enclosed

Le mar. 12 avr. 2022 à 19:19, Jan Marvin Garbuszus @.***> a écrit :

Hi @Matpli https://github.com/Matpli , without knowledge of the file it's impossible to say. It literally can be anything. Unless you can share it, I'm afraid there is nothing we can do.

— Reply to this email directly, view it on GitHub https://github.com/ycphs/openxlsx/issues/344#issuecomment-1096991278, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADXWJ5NNDMG5HJOFXW4KALDVEWWDZANCNFSM5TH2DWFA . You are receiving this because you were mentioned.Message ID: @.***>

JanMarvin commented 2 years ago

Sorry but attaching it via mail does not work. You have to drag and drop it into the comment field of this issue to upload it via GitHub.

Matpli commented 2 years ago

2022-04-12-11-42-36-DP_Melanges1.xlsx

JanMarvin commented 2 years ago

Thanks for sharing the file. I've looked into it but have sad news. The issue is, that whatever wrote this file, was keeping it special. Every node in the entire workbook has a namespace attribute x: prepended. Perfectly valid xml, but we search for the strings without "x:" like <workbook ...>:

Your header:

<x:workbook xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<x:fileVersion appName="BioVendor Analytics"/>
<x:workbookPr date1904="0" showObjects="all" backupFile="0"/>

Vs our and other spreadsheet software (Excel, Libreoffice, Numbers etc.) headers:

<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" ...>
<fileVersion appName="xl" lastEdited="7" lowestEdited="7" rupBuild="11108"/>
<workbookPr defaultThemeVersion="166925"/>

openxlsx looks for "workbook" and cannot find it, because it is called "x:workbook".

Unfortunately there is no way for us to support both variants I'm afraid. You could clone the project and replace every xml node to x:, but that most likely will not be supported by us.

JanMarvin commented 2 years ago

That being said, there may be alternatives. One option would be to scan all xml/vml files for namespsaces and remove them. Although this is unlikely in the package (you are probably the first person to encounter such a problem). Instead, you could construct a local wrapper to use these files.

1) Unzip the xlsx file to any directory: unzip("broken_file.xlsx", exdir = "extraction_directory")
2) Change to this directory and apply gsub("<x:", "<", ...) to all xml files (readline(), gsub(), writeLine()).
3) Read the workbook with isUnzipped option when reading from the directory where you extracted the workbook. This way you can import the data into a workbook and play around with it in R, but writing it back to xlsx might create broken xlsx files (after all, there might be other side effects)

Matpli commented 2 years ago

Thanks for your explanations i will test the alternative with unzip xlsx file and replace <:x by <

JanMarvin commented 2 years ago

@Matpli I've pushed a pull request to openxlsx2 that is capable of reading your file. Though it is not for merging with the main branch and only to toy around. It can import and read your file and export it to xlsx too. The changes are minimal, but still large enough for me to hesitate merging them with our mainline

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 365 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stalled for 7 days with no activity.

ycphs / openxlsx

Error reading .xlsx Workbook has no worksheets #344