Open Boxylmer opened 2 years ago
Updates:
I rolled back to 0.7.10 and faced no issues. I'll have to stay here until there's some more knowledge about what's going on here. Could this be due to new versions in ZipFile? I didn't see any code that would cause breaking changes like this when moving from 0.7.x -> 0.8.x, but this transition is what causes the issue.
This is still present in 0.9.x unfortunately :(
I have this issue, too. Is there a fix, or do I have to roll back to 0.7.10 (and how do I do that, I wonder...)
ERROR: LoadError: AssertionError: Some internal files were not loaded into memory. Did you use
XLSX.open_xlsx_templateto open this file? Stacktrace: [1] writexlsx(output_source::String, xf::XLSX.XLSXFile; overwrite::Bool) @ XLSX C:\Users\TGebbels\.julia\packages\XLSX\U2Bcm\src\write.jl:64 [2] writexlsx(output_source::String, xf::XLSX.XLSXFile) @ XLSX C:\Users\TGebbels\.julia\packages\XLSX\U2Bcm\src\write.jl:60
Note: Can't roll back because I'm using Add keep_empty_rows kwarg
in 0.10.0
Did you use XLSX.open_xlsx_template to open this file?
Also, if my answer to this question is "yes", what does that imply?
I have this issue, too. Is there a fix, or do I have to roll back to 0.7.10 (and how do I do that, I wonder...)
ERROR: LoadError: AssertionError: Some internal files were not loaded into memory. Did you use
XLSX.open_xlsx_templateto open this file? Stacktrace: [1] writexlsx(output_source::String, xf::XLSX.XLSXFile; overwrite::Bool) @ XLSX C:\Users\TGebbels\.julia\packages\XLSX\U2Bcm\src\write.jl:64 [2] writexlsx(output_source::String, xf::XLSX.XLSXFile) @ XLSX C:\Users\TGebbels\.julia\packages\XLSX\U2Bcm\src\write.jl:60
Note: Can't roll back because I'm using
Add keep_empty_rows kwarg
in 0.10.0
Im still unsure. Personally I would roll back and edit your project to filter empty rows manually.
I managed to work around this for a while, but now it's got me again and now it's a bigger problem for me. I've done a little digging. I don't understand much, but here is what I've found:
ERROR: LoadError: AssertionError: Some internal files were not loaded into memory. Did you use `XLSX.open_xlsx_template` to open this file?
Stacktrace:
[1] writexlsx(output_source::String, xf::XLSX.XLSXFile; overwrite::Bool)
@ XLSX C:\Users\TGebbels\.julia\packages\XLSX\U2Bcm\src\write.jl:64
Using some different .xlsx files works just fine.
Line 64 in write.jl says
@assert all(values(xf.files)) "Some internal files were not loaded into memory. Did you use `XLSX.open_xlsx_template` to open this file?"
So I tried to see what was going on with
println(keys(template.files))
println(values(template.files))
which showed:
["xl/sharedStrings.xml", "xl/drawings/drawing1.xml", "customXml/item2.xml", "customXml/itemProps1.xml", "customXml/itemProps2.xml", "xl/worksheets/sheet1.xml", "xl/ctrlProps/ctrlProp1.xml", "_rels/.rels", "xl/workbook.xml", "xl/styles.xml", "[Content_Types].xml", "docProps/core.xml", "xl/_rels/workbook.xml.rels", "xl/worksheets/sheet2.xml", "customXml/_rels/item1.xml.rels", "xl/worksheets/_rels/sheet1.xml.rels", "customXml/item1.xml", "xl/theme/theme1.xml", "customXml/_rels/item2.xml.rels", "docProps/app.xml"]
Bool[1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1]
for a file that fails whereas, for a working file, it shows
["xl/sharedStrings.xml", "xl/drawings/drawing1.xml", "xl/worksheets/sheet1.xml", "xl/ctrlProps/ctrlProp1.xml", "_rels/.rels", "xl/workbook.xml", "xl/styles.xml", "[Content_Types].xml", "docProps/core.xml", "xl/_rels/workbook.xml.rels", "xl/worksheets/sheet2.xml", "xl/worksheets/_rels/sheet1.xml.rels", "xl/theme/theme1.xml", "docProps/app.xml"]
Bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
So it looks like XLSX.jl doesn't correctly handle the customXml content.
I can manually delete the customXml folder if I open the xlsx file as a zip. If I do that, the file will work again with XLSX.jl.
["xl/sharedStrings.xml", "xl/drawings/drawing1.xml", "xl/worksheets/sheet1.xml", "xl/ctrlProps/ctrlProp1.xml", "_rels/.rels", "xl/workbook.xml", "xl/styles.xml", "[Content_Types].xml", "docProps/core.xml", "xl/_rels/workbook.xml.rels", "xl/worksheets/sheet2.xml", "xl/worksheets/_rels/sheet1.xml.rels", "xl/theme/theme1.xml", "docProps/app.xml"]
Bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
However, it will no longer work with Excel which offers to try to repair the file. Once repaired by Excel, the file fails again but now with even more customXml elements!
["customXml/_rels/item3.xml.rels", "xl/sharedStrings.xml", "xl/drawings/drawing1.xml", "customXml/item2.xml", "customXml/itemProps1.xml", "customXml/item3.xml", "customXml/itemProps2.xml", "xl/worksheets/sheet1.xml", "xl/ctrlProps/ctrlProp1.xml", "_rels/.rels", "xl/workbook.xml", "xl/styles.xml", "[Content_Types].xml", "docProps/core.xml", "xl/_rels/workbook.xml.rels", "customXml/itemProps3.xml", "xl/worksheets/sheet2.xml", "customXml/_rels/item1.xml.rels", "xl/worksheets/_rels/sheet1.xml.rels", "customXml/item1.xml", "xl/theme/theme1.xml", "docProps/custom.xml", "customXml/_rels/item2.xml.rels", "docProps/app.xml"]
Bool[0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1]
I don't understand the internals of Excel and I don't understand how XLSX.jl works. However, I found the lines:
# ignore custom XML internal files
if startswith(f.name, "customXml")
continue
end
in the code to read the template file. Are you ignoring these elements on read but requiring them on write? It would be really great to see resolution of this issue.
Also see this thread on Discourse.
I’ve had a go at this with some success.
To start with, I edited read.jl
to remove the exclusion of the customXml
internal files from being processed by internal_xml_file_read
by commenting out the lines:
# ignore custom XML internal files
#if startswith(f.name, "customXml")
# continue
#end
This resulted in the following warnings:
┌ Warning: XMLError: xmlns: URI ed0bb2e3-83d5-46a8-9e95-92c1ca9cc599 is not absolute from XML Namespace module (code: 100, line: 1)
└ @ EzXML C:\Users\TGebbels\.julia\packages\EzXML\DL8na\src\error.jl:97
┌ Warning: XMLError: xmlns: URI 8fe8bdaf-491a-4d2e-89dd-756120b60898 is not absolute from XML Namespace module (code: 100, line: 1)
└ @ EzXML C:\Users\TGebbels\.julia\packages\EzXML\DL8na\src\error.jl:97
Besides these, the process seemed to work for my purposes. I don’t really know how significant these warnings are but I suppose Excel knows best what it puts in these elements of a .xlsx
file.
Second, I changed the initial condition so that the customXml
internal files were treated as binary files and simply “passed through” from read to write without being touched.
if !startswith(f.name, "customXml") && (endswith(f.name, ".xml") || endswith(f.name, ".rels"))
#if endswith(f.name, ".xml") || endswith(f.name, ".rels")
This also worked for my purpose and without producing any errors or warnings!
I ran tests on the revised code which gave the following results:
Testing Running tests...
Test Summary: | Pass Total Time
read test files | 23 23 3.2s
Test Summary: | Pass Total Time
Cell names | 326 326 0.0s
Test Summary: | Pass Total Time
getindex | 12 12 1.2s
Test Summary: | Pass Total Time
Time and DateTime | 6 6 0.1s
Test Summary: | Pass Total Time
number formats | 16 16 1.3s
Test Summary: | Pass Total Time
Defined Names | 24 24 0.5s
Test Summary: | Pass Total Time
Book1.xlsx | 32 32 0.3s
Test Summary: | Pass Total Time
book_1904_ptbr.xlsx | 9 9 0.3s
Test Summary: | Pass Total Time
numbers.xlsx | 75 75 0.0s
Test Summary: | Pass Total Time
Column Range | 9 9 0.1s
Test Summary: | Pass Total Time
CellRange iterator | 1 1 0.1s
Test Summary: | Pass Total Time
Table | 598 598 2.8s
Test Summary: | Pass Total Time
Helper functions | 32 32 0.5s
Test Summary: | Pass Total Time
Write | 73 73 0.3s
Test Summary: | Pass Total Time
Edit Template | 3 3 0.2s
Test Summary: | Pass Total Time
addsheet! | 10 10 0.1s
Test Summary: | Pass Total Time
Edit | 13 13 0.3s
Test Summary: | Pass Total Time
writetable | 183 183 1.3s
Test Summary: | Pass Total Time
Styles | 74 74 0.2s
Test Summary: | Pass Total Time
filemodes | 200 200 2.8s
Test Summary: | Pass Total Time
escape | 84 84 0.2s
Test Summary: | Pass Total Time
row_index | 1 1 0.1s
Test Summary: |Time
show xlsx | None 0.0s
Test Summary: | Pass Total Time
relative paths | 5 5 0.2s
Test Summary: | Pass Total Time
windows compatibility | 3 3 0.0s
Test Summary: | Pass Total Time
whitespace nodes | 8 8 0.0s
Test Summary: | Pass Total Time
inlineStr | 20 20 0.0s
Test Summary: | Pass Total Time
Tables.jl integration | 27 27 3.4s
Testing XLSX tests passed
I'm hoping this PR will be merged soon. I'm still having to use the dev'd version... 😃
When opening a file with
I get
Where the call is occurring during
writexlsx(filepath, xf, overwrite=true)
Originally thought to be a ram issue, I have this calling on hundreds of other excel sheets (some larger, some smaller) with no real telltale sign as to why one would give this error. What's going on here and how can I narrow down why this would happen in a particular file? Alternatively, what's another design pattern I can apply to large files I need to open and edit? I've tried
enable_cache=false
, but on rw it appears to be a requirement to also have it enabled. (https://github.com/felipenoris/XLSX.jl/blob/a9738321c945ad4f1a44c1c6406b260f04d411bb/src/read.jl#L184-187)