Closed djui closed 7 years ago
This should now be resolved on the master branch.
https://github.com/tealeg/xlsx/commit/7ffa59c7ac2676c3bff57e8cef15c18391545b97 does not fix the problem. Instead of removing the style, it sets it (from style=""
) to style="none"
. Neither value are valid. One example:
❯ python3 -c "from openpyxl import load_workbook; wb2 = load_workbook('output-master.xlsx')"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/excel.py", line 203, in load_workbook
parsed_styles = read_style_table(archive)
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 181, in read_style_table
p.parse()
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 50, in parse
self.border_list = IndexedList(self.parse_borders())
File "/usr/local/lib/python3.5/site-packages/openpyxl/utils/indexed_list.py", line 17, in __init__
for idx, val in enumerate(iterable):
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 99, in parse_borders
yield Border.from_tree(border_node)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 65, in from_tree
obj = desc.expected_type.from_tree(el)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 78, in from_tree
return cls(**attrib)
File "/usr/local/lib/python3.5/site-packages/openpyxl/styles/borders.py", line 46, in __init__
self.style = style
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 143, in __set__
super(NoneSet, self).__set__(instance, value)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 128, in __set__
raise ValueError(self.__doc__)
ValueError: Value must be one of {'thick', 'hair', 'mediumDashDotDot', 'thin', 'dashDot', 'dashDotDot', 'dashed', 'mediumDashDot', 'medium', 'double', 'dotted', 'mediumDashed', 'slantDashDot'}
Some more context and examples:
053c240.xlsx 27f0f70.xlsx 7ffa59c.xlsx
$ python3 -c "from openpyxl import load_workbook; wb2 = load_workbook('053c240.xlsx')"
$ python3 -c "from openpyxl import load_workbook; wb2 = load_workbook('27f0f70.xlsx')"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/excel.py", line 203, in load_workbook
parsed_styles = read_style_table(archive)
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 181, in read_style_table
p.parse()
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 50, in parse
self.border_list = IndexedList(self.parse_borders())
File "/usr/local/lib/python3.5/site-packages/openpyxl/utils/indexed_list.py", line 17, in __init__
for idx, val in enumerate(iterable):
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 99, in parse_borders
yield Border.from_tree(border_node)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 65, in from_tree
obj = desc.expected_type.from_tree(el)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 78, in from_tree
return cls(**attrib)
File "/usr/local/lib/python3.5/site-packages/openpyxl/styles/borders.py", line 46, in __init__
self.style = style
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 143, in __set__
super(NoneSet, self).__set__(instance, value)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 128, in __set__
raise ValueError(self.__doc__)
ValueError: Value must be one of {'medium', 'dotted', 'mediumDashed', 'dashDotDot', 'hair', 'thin', 'double', 'dashed', 'mediumDashDot', 'slantDashDot', 'dashDot', 'thick', 'mediumDashDotDot'}
$ python3 -c "from openpyxl import load_workbook; wb2 = load_workbook('7ffa59c.xlsx')"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/excel.py", line 203, in load_workbook
parsed_styles = read_style_table(archive)
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 181, in read_style_table
p.parse()
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 50, in parse
self.border_list = IndexedList(self.parse_borders())
File "/usr/local/lib/python3.5/site-packages/openpyxl/utils/indexed_list.py", line 17, in __init__
for idx, val in enumerate(iterable):
File "/usr/local/lib/python3.5/site-packages/openpyxl/reader/style.py", line 99, in parse_borders
yield Border.from_tree(border_node)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 65, in from_tree
obj = desc.expected_type.from_tree(el)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/serialisable.py", line 78, in from_tree
return cls(**attrib)
File "/usr/local/lib/python3.5/site-packages/openpyxl/styles/borders.py", line 46, in __init__
self.style = style
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 143, in __set__
super(NoneSet, self).__set__(instance, value)
File "/usr/local/lib/python3.5/site-packages/openpyxl/descriptors/base.py", line 128, in __set__
raise ValueError(self.__doc__)
ValueError: Value must be one of {'slantDashDot', 'thick', 'thin', 'dotted', 'double', 'mediumDashed', 'dashDotDot', 'hair', 'dashed', 'medium', 'dashDot', 'mediumDashDotDot', 'mediumDashDot'}
@djui the ECMA-376 standard for Office Open XML (and the XML stylesheets provided with it) define ST_BorderStyle with "none" as a valid value. That makes me think this must be a bug in openpyxl.
@pwaller good stuff!
@tealeg openpyxl just replied and refused to change it because apparently they interpret the spec to mean "="none"
implies 'missing'". Any chance we can fix it in light of this fact? I've asked if they can reconsider and point at the spec, but unless one side relents then these two nice pieces of software won't be able to interoperate.
Blame the specification for unnecessary ambiguity. Support for this wart requires special casing which adds nothing to the result. No attribute should be serialised as "none" unless this is required (makes no sense for border sides styles).
@tealeg openpyxl appears unwilling fix the problem their side and says they have filed issues with the spec. Any chance of revisiting it on your side in the mean time? Otherwise we can't produce files that people are consuming using openpyxl.
@pwaller I'll reopen the issue. It's somewhat trickier than it sounds to fix it, but maybe someone is up for a challenge.
Thanks @tealeg!
@tealeg looking into it briefly myself, do you have any idea why omitempty
does not do what I thought it would?
To answer my own question, it feels a bit like it must be a bug in encoding/xml where attr
and omitempty
don't play well together.
Apparently MarshalerAttr
sounds like it enables serialising with the attribute removed:
If MarshalXMLAttr returns the zero attribute Attr{}, no attribute will be generated in the output. MarshalXMLAttr is used only for struct fields with the "attr" option in the field tag.
I guess the fix would then be to make the Style
value a type alias of string
which implements MarshalXMLAttr
and returns xml.Attr{}
for the empty string. Does that sound right to you?
Oh, it also sounds like instead the field could be made into a pointer.
I guess both of those approaches may have problems with preserving backwards compatibility.
@pwaller I've tried the pointer approach - it works but you'll trigger a lot of changes elsewhere in the code. There are a number of small issues like this resulting from encoding/xml. Custom marshalling and unmarshalling using the encoding/xml library is the best approach, but also lengthy to achieve - again, I've looked at that for other reasons, it's a pain to start, but would be well worth it in the long run.
https://github.com/tealeg/xlsx/commit/27f0f707d42799573616e9166c4be707dd4d2621 introduced a fix for merged cells AFAIK but introduced a regression which created borders styles with attribute empty string, which is not definition compliant. Value must be one of
{'double', 'mediumDashDot', 'mediumDashDotDot', 'hair', 'dashDot', 'slantDashDot', 'medium', 'thick', 'mediumDashed', 'dashDotDot', 'thin', 'dotted', 'dashed'}