Closed Themanwithoutaplan closed 9 years ago
Sorry, I'm too stupid to work out how to add a file. You can get the schemas from the specification; you need Part 1 from http://www.ecma-international.org/publications/standards/Ecma-376.htm
See also https://gist.github.com/Themanwithoutaplan/af3861c4a1c76a04854a
So that wasn't a bug after all. You need to pass a namespace-to-filename map to parse_schema_file for it to import schemas.
Here's what I did to your gist: https://gist.github.com/plq/202269b57bae168d9563
I'd think twice before clicking on it though -- clone https://gist.github.com/202269b57bae168d9563.git instead. There's a parse.py in there that shows how it should work.
Currently it chokes on an attribute definition. I'm looking into it.
That's fixed as of 32579ee567e32dae7510b0bdfc5531f70080e2ed. Please try again with a full namespace map.
Thanks, but even after completing the namespace:file map I'm getting some errors:
DEBUG:spyne.interface.xml_schema.parser:%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Traceback (most recent call last):
  File "parse.py", line 24, in <module>
    parsed_schema = parse_schema_file('sml.xsd', files=ns_file_map)['some_ns']
  File "/Users/charlieclark/temp/spyne/spyne/spyne/util/xml.py", line 173, in parse_schema_file
    .parse_schema(elt)
  File "/Users/charlieclark/temp/spyne/spyne/spyne/interface/xml_schema/parser.py", line 593, in parse_schema
    c.print_pending(fail=True)
  File "/Users/charlieclark/temp/spyne/spyne/spyne/interface/xml_schema/parser.py", line 513, in print_pending
    raise Exception("there are still unresolved elements")
Exception: there are still unresolved elements
From what I can see the output looks pretty interesting though I still need to work out how I'd use a different class design for some of the stuff I'm hoping to do. But anything is better than reinventing the wheel!
can you commit that so I can have a look at it?
or paste here, I dunno
ns_file_map = {
    'http://purl.oclc.org/ooxml/officeDocument/relationships':
        'shared-relationshipReference.xsd',
    'http://purl.oclc.org/ooxml/officeDocument/sharedTypes':
        'shared-commonSimpleTypes.xsd',
    'http://purl.oclc.org/ooxml/drawingml/spreadsheetDrawing':
        'dml-spreadsheetDrawing.xsd',
    'http://purl.oclc.org/ooxml/drawingml/main':
        'dml-main.xsd',
    'http://purl.oclc.org/ooxml/drawingml/diagram':
        'dml-diagram.xsd',
    'http://purl.oclc.org/ooxml/drawingml/chart':
        'dml-chartDrawing.xsd',
    'http://purl.oclc.org/ooxml/drawingml/picture':
        'dml-picture.xsd',
    'http://purl.oclc.org/ooxml/drawingml/lockedCanvas':
        'dml-lockedCanvas.xsd',
}
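One way to check whether a map like this is complete is to list the namespaces a schema pulls in through xs:import and compare them with the map's keys. The following is only a sketch using the standard library; the helper names and the example.com namespaces are made up for illustration, and it does not call spyne at all. It can only rule out missing map entries, not other causes of unresolved elements.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def imported_namespaces(xsd_text):
    """Return the set of namespaces referenced by xs:import elements."""
    root = ET.fromstring(xsd_text)
    return {
        imp.get("namespace")
        for imp in root.iter("{%s}import" % XS)
        if imp.get("namespace")
    }


def missing_from_map(xsd_text, ns_file_map):
    """Return the imported namespaces that have no entry in ns_file_map."""
    return sorted(imported_namespaces(xsd_text) - set(ns_file_map))


# Hypothetical schema and map, not taken from the OOXML files.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.com/main">
  <xs:import namespace="http://example.com/shared" schemaLocation="shared.xsd"/>
  <xs:import namespace="http://example.com/drawing" schemaLocation="drawing.xsd"/>
</xs:schema>"""

sample_map = {"http://example.com/shared": "shared.xsd"}
missing = missing_from_map(SAMPLE_XSD, sample_map)
print(missing)
```

Running the same check over every file named in the map would also catch imports that are only reachable indirectly.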
if you check the logs, you'll see that that's because not all of simpleType is implemented. I've just implemented <xs:list>. <xs:union> remains. I'm not sure when I can do that.
Just parsing it is not enough, you must implement serialization and deserialization in protocol/xml.py as well, because otherwise defaults can't be read.
see: 4e37d9fc7c49795e6134246caec1b88bc2551b89
Please have a look at the parser.py and the code itself. The error you're getting is not fatal, it just informs you that spyne had to drop some types. So if you pass force_full_parse=False to parse_schema_* you will have access to what's already parsed.
I filed #422 and #423 as next steps to this issue. Patches are welcome.
Thanks very much for the information and the tips. Will look at the code when I've some time and may even submit some patches, if someone will hold my hand while I use git!
I'm currently interested in the "shape" of the Python classes generated in terms of the API they provide for developers. It would be great if I could dump my own hand-rolled code in favour of your more extensive and reliable code but I'll still want to change what gets generated. I'll provide more information on the follow-up issues but thank you again for your help so far. It's very much appreciated.
Glad to be of help.
If the schemas are set in stone, I see no harm in modifying the generated code. The generator is also at a very nascent stage (just 100 lines at this point) so it's OK to shape it to your needs.
Sorry, this is probably down to my lack of knowledge about git but I did a pull (you don't need an update like hg, right?) and then I got
from spyne import BODY_STYLE_BARE, BODY_STYLE_WRAPPED, BODY_STYLE_EMPTY
ImportError: cannot import name BODY_STYLE_BARE
wot. cd ..; rm -rf spyne; git clone git://github.com/arskom/spyne
Weird, I'd cloned it only last night. Seems to be running now with the force_full_parse option (skip_errors might be a better name for this option). I then got a KeyError due to "some_ns"; presumably I just check the keys of the returned schema? And then I can start looking for individual types.
yes, you need to put the targetNamespace of the schema you want there.
1274a46372a778b8c8812f49cde7dd9290ae9dd5
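To find the key to use, the targetNamespace can be read straight from the schema file's root element. This is a minimal sketch with the standard library; `target_namespace`, the sample schema and the placeholder dictionary standing in for the parser's return value are all assumptions for illustration, not spyne API.

```python
import xml.etree.ElementTree as ET


def target_namespace(xsd_bytes):
    """Return the targetNamespace of the schema root, or None if absent."""
    root = ET.fromstring(xsd_bytes)
    return root.get("targetNamespace")


# Hypothetical schema, not one of the OOXML files.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    b'targetNamespace="http://example.com/sheet"/>'
)

ns = target_namespace(SAMPLE)

# Stand-in for a parsed result keyed by namespace; the value is a placeholder.
parsed = {ns: "schema-object-placeholder"}

if ns not in parsed:
    raise KeyError("%r not found; available keys: %s" % (ns, sorted(parsed)))
schema = parsed[ns]
print(ns)
```

Listing the available keys in the error message makes a wrong or misspelled namespace easier to spot than a bare KeyError.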
Okay, got that far myself. What do I need to do to get the class "definition"? I thought I saw a method for that somewhere.
I'm just playing around at the moment but if I get one of the generated classes it doesn't seem to be enforcing any of the constraints.
To get an idea of what I'm looking to do you might want to look at one of the classes I've created in openpyxl. Starting initially with descriptors to enforce constraints I've added some stuff to the base class and metaclass.
This is an example of a terribly designed bit of the spec with unnecessary nesting of elements instead of attributes, downright cryptic names because of abbreviation and general nastiness if you want to interact with it as a programmer.
As I hope you can see from the create and serialise methods we're both working along very similar lines.
Okay, got that far myself. What do I need to do to get the class "definition"? I thought I saw a method for that somewhere.
It's there in the xml example I sent you earlier.
I'm just playing around at the moment but if I get one of the generated classes it doesn't seem to be enforcing any of the constraints.
lxml enforces the constraints, spyne only generates the schema (and once validated, deserializes the document). Have a look at the xml protocol's validate_lxml function and the XmlSchema class in spyne.interface.
As I hope you can see from the create and serialise methods we're both working along very similar lines.
I disagree. You're doing everything manually :)
haha, not any more. I've started working with the first generated code to test some of the ideas; nesting works quite nicely. We can't wait for validation at serialisation time and lxml is also not a hard dependency in the project, so a user needs a TypeError when creating an instance manually; this is what the library is for.
Some of the silliness in the current code (I deliberately showed one example which has weird behaviour) is to keep generated XML similar to what other programs do.
Ah, now I understand what you want.
You only need to write your version of quick and dirty genpy.py to generate your type of class definitions from schema data. Then you can forego both lxml and spyne as a dependency.
Unless, of course, you want to implement validation-on-assignment for Spyne. It's something I wanted to look at for some time. Then your efforts would be useful to a broader audience outside of openpyxl. Your choice, of course.
Basically, yes. lxml is a test requirement and I'd have no trouble making Spyne an optional one for development, especially with the way it handles all the relevant schema. So a command line might be something like python classify.py CT_AreaSer > AreaSer.py which would generate the whole caboodle of relevant imports and classes in order for this particularly nasty bit of a particularly nasty bit of the schema, that itself is peculiarly nasty.
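A rough idea of what such a generator could look like, using only the standard library. Everything here is hypothetical: classify.py does not exist in the thread, `generate_class` is an invented name, and the sample complexType is a toy rather than the real CT_AreaSer. It ignores types, imports, nested complexTypes, element references and names that clash with Python keywords.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Toy schema for illustration only, not taken from the OOXML schemas.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="CT_Example">
    <xs:sequence>
      <xs:element name="idx" type="xs:unsignedInt"/>
      <xs:element name="order" type="xs:unsignedInt"/>
    </xs:sequence>
    <xs:attribute name="label" type="xs:string"/>
  </xs:complexType>
</xs:schema>"""


def generate_class(xsd_text, type_name):
    """Return Python source for a plain class mirroring one complexType."""
    root = ET.fromstring(xsd_text)
    matches = [ct for ct in root.iter(XS + "complexType")
               if ct.get("name") == type_name]
    if not matches:
        raise KeyError(type_name)
    wanted = (XS + "element", XS + "attribute")
    # Collect named child elements and attributes in document order.
    fields = [node.get("name") for node in matches[0].iter()
              if node.tag in wanted and node.get("name")]
    class_name = type_name[3:] if type_name.startswith("CT_") else type_name
    args = "".join(", %s=None" % name for name in fields)
    lines = ["class %s(object):" % class_name, "",
             "    def __init__(self%s):" % args]
    if fields:
        lines.extend("        self.%s = %s" % (name, name) for name in fields)
    else:
        lines.append("        pass")
    return "\n".join(lines) + "\n"


source = generate_class(SAMPLE_XSD, "CT_Example")
print(source)
```

Wrapping this in a small command-line script that takes the type name as its argument and writes to stdout would give the redirect-to-file usage described above.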
Happy to contribute to the project where I can because I'm sure others will find it useful but I'm still finding my feet in it. Meeting Eric (the other project maintainer) at FOSDEM this weekend and hope to be able to discuss it with him. Got to be an improvement on the largely procedural code we inherited from the initial port from PHP which spreads parser, API and writer code liberally across the project.
The descriptors we use to implement typing should be reusable in any project (based on a cookbook recipe). See https://bitbucket.org/openpyxl/openpyxl/src/6b884f3f47f66358aa5c86f0e4fb6afabfb70c60/openpyxl/descriptors/base.py?at=2.2
We make types first level objects because of the convenience when coding. From what I've seen of your code you stay closer to the XML but expected_type=… should be usable. Let me know if that would be useful and I can look at integrating it.
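For readers unfamiliar with the pattern, validation on assignment with a descriptor can be sketched roughly as follows. This is an illustration of the general technique, not a copy of the openpyxl code linked above; the `Series` class and the `allow_none` flag are assumptions, and `__set_name__` needs Python 3.6 or later.

```python
class Typed(object):
    """Descriptor that rejects values of the wrong type on assignment."""

    def __init__(self, expected_type, allow_none=False):
        self.expected_type = expected_type
        self.allow_none = allow_none
        self.name = None

    def __set_name__(self, owner, name):
        # Called by Python when the owning class body is created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        if not (value is None and self.allow_none):
            if not isinstance(value, self.expected_type):
                raise TypeError("%s: expected %s, got %s" % (
                    self.name,
                    self.expected_type.__name__,
                    type(value).__name__,
                ))
        instance.__dict__[self.name] = value


class Series(object):
    # Hypothetical model class used only for this example.
    idx = Typed(expected_type=int)
    title = Typed(expected_type=str, allow_none=True)

    def __init__(self, idx, title=None):
        self.idx = idx
        self.title = title


s = Series(idx=1)
try:
    s.idx = "one"
except TypeError as exc:
    print(exc)
```

Because the check runs in `__set__`, the same rule applies in `__init__` and on any later assignment, so an invalid value never reaches the serialiser.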
When it comes to generating code we have an additional flag for nested elements. These are child elements in the schema that can almost always be better represented as attributes. I guess I can just take gen_py as a base and swap out the base class.
Spyne's got its own version of expected_type, namely ModelBase.Value.
These are not enforced at all, though. I'd be happy to make them an optional part of Spyne. I guess that'd change validation code dramatically, but I'm not afraid :)
What do you think about Python 3? I no longer accept Python 3-incompatible code, and the number of tests that fail under Python 3 is supposed to be declining. Spyne's xml parts already work in Python 3 and I wouldn't want this to change as we already advertise it.
We support 2.6, 2.7, 3.3 and 3.4. Python 3 syntax is the standard and the compatibility imports are minimal. Apparently, io.BytesIO is slow on Python 2.6 because it uses StringIO and not cStringIO in the background but apart from that I've not heard of any real problems. Things get hairier if you want to keep support for 2.5 and earlier. 3.2 is just a pain because of the lack of support for the unicode literal which means you will get confusing failures.
I haven't worked out how your nested classes work but the type hierarchy is basically the same. We use __set__() for validation on assignment where you have staticmethods.
Any nesting is done via ComplexModelBase.
You'll probably have to come up with a ComplexModelValidatedOnAssignmentBase that has the necessary descriptor setup to validate values. From there you'll hook up to the usual spyne machinery and use genpy.py to your heart's content.
btw, re: Python 3, I'm also supporting 2.6, 2.7 and 3.3+
Doing my own version of ComplexModelBase looks like it would be a good topic for a sprint! Are you going to be at PyCon?
There's now an initial implementation for validation-on-assignment. You need to pass voa to the type customization line.
Thanks for the update. With now over 200 classes based on my own metaclass I won't be switching to Spyne for that but I might revisit the generator. Gave mention of Spyne during my talk at PyCon France.
The Office OpenXML schemas are spread out across multiple files. parse_schema_file seems to struggle with the various namespaces in use. It also struggles with the encoding declaration of the file, which is weird, because lxml doesn't when I read it. I wonder if that's because it's using fromstring(file.read()) rather than parse(file, parser)?
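The difference matters because the parser can only honour an encoding declaration if it sees the raw bytes. lxml is documented to reject an already-decoded unicode string that still carries an encoding declaration, which would fit the symptom, though whether spyne actually reads the file that way is not confirmed here. The sketch below uses the standard library's ElementTree as a stand-in for lxml, with a made-up ISO-8859-1 document, to show the two byte-based ways of reading a file.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Made-up document: declared as ISO-8859-1, with one non-ASCII byte (0xE9).
RAW = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n<name>caf\xe9</name>'

with tempfile.NamedTemporaryFile(suffix=".xsd", delete=False) as handle:
    handle.write(RAW)
    path = handle.name

try:
    # Option 1: give the parser the file; it reads bytes and decodes them
    # according to the declaration.
    from_file = ET.parse(path).getroot().text

    # Option 2: read the file in binary mode and pass the bytes on unchanged.
    with open(path, "rb") as fh:
        from_bytes = ET.fromstring(fh.read()).text
finally:
    os.remove(path)

print(from_file == from_bytes)
```

Opening the file in text mode and passing the resulting string is the variant to avoid, since the text has then been decoded once already and the declaration no longer describes it.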