Open ppalaga opened 11 months ago
Failing test merged; interesting that validation works for string
value but fails for other scalars (int, date-time).
And apparently only for writer-side. I wonder if typed write methods are missing some callbacks...
Oh wait. No, can't be since this is just read/write'ing content, not using typed API.
Hmmh. Not super easy to figure out. But the difference wrt xs:string
is in GenericMsvValidator.validateElementAndAttributes()
:
int stringChecks = mCurrAcceptor.getStringCareLevel();
where xs:string
gets Acceptor.STRING_IGNORE
(i.e. anything goes) but others Acceptor.STRING_STRICT
.
Ok, can't quite figure it out, but I assume sequence of validation calls in this case is incorrect for writer side; some call missing (or extra call).
At least there is reproduction if I have time to go back digging or (ideally) someone else has itch and does spelunking into code :) (would be happy to merge a PR for fix)
Thanks for having a look @cowtowncoder!
assume sequence of validation calls in this case is incorrect for writer side; some call missing (or extra call).
Something like that would also be my hypothesis: The code perhaps first checks whether the content is an int
before it checks whether xsi:nil
is true
? It does not matter for strings because those can have zero length.
Thanks for having a look @cowtowncoder!
assume sequence of validation calls in this case is incorrect for writer side; some call missing (or extra call).
Something like that would also be my hypothesis: The code perhaps first checks whether the content is an
int
before it checks whetherxsi:nil
istrue
? It does not matter for strings because those can have zero length.
All actual validation state handling is done by MSV; Woodstox side just calls various methods to "feed" content.
As per my other note one observable difference is that xs:string
can have exactly any content so MSV does not check anything.
I think I figured out, what is the problem.
I traced the method calls using AspectJ for both the reader and writer validation - here is the code: https://github.com/ppalaga/woodstox/commits/i179-trace/
To produce the traces you need to do the following:
mvn clean test -Dtest=TestW3CSchemaNillable179#testNillableIntReader > reader-trace.txt
mvn clean test -Dtest=TestW3CSchemaNillable179#testNillableInt > writer-trace.txt
I am attaching the traces:
In those the following parts are interesting:
In the reader scenario, reading the nil
attribute returns true
as expected:
> com.ctc.wstx.msv.GenericMsvValidator.validateElementStart("nillableInt":java.lang.String, "http://server.hw.demo/nillable":java.lang.String, "nl":java.lang.String)
...
> com.ctc.wstx.msv.AttributeProxy.getValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
> com.ctc.wstx.sr.InputElementStack.getAttributeValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
> com.ctc.wstx.sr.InputElementStack.findAttributeIndex("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
> com.ctc.wstx.sr.AttributeCollector.findIndex("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
> com.ctc.wstx.sr.Attribute.hasQName("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
< com.ctc.wstx.sr.Attribute.hasQName("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : true:boolean
< com.ctc.wstx.sr.AttributeCollector.findIndex("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : 0:int
< com.ctc.wstx.sr.InputElementStack.findAttributeIndex("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : 0:int
> com.ctc.wstx.sr.InputElementStack.getAttributeValue(0:int)
> com.ctc.wstx.sr.InputElementStack.getAttrCollector()
< com.ctc.wstx.sr.InputElementStack.getAttrCollector() : com.ctc.wstx.sr.AttributeCollector
> com.ctc.wstx.sr.AttributeCollector.getValue(0:int)
> com.ctc.wstx.util.TextBuilder.getAllValues()
< com.ctc.wstx.util.TextBuilder.getAllValues() : "true":java.lang.String
> com.ctc.wstx.sr.Attribute.getValue("true":java.lang.String)
< com.ctc.wstx.sr.Attribute.getValue("true":java.lang.String) : "true":java.lang.String
but it is a bit different in the writer scenario:
> com.ctc.wstx.msv.GenericMsvValidator.validateElementStart("nillableInt":java.lang.String, "http://server.hw.demo/nillable":java.lang.String, "nl":java.lang.String)
...
> com.ctc.wstx.msv.AttributeProxy.getValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
> com.ctc.wstx.sw.BaseStreamWriter.getAttributeValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
< com.ctc.wstx.sw.BaseStreamWriter.getAttributeValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : null:java.lang.String
< com.ctc.wstx.msv.AttributeProxy.getValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String) : null:java.lang.String
I was wondering why BaseStreamWriter.getAttributeValue("http://www.w3.org/2001/XMLSchema-instance":java.lang.String, "nil":java.lang.String)
returns null
. Here is why:
It always returns null, because it is not implemented.
@cowtowncoder, could you please suggest some way forward?
Great detective work @ppalaga !
I wish I remembered the context here, but from what you include I think it is what it say: on writer-side, attribute info (wrt currently written element) is not accessible. Likely because it is not being retained. And if so, it'd need to be kept track of the case where there is validator to call (otherwise it is additional overhead with no benefit).
Thanks, sounds good. The main questions from my side are: (1) is it worth implementing (2) any chance that somebody from the project does that?
@ppalaga Right now I can't promise to work on this, but I always try to make time to help with PRs if someone else does. If that helps at all?
Fixing this would definitely be useful, for writer-side validation.
Thanks for the statement, @cowtowncoder.
Could you please give some hints about the design of writer validation? Especially, the following topics would be interesting:
InputElementStack
for the writer scenario as well?
- Do we need Typed and Validating subclasses of BaseStreamWriter as they exist for BasicStreamReader?
I see there is TypedStreamWriter, but where under its subclasses Simple/Repairing should the new Validating writer be hanged?
No, should not need new sub-classes as typed writes and pluggable validation already exist on writer side.
Validating reader tests are in packages ending with .vstream
(v
for validating), split in 3 ones I think:
but there are also Validating Write tests in:
One confusing aspect of naming is that there are, I think, 2 hierarchies of -Writer
s:
-StreamWriter
s extending Stax XMLStreamWriter
-Writer
s with XMLWriter
, used by stream writers for actual content generationValidation seems to be mostly (but maybe not completely) handled by StreamWriter implementations; actual validation is mValidator
, and it is being called by stream writer based on write calls to essentially maintain and check validator state in parallel with actual token stream writes.
So: I do not think new classes are needed; most other validation functionality exists. But I do not remember what my thinking was with missing state keeping for attributes -- or, possibly just wiring access to state that might be kept. It is possible state is not kept, however, since whereas on StreamReader side there is natural need to access attribute values, on writer side there isn't (they are just to be output and XMLStreamWriter has no accessors).
This issue occurs when CXF is validating an incoming SOAP message against W3CSchema and the message contains a an
xs:dateTime
orxs:int
elements havingxsi:nil="true"
:CXF calls
validateAgainst(schema)
on a writer which gets the events copied from the reader.A plain Woodstox reproducer is available in PR https://github.com/FasterXML/woodstox/pull/180 . It also makes sure that the failing XML documents are accepted by
javax.xml.validation
.