JuliaIO / LibExpat.jl

Julia interface to the Expat XML parser library

xpath tests #8

Open amitmurthy opened 11 years ago

amitmurthy commented 11 years ago

Add tests for XPath, possibly based on http://msdn.microsoft.com/en-us/library/ms256086.aspx

amitmurthy commented 11 years ago

cc: @vtjnash

Some not-so-trivial XPath queries are failing. It would be good to add some standard tests for them.

vtjnash commented 11 years ago

Sounds good, will do.

Can you give some examples of XPath queries that are failing for you? There is a class of unimplemented functionality (e.g. parenthesized expressions, many functions, namespaces, returning objects of types other than nodes), but there are also likely still some legitimate bugs.

amitmurthy commented 11 years ago

From the above xml

julia> pd["@specialty"]
ERROR: no method convert(Type{ParsedData},ASCIIString)
 in push! at array.jl:663
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:798
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:516
 in getindex at /home/amitm/.julia/LibExpat/src/xpath.jl:898

julia> pd["/@specialty"]
0-element ParsedData Array

julia> pd["//@specialty"]
ERROR: no method convert(Type{ParsedData},ASCIIString)
 in push! at array.jl:663
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:798
 in xpath_descendant at /home/amitm/.julia/LibExpat/src/xpath.jl:891
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:847
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:818
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:516
 in getindex at /home/amitm/.julia/LibExpat/src/xpath.jl:898

julia> pd["/bookstore@specialty"]
0-element ParsedData Array

julia> pd["/bookstore/@specialty"]
ERROR: no method convert(Type{ParsedData},ASCIIString)
 in push! at array.jl:663
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:798
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:806
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:840
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:818
 in xpath at /home/amitm/.julia/LibExpat/src/xpath.jl:516
 in getindex at /home/amitm/.julia/LibExpat/src/xpath.jl:898

The query below returns two objects but should return only one:

julia> pd["/bookstore/book[1]/author/award"]
2-element ParsedData Array:
 <award>Trenton Literary Review Honorable Mention</award>
 <award>Pulitzer</award> 

I cannot get the text either:

julia> pd["/bookstore/book[1]/author/award/text()"]
0-element ParsedData Array

I have not done extensive testing; I was just thinking of using this interface and ran into problems.

vtjnash commented 11 years ago

Some of these are intentionally not supported currently (retrieving attributes via pd["@specialty"] or text via text()), so that there is only one possible return type.

These ones show correct behavior: pd["/bookstore@specialty"] and pd["/@specialty"].

This one seems wrong: pd["/bookstore/book[1]/author/award"]

amitmurthy commented 11 years ago

pd["/bookstore@specialty"] and pd["/@specialty"] are returning an empty array.

amitmurthy commented 11 years ago

How does one get the text using xpath then? pd["/bookstore/magazine/price#string"] does not work either.

amitmurthy commented 11 years ago

It would also be good to list the supported XPath subset more explicitly in the README.

vtjnash commented 11 years ago

This one, pd["/bookstore/book[1]/author/award"], succeeds for me:

julia> pd["/bookstore/book[1]/author/award"]
1-element ParsedData Array:
 <award>Trenton Literary Review Honorable Mention</award>

julia> pd["/bookstore/book[3]/author/award"]
1-element ParsedData Array:
 <award>Pulitzer</award>

I'll add an API to support the other node types.
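
As a cross-check for the positional-predicate results above, here is a minimal sketch using Python's standard-library ElementTree rather than LibExpat.jl. The XML is a hypothetical stand-in, since the sample document is not reproduced in this thread; only the two award strings and the specialty attribute name come from the output quoted above, and the attribute value "novel" is an assumption. ElementTree implements only a subset of XPath, so attributes and text are read through element properties here, not returned as XPath results.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the sample XML; the real document is not shown
# in this thread, and the attribute value "novel" is assumed.
SAMPLE = """\
<bookstore specialty="novel">
  <book><author><award>Trenton Literary Review Honorable Mention</award></author></book>
  <book><author></author></book>
  <book><author><award>Pulitzer</award></author></book>
</bookstore>
"""

root = ET.fromstring(SAMPLE)  # root is the <bookstore> element

# book[1] selects only the first <book> child, so one award is expected.
first_awards = [a.text for a in root.findall("./book[1]/author/award")]

# book[3] selects only the third <book> child.
third_awards = [a.text for a in root.findall("./book[3]/author/award")]

# Attributes are read from the element itself, not via an XPath result.
specialty = root.get("specialty")

print(first_awards)   # ['Trenton Literary Review Honorable Mention']
print(third_awards)   # ['Pulitzer']
print(specialty)      # novel
```

Cases like these could serve as a starting point for the standard tests requested at the top of this issue, with each query paired with the node list it is expected to return.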