senjuhashirama / pugixml

Automatically exported from code.google.com/p/pugixml

Selecting a node at a particular position fails #232

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I'm not sure whether I'm misusing the library, but I've looked through the 
documentation (including the limitations section) and I can't find an example 
of how to do this. The results look suspicious as well.

What steps will reproduce the problem?
Write an xpath expression to select a node at a particular position, for 
example I'm using:
//aside[@id='aside']/div/a/img[1]/@src
or
//descendant::aside[@id='aside']/child::div/child::a/child::img[position()=2]/@src

(note that the html has been turned into a valid xml file through another 
library)

What is the expected output? What do you see instead?
When I specify a position either through the syntax [position()=2] or the 
abbreviated syntax [2], I'd expect to see just the second match, if available.
Instead, when passing [position()=1] or [1], I get all of the matches (4 of 
them in my test), as if no position() was specified. When passing any other 
number ([position()=2], [0], [8], or [3], for example), I get no results at all.

Which version of pugixml are you using? On what operating system/compiler?
Using pugixml 1.4 on Sabayon Linux 64 bit, and gcc 4.9 compiler. Equo's output:
$ equo search pugixml
╠  @@ Searching...
╠      @@ Package: dev-libs/pugixml-1.4 branch: 5, [sabayon-weekly] 
╠          Available:     version: 1.4 ~ tag: NoTag ~ revision: 0
╠          Installed:     version: 1.4 ~ tag: NoTag ~ revision: 0
╠          Slot:          0
╠          Homepage:      http://pugixml.org/ 
╠                         https://github.com/zeux/pugixml/ 
╠          Description:   Light-weight, simple, and fast 
╠                         XML parser for C++ with XPath support 
╠          License:       MIT
╠   Keyword:  pugixml
╠   Found:    1 entry

Please provide any additional information below.
Code snippet:
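#include <iostream>
#include <sstream>
#include <pugixml.hpp>

// Fragment from inside main(); cleanHtml and xpath are defined earlier in the program.
{
    // Parse the cleaned-up HTML as XML.
    pugi::xml_document doc;
    std::istringstream iss(cleanHtml);
    pugi::xml_parse_result result(doc.load(iss));
    if (not result) {
        std::cerr << "Error parsing the source XML: " << result.description() << "\n";
        return 1;
    }

    // Evaluate the XPath expression and print every match.
    pugi::xpath_node_set xpathRes = doc.select_nodes(xpath);
    for (pugi::xpath_node_set::const_iterator itFind(xpathRes.begin()), itFindEND(xpathRes.end());
         itFind != itFindEND; ++itFind) {
        const pugi::xpath_node& node = *itFind;
        // An xpath_node holds either a node or an attribute.
        if (node.node()) {
            std::cout << node.node().name() << ": " << node.node().value() << "\n";
        }
        else if (node.attribute()) {
            std::cout << node.attribute().name() << ": " << node.attribute().value() << "\n";
        }
    }
}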

Original issue reported on code.google.com by de...@gmx.it on 7 Jun 2014 at 8:06

GoogleCodeExporter commented 9 years ago
Can you attach an example XML that you're querying?

Original comment by arseny.k...@gmail.com on 7 Jun 2014 at 8:08

GoogleCodeExporter commented 9 years ago
That's the cleaned up html. Also feel free to use the html from the link in 
there, it's my own server. You can also download the full code of the program 
I'm writing, you will find it on bitbucket under the name duckscraper.

Original comment by de...@gmx.it on 7 Jun 2014 at 11:27

Attachments:

GoogleCodeExporter commented 9 years ago
The query result should be a node set as defined by XPath specification.
You can check this with a different XPath evaluator, e.g. this one (powered by 
the Java API for XML Processing, so I'd expect it to be compliant as well): 
http://www.utilities-online.info/xpath/?save=113e9280-3f9b-4bd7-8603-86ff9c745efd-xpath#.U5Pgi5RdVMs

The reason is that the position filter applies to a node's position within the 
child::img axis, i.e. your query says "get the src attribute of the first 
<img> child of <a>, for *all* occurrences of //aside[@id='aside']/div/a".

What you want is a query like this:

(//descendant::aside[@id='aside']/child::div/child::a/child::img)[position()=2]/@src
(//aside[@id='aside']/div/a/img)[2]/@src

(note the parens)
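
To make the difference concrete, here is a small toy model in plain C++. It is only a sketch of the evaluation order, not pugixml's implementation and not a pugixml API: the Anchors type and the two select_* functions are hypothetical names made up for illustration, and the document is reduced to "one list of <img> src values per <a> element".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the document: each inner vector holds the src
// values of the <img> children of one <a> element, in document order.
using Anchors = std::vector<std::vector<std::string>>;

// Models a/img[k]/@src: the predicate is tested against each <a>'s own
// list of <img> children, so every <a> can contribute one result.
std::vector<std::string> select_per_step(const Anchors& anchors, std::size_t k) {
    std::vector<std::string> out;
    for (const auto& imgs : anchors) {
        if (k >= 1 && k <= imgs.size()) out.push_back(imgs[k - 1]);
    }
    return out;
}

// Models (a/img)[k]/@src: all <img> nodes are collected in document order
// first, then the predicate picks at most one node from the combined set.
std::vector<std::string> select_grouped(const Anchors& anchors, std::size_t k) {
    std::vector<std::string> all;
    for (const auto& imgs : anchors) {
        all.insert(all.end(), imgs.begin(), imgs.end());
    }
    std::vector<std::string> out;
    if (k >= 1 && k <= all.size()) out.push_back(all[k - 1]);
    return out;
}
```

Assuming each of the four <a> elements holds a single <img> (which would be consistent with the results reported above), select_per_step with k = 1 returns all four src values and returns nothing for k = 2, while select_grouped with k = 2 returns only the second image's src.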

Original comment by arseny.k...@gmail.com on 8 Jun 2014 at 4:06

GoogleCodeExporter commented 9 years ago
> The query result should be a node set as defined by XPath specification.
Err, I meant "a node set of size 4".

Original comment by arseny.k...@gmail.com on 8 Jun 2014 at 4:07

GoogleCodeExporter commented 9 years ago

Original comment by arseny.k...@gmail.com on 1 Jul 2014 at 2:52

GoogleCodeExporter commented 9 years ago
Hi,

I want to know how to read and write an XML file using the pugixml library.

Could anybody please let me know, with sample code?
Regards,
MRK

Original comment by mrk...@gmail.com on 15 Sep 2014 at 11:14

GoogleCodeExporter commented 9 years ago
Hi,
I want to know how to read and write an XML file with the pugixml library, and 
how to measure how much time the reading and writing take. Any XML file will do.
Regards,
MRK

Original comment by mrk...@gmail.com on 15 Sep 2014 at 11:21