crawlserv / crawlservpp

crawlserv++: Application for crawling and analyzing textual content of websites.

Server crashes on XPath query returning numerical result #164

crawlserv closed this issue 1 year ago

crawlserv commented 1 year ago

The server crashes with the following assertion failure after running any XPath query that uses count() and returns the numeric result directly, e.g., count(//*) on any (non-empty) text:

crawlserv: ./src/pugixml.cpp:8303: void pugi::impl::{anonymous}::convert_number_to_mantissa_exponent(double, char (&)[32], char**, int*): Assertion `mantissa[0] != '0' && mantissa[1] == '.'' failed.

Comparing the result with a number, as in count(//*)>0, which returns a boolean value, works fine.

crawlserv commented 1 year ago

- seems to happen due to an unrelated call to TidyCreate beforehand
- platform-dependent
- only occurs if the library was not compiled with NDEBUG set

crawlserv commented 1 year ago

Caused by a bug in pugixml (see https://github.com/zeux/pugixml/issues/574) and a bug in tidy-html5 before 5.7.18 (see https://github.com/htacg/tidy-html5/issues/770).

Workaround: set the language used by tidy manually to avoid the bug in tidy-html5.