dgtlmoon / changedetection.io

The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Restock Monitor, change detection. Designed for simplicity - Simply monitor which websites had a text change for free. Free Open source web page change detection, Website defacement monitoring, Price change notification
Apache License 2.0
15.85k stars 885 forks

when using Xpath selectors, loses the text encoding #2359

Closed rmichelena closed 1 month ago

rmichelena commented 1 month ago

I have some pages in which I'm now using Xpath selectors to extract: //tr[contains(@class, 'ui-datatable-even') or contains(@class, 'ui-datatable-odd')]/td[position() > 1 and not(position() > last() - 2)]

before I was using Xpath, the extracted text looked like this, either without filters or with CSS class selectors: SERVICIO DE SUPERVISIÓN

now it looks like this: SERVICIO DE SUPERVISIÓN

However, the HTML file does not seem to include a "meta" tag, so I guess Changedetection makes some assumption about the encoding when using no filters or CSS, and does not make it when using XPath.

Attachment: seace busqueda.html.txt
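For readers unfamiliar with this kind of garbling, here is a minimal sketch of one common way it arises: UTF-8 bytes decoded with a single-byte charset. The choice of `cp1252` as the wrong charset is an assumption for illustration; the thread does not say which decoding step inside changedetection.io actually goes wrong.

```python
# Illustration only: reproduces the general kind of garbling described above.
# It does not call changedetection.io or its XPath library.
original = "SERVICIO DE SUPERVISIÓN"

# The page is served as UTF-8 bytes ("Ó" becomes the two bytes C3 93).
raw_bytes = original.encode("utf-8")

# Assumption: something downstream decodes those bytes as cp1252 instead.
garbled = raw_bytes.decode("cp1252")

# Reversing the wrong step recovers the text, which shows no data was lost.
repaired = garbled.encode("cp1252").decode("utf-8")

print(garbled)
print(repaired)
```

If the garbled output on a real page follows this pattern (each accented letter replaced by two odd characters), a mismatched charset is the likely cause rather than a problem with the selector itself.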

dgtlmoon commented 1 month ago

https://github.com/dgtlmoon/changedetection.io/wiki/CSS-Selector-help#xpath-and-non-latin-text-getting-garbled :)

Duplicate #1546

dgtlmoon commented 1 month ago

It's actually because the XPath library we use doesn't support it, but you can easily convert your XPath to CSS.
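As a rough sketch of that conversion for the XPath in the original report: the helper names `build_css_selector` and `kept_cells` are hypothetical and not part of changedetection.io. The selector assumes a CSS engine that supports `:nth-of-type()`, `:nth-last-of-type()` and `:not()`, and it has not been checked against the attached page.

```python
# Hypothetical helpers for illustration; not part of changedetection.io.
ROW_CLASSES = ("ui-datatable-even", "ui-datatable-odd")

# td[position() > 1 and not(position() > last() - 2)]
# i.e. skip the first <td> and the last two <td>s in each row.
CELL_FILTER = "td:nth-of-type(n+2):not(:nth-last-of-type(-n+2))"


def build_css_selector(row_classes=ROW_CLASSES, cell_filter=CELL_FILTER):
    """Build a CSS selector meant to mirror the XPath from the report."""
    # [class*="..."] is a substring match, like XPath contains(@class, ...).
    # The ">" child combinator mirrors the "tr/td" child step.
    parts = [f'tr[class*="{name}"] > {cell_filter}' for name in row_classes]
    return ", ".join(parts)


def kept_cells(cells):
    """Pure-Python model of the predicate: drop the first and the last two cells."""
    return cells[1:-2]


css_selector = build_css_selector()
print(css_selector)
print(kept_cells(["a", "b", "c", "d", "e"]))
```

The printed selector can be pasted into the CSS filter field in place of the XPath. `kept_cells` is only there to make it easy to sanity-check which columns the predicate keeps.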