magnusmanske / petscan_rs

The repo for the PetScan tool
https://petscan.wmflabs.org/
GNU General Public License v3.0

PetScan changes part of SPARQL query to HTML-coding when executing a run, making it impossible to make a rerun without repairing the query #137

Closed: eksral closed this issue 5 months ago

eksral commented 1 year ago

Steps to replicate the issue (include links if applicable):

Enter a SPARQL query containing "special characters" such as < (less than) and > (greater than) in the SPARQL box in the "Other sources" part of the PetScan input panel. (See example below.)

Run the PetScan by clicking the "Do it" button and enjoy the result.

Try to rerun the PetScan by clicking the "Do it" button again, possibly after having changed some parameters.

What happens?: The second run fails and gives 0 results. During the first run, PetScan replaced the "special characters" in the SPARQL query box with their respective HTML codes, such as &lt;, and PetScan cannot handle these in subsequent runs.

What should have happened instead?: The first run should not modify the content of the SPARQL box in a way that makes it unusable for subsequent runs.

Software version (skip for WMF-hosted wikis like Wikipedia): Current version of PetScan (where can I find the version number?)

Other information (browser name/version, screenshots, etc.): Latest web browser versions (Mozilla Firefox version 114.0.2 and Microsoft Edge version 114.0.1823.58)

Here is an example of how the SPARQL box of the input panel is corrupted after an otherwise successful run: https://petscan.wmflabs.org/?psid=25242042
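The symptom described above can be reproduced in isolation. The following is a minimal Python sketch, not PetScan's actual code (PetScan is written in Rust, and where the escaping really happens is not stated in this issue). The query string and the function names `roundtrip_buggy` and `repair` are made up for illustration. It only shows why a query that has been HTML-escaped once no longer works when resubmitted, and how unescaping restores it.

```python
import html

# Hypothetical query for illustration; not the query from the linked psid.
query = 'SELECT ?item WHERE { ?item wdt:P569 ?d . FILTER(?d < "1900-01-01"^^xsd:dateTime) }'


def roundtrip_buggy(text: str) -> str:
    """Mimic the reported symptom: the form content comes back HTML-escaped.

    Assumption: this stands in for whatever step in the tool writes the
    escaped text back into the SPARQL box.
    """
    return html.escape(text, quote=False)


def repair(text: str) -> str:
    """Undo one level of HTML escaping, as a user currently has to do by hand."""
    return html.unescape(text)


after_first_run = roundtrip_buggy(query)

# The "<" is now "&lt;", which is not valid SPARQL in that position.
print(after_first_run)

# Unescaping gives back the original, runnable query.
print(repair(after_first_run) == query)
```

If the box content were kept as raw text and escaped only when it is rendered into HTML, the second run would receive the same query as the first.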

MB-one commented 7 months ago

I have the same problem in non-SPARQL queries if a special character is included in a category name:

If I use a category name that includes non-standard characters (e.g. ampersands), each such character is replaced by its respective HTML code (starting with an ampersand) when the query runs. If this isn't manually reverted in the "categories" field(s), the ampersand within the HTML code is itself replaced with "&amp;" on the next run, and the query fails.

Is there any way to distinguish between an ampersand being part of the category name and an ampersand within html code?
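To illustrate the question, here is a small Python sketch of the double-escaping effect, using a made-up category name. It is not taken from PetScan's code, and `escape_once` and `unescape_fully` are hypothetical helper names. It suggests that, once text has been escaped, the two kinds of ampersand cannot be told apart reliably. Repeated unescaping works as a heuristic but would also alter a name that literally contains the text "&amp;".

```python
import html


def escape_once(raw: str) -> str:
    """Escape &, < and > once, as a stand-in for what the tool appears to do."""
    return html.escape(raw, quote=False)


def unescape_fully(text: str, max_rounds: int = 5) -> str:
    """Heuristic repair: unescape until the text stops changing.

    Lossy by design: a name that really contains "&amp;" is changed too.
    """
    for _ in range(max_rounds):
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
    return text


name = "Arts & Crafts"      # hypothetical category name
once = escape_once(name)    # state of the field after the first run
twice = escape_once(once)   # state of the field after the second run

print(once)
print(twice)
print(unescape_fully(twice) == name)
```

The more robust approach is usually to avoid the ambiguity altogether: keep the raw category name as the stored value and escape it exactly once, at the point where it is written into HTML.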