fuddl / wd

a browser extension for wikidata
https://wikidata.org/wiki/Wikidata:Tools/Wikidata_for_Firefox
GNU General Public License v3.0

Could we have a way to debug when wd4w doesn't recognise a URL? #123

Open backache opened 1 year ago

backache commented 1 year ago

It is not recognising the URLs of ISNI IDs such as the following, despite the URL format being described several different ways in Property:P213:

https://isni.oclc.org/xslt/DB=1.2//CMD?ACT=SRCH&IKT=8006&TRM=ISN%3A0000%200001%201768%20497X&COOKIE=U51,KENDUSER,I28,B0028++++++,SY,NISNI,D1.2,Eb21597e5-24,A,H1,,3-28,,30-41,,43-59,,65-70,,74-75,R81.132.242.131,FY

Given that, according to regex101, the patterns should match this URL, it'd be nice to have a way to understand/debug why wd4w isn't recognising it, or at least a list of things to check.

fuddl commented 1 year ago

I could just log everything but I'm not sure if it would be helpful. There might be a problem with the replacement pattern containing spaces.

fuddl commented 1 year ago

The misunderstanding here is that the resolver always matches against an unencoded URL (` ` instead of `%20`, `:` instead of `%3A`). And I couldn't figure out a convenient way to get the actual URL you need to write a regular expression for. Maybe I should provide it? Let me think about it.
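To make the difference concrete, here is a minimal Python sketch of what "unencoded" means for the ISNI URL above. It is only an illustration: the extension itself is not written in Python, the helper name `url_as_resolver_sees_it` is made up, and it assumes the resolver's decoding behaves like ordinary percent-decoding. The `COOKIE` parameter is left off for brevity.

```python
from urllib.parse import unquote


def url_as_resolver_sees_it(url: str) -> str:
    """Hypothetical helper: percent-decode a URL the way the resolver is assumed to."""
    return unquote(url)


# URL as copied from the address bar (COOKIE parameter omitted for brevity).
encoded = (
    "https://isni.oclc.org/xslt/DB=1.2//CMD?ACT=SRCH&IKT=8006"
    "&TRM=ISN%3A0000%200001%201768%20497X"
)

# This decoded form is the string a pattern has to be written against.
decoded = url_as_resolver_sees_it(encoded)
print(decoded)
```

Pasting the decoded string into regex101, rather than the address-bar version, should give a result closer to what the extension does.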

fuddl commented 1 year ago

We should probably check all these patterns: https://w.wiki/6Me7

and these https://www.wikidata.org/wiki/Special:WhatLinksHere?target=Q108538446&namespace=120 💀

backache commented 1 year ago

> the resolver always matches against an unencoded url (` ` instead of `%20`, `:` instead of `%3A`).

You're right. I have swapped out the encoded colon for `\:` and the encoded spaces for `\s`, and it works.
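For anyone else hitting this, a rough Python sketch of that swap follows. Both patterns are illustrative stand-ins, not the actual expressions stored on Property:P213, and the decoding step assumes the resolver percent-decodes the URL before matching.

```python
import re
from urllib.parse import unquote

# Illustrative patterns only; the real ones live on Property:P213.
# Written against the address-bar form (%3A, %20):
ENCODED_STYLE = re.compile(r"TRM=ISN%3A(\d{4})%20(\d{4})%20(\d{4})%20(\d{3}[\dX])")
# Written against the decoded form (\: and \s):
DECODED_STYLE = re.compile(r"TRM=ISN\:(\d{4})\s(\d{4})\s(\d{4})\s(\d{3}[\dX])")

address_bar_url = (
    "https://isni.oclc.org/xslt/DB=1.2//CMD?ACT=SRCH&IKT=8006"
    "&TRM=ISN%3A0000%200001%201768%20497X"
)

# Assumption: the resolver matches against the percent-decoded URL.
resolver_input = unquote(address_bar_url)

encoded_hit = ENCODED_STYLE.search(resolver_input)  # no match on decoded input
decoded_hit = DECODED_STYLE.search(resolver_input)  # matches

# Join the four captured blocks into the 16-character ISNI.
isni = "".join(decoded_hit.groups()) if decoded_hit else None
print(encoded_hit is None, isni)
```

The encoded-style pattern still matches the raw address-bar string, which is why a test on regex101 against the copied URL can pass while the extension finds nothing.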

So we need to make that super clear in the documentation, as your query shows I am neither the first nor the last to fall into that trap.

Then we need to fix those URLs you found. I have tried fixing a couple and they are a bit of a nightmare.