fuddl / wd

a browser extension for wikidata
https://wikidata.org/wiki/Wikidata:Tools/Wikidata_for_Firefox
GNU General Public License v3.0

Website resolver usually fails #86

Closed fuddl closed 1 year ago

fuddl commented 2 years ago

https://www.saint-ouen.fr/ should resolve to https://www.wikidata.org/wiki/Q208889 but doesn't

from #84

fuddl commented 2 years ago

@teolemon in this particular scenario it fails because Q208889 had the URL with http while the website uses https 😅

teolemon commented 2 years ago

Yes, that happens with so many stored URLs. It's probably worth handling in the code (e.g. stripping prefixes if it's just for detection?)

teolemon commented 2 years ago

Ah, you created an issue for it. Sorry, I should have explicitly pointed to it in the other issue

teolemon commented 2 years ago

Sorry if I cost you time scratching your head

fuddl commented 2 years ago

nah, it's fine 🤷

the query already looks like this, we could add another dimension.

SELECT ?item {
  {
    ?item wdt:P953 <https://www.saint-ouen.fr/>.
  } UNION {
    ?item wdt:P973 <https://www.saint-ouen.fr/>.
  } UNION {
    ?item wdt:P856 <https://www.saint-ouen.fr/>.
  } UNION {
    ?item wdt:P2699 <https://www.saint-ouen.fr/>.
  } UNION {
    ?item wdt:P953 <https://www.saint-ouen.fr>.
  } UNION {
    ?item wdt:P973 <https://www.saint-ouen.fr>.
  } UNION {
    ?item wdt:P856 <https://www.saint-ouen.fr>.
  } UNION {
    ?item wdt:P2699 <https://www.saint-ouen.fr>.
  }
}

don't know if it is possible to have wildcards in urls though
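One way to add the extra "dimension" without wildcards might be to enumerate the scheme and trailing-slash variants up front and pass them all to the query at once. The following is only a rough sketch in Python, not the extension's actual code: `url_variants` and `build_query` are hypothetical names, the property list is copied from the query above, and the `VALUES` plus property-path form is just one possible way to write it. It also assumes the URL contains no characters that would need escaping inside a SPARQL IRI.

```python
from urllib.parse import urlsplit, urlunsplit

# Properties copied from the query above.
URL_PROPERTIES = ("P953", "P973", "P856", "P2699")


def url_variants(url):
    """Return https/http variants of url, each with and without a trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    variants = []
    for scheme in ("https", "http"):
        for candidate_path in (path + "/", path):
            # The fragment is dropped on purpose; the query string is kept.
            variants.append(
                urlunsplit((scheme, parts.netloc, candidate_path, parts.query, ""))
            )
    return variants


def build_query(url):
    """Build one SPARQL query that checks every variant against every property."""
    values = " ".join(f"<{variant}>" for variant in url_variants(url))
    paths = "|".join(f"wdt:{prop}" for prop in URL_PROPERTIES)
    return f"SELECT ?item {{ VALUES ?url {{ {values} }} ?item {paths} ?url. }}"


if __name__ == "__main__":
    print(build_query("https://www.saint-ouen.fr/"))
```

With this shape, supporting another variant means adding one more entry to the list rather than another four `UNION` branches.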

derenrich commented 1 year ago

I just ran into this when playing with the extension. https://www.hansonrobotics.com/ should resolve to https://www.wikidata.org/wiki/Q48999902

I think you can make this work without using SPARQL maybe with wbsearchentities which will more flexibly match

fuddl commented 1 year ago

> I just ran into this when playing with the extension. https://www.hansonrobotics.com/ should resolve to https://www.wikidata.org/wiki/Q48999902

Right now it does. One issue is that the extension doesn't remember the edit it made right away. The resolver only kicks in when the SPARQL API gives the correct answer. This should be easy to fix though (by adding it to the internal cache).

> I think you can make this work without using SPARQL maybe with wbsearchentities which will more flexibly match

Can you show me an example of how to do that? Or is it documented somewhere?

derenrich commented 1 year ago

Yeah I figured there was a caching issue with that edit but it should've worked before the edit by fuzzy matching on https (as you described above).

API docs are at https://www.wikidata.org/w/api.php?action=help&modules=wbsearchentities
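For illustration, a request to that module could be assembled roughly like this. This is a hedged sketch in Python, not something taken from the extension: `search_request_url` is a hypothetical helper, and searching for the bare hostname is an assumption. As far as I know, `wbsearchentities` matches on labels and aliases, so whether it actually finds an item by its website URL would need to be tested.

```python
from urllib.parse import urlencode, urlsplit

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def search_request_url(website, language="en", limit=5):
    """Build a wbsearchentities request URL for the hostname of a website."""
    # Assumption: the hostname is used as the search term.
    hostname = urlsplit(website).netloc
    params = {
        "action": "wbsearchentities",
        "search": hostname,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(search_request_url("https://www.hansonrobotics.com/"))
```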

fuddl commented 1 year ago

I'll try to fix the cache first
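The idea of remembering an edit locally, so the resolver does not have to wait for the SPARQL API to catch up, can be sketched as below. This is a simplified illustration in Python under stated assumptions, not the extension's implementation: `ResolverCache`, `remember`, and `resolve` are hypothetical names, and `remote_lookup` stands in for whatever function performs the SPARQL query.

```python
class ResolverCache:
    """Map URLs to item IDs and consult the map before any remote lookup."""

    def __init__(self):
        self._items = {}

    @staticmethod
    def _key(url):
        # Ignore a trailing slash so both spellings share one cache entry.
        return url.rstrip("/")

    def remember(self, url, item_id):
        """Record a connection the extension has just made itself."""
        self._items[self._key(url)] = item_id

    def resolve(self, url, remote_lookup):
        """Return the cached item ID, falling back to remote_lookup(url)."""
        key = self._key(url)
        if key in self._items:
            return self._items[key]
        item_id = remote_lookup(url)
        if item_id is not None:
            self._items[key] = item_id
        return item_id


if __name__ == "__main__":
    cache = ResolverCache()
    cache.remember("https://www.hansonrobotics.com/", "Q48999902")
    # The remote lookup is never reached for a URL that was just edited.
    print(cache.resolve("https://www.hansonrobotics.com", lambda url: None))
```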

fuddl commented 1 year ago

in version .273 the resolver should be a little more reliable.

@teolemon @derenrich You can try this by adding 'official website' or 'described by url' to an existing item and then reloading or changing tabs. The extension should now be able to find the previously connected item instantly. I also fixed a scenario where the website URL does not end in a /.

I'll close this ticket now, since unfortunately it is not very well defined. Feel free to open a new one if issues occur 🙏. Please try to describe what behaviour you expect and what is happening instead. The URL resolver is wacky by nature, a URL is not an ID 😭