Closed by bngksgl 3 years ago
Hi @bngksgl I asked some people involved in @wikidata and it looks like there is an issue with the SPARQL endpoint. Hopefully it will be solved soon.
Hi, thank you for the feedback @dayures. I just checked again today and it's still not working. Did you hear back from the people at @Wikidata?
Hi @bngksgl I haven't received any feedback. I tested the query today and it returns a timeout error (not a 403).
BTW, could you double-check your query? Maybe it could be reorganized so that it doesn't overload the server. Also, "?item" appears in the query. Is that OK, or should it be "?business"?
For instance, this query (that is part of the query that you shared), already returns +11K results.
Maybe we need to set the UserAgent? It seems the sparqlwrapper UserAgent may be blocked on their end. https://lists.wikimedia.org/pipermail/wikidata/2019-July/013247.html
@chicocvenancio @ookgezellig thanks for the feedback! Do you know if it is possible to see the list of blacklisted user agents somehow?
@dayures thanks for your comment! I changed my IP address, and it seems to work now. I think the problem was that my IP address was blocked by their servers because of excessive querying. I have therefore reworked the query to put less load on the servers.
1) If one cannot easily change IP address, do you know how long it takes for an IP to be removed from the blacklist? 2) How do you get onto this blacklist? Is it based on the number of requests or the number of returned rows? Does anyone know?
You shouldn’t change IP address at all. Rate limits are per client, which is defined as IP address + user agent, so what you should do is set a good user agent in accordance with the User-Agent policy. The limits are also explained here – basically, you get 60 seconds of query runtime per 60 seconds of real time. (In other words, you can briefly run queries in parallel, but not continuously.) If you get an HTTP 429 Too Many Requests error from the server, stop sending queries altogether until the time specified in the Retry-After response header; if you fail to do that, your client will be banned for 24 hours.
The amount of data returned has no impact, as far as I’m aware, at least as long as you don’t cause timeout errors. (If you do cause errors, there’s a limit of 30 errors per minute.) And I’m not aware of any ban longer than these 24 hours.
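The 429 handling described above could be sketched roughly as follows. This is only an illustration, not part of SPARQLWrapper: the helper names `retry_after_seconds` and `run_with_backoff`, the 60-second fallback, and the `send_query` callable (assumed to return a status code, a headers dict, and a body) are all hypothetical.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None, default=60.0):
    """Turn a Retry-After header value into a number of seconds to wait.

    The header may hold either a number of seconds or an HTTP date.
    `default` is an assumed fallback for a missing or unparsable header.
    """
    if header_value is None:
        return default
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        target = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if target.tzinfo is None:
        # Treat a date without zone information as UTC.
        target = target.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (target - now).total_seconds())


def run_with_backoff(send_query, max_attempts=3, sleep=time.sleep):
    """Call send_query(), pausing as instructed whenever the server answers 429.

    send_query is a hypothetical callable returning (status, headers, body).
    No other query is sent while waiting.
    """
    for _ in range(max_attempts):
        status, headers, body = send_query()
        if status != 429:
            return status, body
        sleep(retry_after_seconds(headers.get("Retry-After")))
    raise RuntimeError("still rate-limited after %d attempts" % max_attempts)
```

The important part is that the client stays completely silent for the whole wait; retrying earlier is what leads to the 24-hour ban mentioned above.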
Edit: To set the User-Agent, pass the agent parameter into the SPARQLWrapper constructor, for example:
from SPARQLWrapper import SPARQLWrapper
wrapper = SPARQLWrapper('https://query.wikidata.org/sparql',
                        agent='example-UA (https://example.com/; mail@example.com)')
(Note that SPARQLWrapper2 is missing this parameter, see #162.)
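For anyone querying the endpoint without SPARQLWrapper, the same idea can be sketched with the standard library alone. This is an assumption-laden illustration: `build_request` is a hypothetical helper, the tool name, URL and e-mail are placeholders, and the User-Agent format simply mirrors the example string above.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query, tool, url, email):
    """Build a GET request that identifies the client in its User-Agent.

    The "tool (url; email)" layout copies the example agent string above;
    it is a single plain string, not a dict of browser headers.
    """
    user_agent = f"{tool} ({url}; {email})"
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{ENDPOINT}?{params}",
        headers={
            "User-Agent": user_agent,
            "Accept": "application/sparql-results+json",
        },
    )


# Placeholder values; replace them with your own tool name and contact details.
request = build_request(
    "SELECT ?item WHERE { ?item wdt:P31 wd:Q146 } LIMIT 1",
    tool="example-UA",
    url="https://example.com/",
    email="mail@example.com",
)
# urllib.request.urlopen(request) would send it; that call is left out here.
```

Note that the agent is one descriptive string. Copying a browser's User-Agent, or passing a whole dict of headers as the agent, does not identify your client.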
Thanks for contributing to this issue. As it has been more than 90 days since the last activity, we are automatically closing the issue. This is often because the request was already solved in some way and it just wasn't updated or it's no longer applicable. If that's not the case, please do feel free to either reopen this issue or open a new one. We'll gladly take a look again!
Hi,
I am trying to use SPARQLWrapper in Python to query Wikidata. Last week my code was working without a problem, but today I am receiving an 'HTTPError: HTTP Error 403: Forbidden' error. I also tried using requests, and I am still getting the same error. How can I overcome this issue? Below you may find the code I am using and the error.
Code:
agent_ = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) '
                        'AppleWebKit/537.11 (KHTML, like Gecko) '
                        'Chrome/23.0.1271.64 Safari/537.11',
          'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
          'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
          'Accept-Encoding': 'none',
          'Accept-Language': 'en-US,en;q=0.8',
          'Connection': 'keep-alive'}
sparql = SPARQLWrapper("https://query.wikidata.org/sparql", agent=agent_)
sparql.setQuery("""SELECT * {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  { SELECT ?business ?businessLabel ?altLabel WHERE {
an entity has a key ID and a usage count
  } UNION { SELECT ?business ?businessLabel ?altLabel WHERE {
an entity has a key ID and a usage count
  }
}""")
sparql.setReturnFormat(JSON)
data = sparql.query().convert()
Error:
HTTPError Traceback (most recent call last)