S1M0N38 / soccerapi

soccerapi, an unambitious soccer odds scraper ⚽️
MIT License

Kambi Varnish Service Blocking IP Addresses #34

Open N-Vlahovic opened 3 years ago

N-Vlahovic commented 3 years ago

Hi all,

I am experiencing the following error

from soccerapi.api import Api888Sport
api = Api888Sport()
url = 'https://www.888sport.com/#/filter/football/italy/serie_a'
odds = api.odds(url)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/soccerapi/api/base.py", line 28, in odds
    odds_to_parse = self.requests(self.url_to_competition(url))
  File "/usr/local/lib/python3.9/site-packages/soccerapi/api/888sport.py", line 40, in requests
    'full_time_result': self._request(competition, 12579),
  File "/usr/local/lib/python3.9/site-packages/soccerapi/api/888sport.py", line 65, in _request
    return self.session.get(url, params=params).json()
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 900, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I am pretty sure this is due to Kambi or Varnish blocking IP addresses after a while.

Here is an even simpler example highlighting what happens:

import requests
from typing import Dict

url: str = 'https://eu-offering.kambicdn.org/offering/v2018/888de/listView/football/germany/bundesliga.json?lang=de_DE&market=DE&client_id=2&channel_id=1&ncid=1617390885604&useCombined=true'
headers: Dict = {'Host': 'eu-offering.kambicdn.org', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0', 'Accept': 'application/json, text/javascript, */*; q=0.01', 'Accept-Encoding': 'gzip, deflate, br', 'Referer': 'https://www.888sport.com/de/fussball/', 'Origin': 'https://www.888sport.com', 'Connection': 'keep-alive'}
timeout: int = 15

response = requests.get(url, headers=headers, timeout=timeout)
print(response.text)
<!DOCTYPE html>
<html>
  <head>
    <title>410 Gone.</title>
  </head>
  <body>
    <h1>Error 410 Gone.</h1>
    <p>Gone.</p>
    <h3>Guru Meditation:</h3>
    <p>XID: 839526155</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>

The error occurs on a server located in DE. I can confirm that other servers (also in DE) which have not been used to fetch Kambi data do not have this issue. This leads me to believe that it must be due to some sort of restriction or blocking.
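To make the block visible instead of surfacing as a bare JSONDecodeError, the response could be checked before decoding. The following is only a minimal sketch: `decode_or_raise` and `BlockedError` are hypothetical names that are not part of soccerapi, and it assumes a healthy Kambi reply has status 200 and a Content-Type containing "json".

```python
import json


class BlockedError(RuntimeError):
    """Raised when the reply looks like an error page rather than JSON."""


def decode_or_raise(status_code, content_type, body):
    """Decode a JSON body, or raise BlockedError with a readable reason.

    Hypothetical helper, not part of soccerapi. Assumes a healthy reply
    has status 200 and a Content-Type that mentions "json".
    """
    if status_code != 200 or "json" not in content_type.lower():
        # The HTML error page shown above ends with this marker text.
        is_varnish = "Varnish cache server" in body
        hint = " (Varnish error page)" if is_varnish else ""
        raise BlockedError(
            f"HTTP {status_code}, Content-Type {content_type!r}{hint}"
        )
    return json.loads(body)
```

With requests, it could be called as `decode_or_raise(response.status_code, response.headers.get('Content-Type', ''), response.text)`, so a blocked server reports the 410 directly rather than failing inside the JSON decoder.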

Does anyone have any ideas on how to tackle this issue? The only things I can think of are a proxy, IP rotation, or other such measures. Maybe someone has a better idea.
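For reference, the proxy rotation idea could look roughly like the sketch below. It is untested against Kambi and makes no claim that rotation avoids the block. `fetch_with_rotation` is a hypothetical helper, and `fetch` is an assumed caller-supplied function that takes a proxy URL and returns a `(status_code, body)` tuple.

```python
from itertools import cycle


def fetch_with_rotation(fetch, proxies, max_attempts=3):
    """Try proxies in round-robin order until one returns HTTP 200.

    Hypothetical helper. `fetch(proxy)` is assumed to return a
    (status_code, body) tuple for the request made through `proxy`.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    pool = cycle(proxies)
    last_status = None
    for _ in range(max_attempts):
        proxy = next(pool)
        status, body = fetch(proxy)
        if status == 200:
            return proxy, body
        # Anything else (e.g. the 410 above) counts as blocked; move on.
        last_status = status
    raise RuntimeError(
        f"all {max_attempts} attempts failed, last status {last_status}"
    )
```

With requests, `fetch` could wrap `requests.get(url, headers=headers, timeout=timeout, proxies={'http': proxy, 'https': proxy})` and return `(response.status_code, response.text)`. The proxy URLs themselves would have to come from a provider of your choice.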

artesea commented 3 years ago

The IP address has been blacklisted. I'm not sure whether they have a master list that your IP happens to be on (most Azure IPs fail to work) or whether you hit a limit and were added to it. The only option is to use a different IP. I have never seen one released.