meraki-analytics / cassiopeia

An all-inclusive Python framework for the Riot Games League of Legends API. Cass focuses on making the data easy and fun to work with, while providing all the tools necessary to create a website or do data analysis.
MIT License

Concurrent .load() sometimes throws Pycurl error 6 - Could not resolve host: na1.api.riotgames.com #219

Closed tyuo9980 closed 6 years ago

tyuo9980 commented 6 years ago
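import cassiopeia as cass
from multiprocessing.dummy import Pool  # thread-backed Pool; assumed from the "threadpool" description

def load_match(match):
    match.load()

# batch (an iterable of match IDs) and region are defined elsewhere
matchlist = []
for m_id in batch:
    match_id = int(m_id)
    matchlist.append(cass.get_match(id=match_id, region=region))

# One worker per match, so every load() call runs concurrently
pool = Pool(len(matchlist))
pool.map(load_match, matchlist)
pool.close()
pool.join()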

I'm batch loading matches by creating a thread pool and passing each Match to a function that calls load() on it. The error doesn't occur every time, only occasionally.

Not really sure what is going on here. Could be a pycurl issue or maybe a race with cass?

Edit: seems to happen when too many calls are made at the same time, but why would pycurl throw the error?
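If the error really does depend on how many calls run at once, one way to test that hypothesis is to cap the pool size and retry transient failures. The following is a minimal sketch, not part of cassiopeia: `load_with_retry` and `load_all` are hypothetical helper names, and the worker count, retry count, and delay are arbitrary assumptions. Nothing in this thread confirms that it resolves the issue.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def load_with_retry(load, retries=3, delay=0.5, retry_on=(Exception,)):
    """Call load(); on one of the listed exceptions, wait and try again."""
    for attempt in range(retries):
        try:
            return load()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: surface the original error
            time.sleep(delay * (attempt + 1))  # simple linear backoff


def load_all(loaders, max_workers=8, **retry_kwargs):
    """Run zero-argument callables on a bounded thread pool, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda f: load_with_retry(f, **retry_kwargs), loaders))
```

With the snippet above, this could be called as `load_all([m.load for m in matchlist], max_workers=8, retry_on=(pycurl.error,))`, which keeps at most eight requests in flight instead of one thread per match.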

[2018-03-05 00:11:52,744: ERROR/ForkPoolWorker-4] Task api.aggregator.tasks.aggregate_batched_matches[33434bf3-2dd4-4667-918d-f8d6421a4229] raised unexpected: error(6, 'Could not resolve host: na1.api.riotgames.com')
19:11:52 worker.1   |  Traceback (most recent call last):
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/celery/app/trace.py", line 374, in trace_task
19:11:52 worker.1   |      R = retval = fun(*args, **kwargs)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/celery/app/trace.py", line 629, in __protected_call__
19:11:52 worker.1   |      return self.run(*args, **kwargs)
19:11:52 worker.1   |    File "/Users/peterli/Projects/timewindergg/Rewind/api/aggregator/tasks.py", line 73, in aggregate_batched_matches
19:11:52 worker.1   |      pool.map(load_match, matchlist)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/multiprocessing/pool.py", line 266, in map
19:11:52 worker.1   |      return self._map_async(func, iterable, mapstar, chunksize).get()
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/multiprocessing/pool.py", line 644, in get
19:11:52 worker.1   |      raise self._value
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/multiprocessing/pool.py", line 119, in worker
19:11:52 worker.1   |      result = (True, func(*args, **kwds))
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
19:11:52 worker.1   |      return list(map(*args))
19:11:52 worker.1   |    File "/Users/peterli/Projects/timewindergg/Rewind/api/aggregator/tasks.py", line 90, in load_match
19:11:52 worker.1   |      match.load()
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/core/common.py", line 259, in load
19:11:52 worker.1   |      self.__load__()
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/core/common.py", line 270, in __load__
19:11:52 worker.1   |      self.__load__(group)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/core/common.py", line 277, in __load__
19:11:52 worker.1   |      data = configuration.settings.pipeline.get(type=self._load_types[load_group], query=query)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/datapipelines/pipelines.py", line 459, in get
19:11:52 worker.1   |      return handler.get(query, context)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/datapipelines/pipelines.py", line 185, in get
19:11:52 worker.1   |      result = self._source.get(self._source_type, deepcopy(query), context)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/datapipelines/sources.py", line 120, in get
19:11:52 worker.1   |      return source.get(type, deepcopy(query), context)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/datapipelines/sources.py", line 69, in wrapper
19:11:52 worker.1   |      return call(self, query, context=context)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/datapipelines/queries.py", line 323, in wrapped
19:11:52 worker.1   |      return method(self, query, context)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/datastores/riotapi/match.py", line 39, in get_match
19:11:52 worker.1   |      data = self._get(url, {}, self._get_rate_limiter(query["platform"], "matches/id"))
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/datastores/riotapi/common.py", line 214, in _get
19:11:52 worker.1   |      return request()
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/datastores/riotapi/common.py", line 257, in __call__
19:11:52 worker.1   |      connection=self.connection)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/datastores/common.py", line 112, in get
19:11:52 worker.1   |      status_code, body, response_headers = HTTPClient._get(url, headers, rate_limiters, connection)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/datastores/common.py", line 88, in _get
19:11:52 worker.1   |      status_code = HTTPClient._execute(curl, connection is None)
19:11:52 worker.1   |    File "/Users/peterli/miniconda3/envs/timewinder/lib/python3.6/site-packages/cassiopeia/datastores/common.py", line 36, in _execute
19:11:52 worker.1   |      curl.perform()
19:11:52 worker.1   |  pycurl.error: (6, 'Could not resolve host: na1.api.riotgames.com')

jjmaldonis commented 6 years ago

First, thanks for the code to reproduce the issue. I wasn't able to reproduce it, but at least I know what code you are using.

Some quick googling suggests this may be fixable with an ethernet configuration change on your Linux box. Can you do some googling yourself and see whether those solutions fix your problem? If not, report back with more details about what's going on. This may be something you need to bring up with the pycurl people (if it hasn't been brought up already; it seems fairly common), but until you dig into it some more and figure out what's going on, there isn't much we can do about it.
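One way to dig further is to check whether hostname resolution itself fails under concurrent load, independently of pycurl and cass. The sketch below is only a diagnostic under stated assumptions: `count_dns_failures` is a hypothetical helper, the attempt and worker counts are arbitrary, and it uses the system resolver through `socket.getaddrinfo`, which may not be the resolver your pycurl build uses.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def count_dns_failures(host, attempts=100, max_workers=50, resolve=socket.getaddrinfo):
    """Resolve `host` many times in parallel and count the failures."""
    def try_once(_):
        try:
            resolve(host, 443)
            return 0
        except OSError:  # socket.gaierror is a subclass of OSError
            return 1

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(try_once, range(attempts)))
```

Calling `count_dns_failures("na1.api.riotgames.com")` with increasing `max_workers` would show whether lookups start failing as concurrency rises. If they do, the problem likely sits in the machine's resolver or network configuration rather than in cass.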