Long story short
I am losing my mind over a simple problem. I maintain a website that fetches game stats. There are 25000 clans with 50 players in each clan. I have written a script that fetches each clan and updates all stats for all players. However, after 7 hours, I run out of memory and the process is restarted. I make about 60 HTTPS requests per second.
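For scale, here is a rough back-of-the-envelope calculation of the workload. It is only a sketch: it assumes every clan is full, that each player costs exactly one request, and that the request rate is constant. The constant names are made up for illustration.

```python
# Figures taken from the description above; the names are illustrative.
CLANS = 25000
PLAYERS_PER_CLAN = 50       # assumes every clan is full
REQUESTS_PER_SECOND = 60    # assumes a constant request rate

# One request per player (the per-clan requests are ignored here).
total_requests = CLANS * PLAYERS_PER_CLAN
hours_per_pass = total_requests / REQUESTS_PER_SECOND / 3600

print(total_requests, round(hours_per_pass, 1))  # prints: 1250000 5.8
```

Under those assumptions, the crash after 7 hours would come a little after one full pass over all clans.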
Expected behavior
For the memory to be static over time.
Actual behavior
Instead, memory usage behaves like a waterfall and the process is eventually killed. I also have mongo on the same server, which sometimes gets killed too.
Steps to reproduce
This is my code that does the actual work.
import asyncio
import logging
import os
from urllib.parse import quote

import aiohttp
import requests

API_TOKEN = os.getenv('API_TOKEN')
HEADERS = {'authorization': 'Bearer ' + API_TOKEN}

logger = logging.getLogger(__name__)


class ClanNotFound(Exception):
    pass


async def __fetch(url, session):
    async with session.get(url) as response:
        return await response.json()


def __get_all(urls):
    loop = asyncio.get_event_loop()
    jar = aiohttp.DummyCookieJar()
    with aiohttp.ClientSession(loop=loop, cookie_jar=jar, headers=HEADERS) as session:
        futures = [__fetch(url, session) for url in urls]
        responses = loop.run_until_complete(asyncio.gather(*futures))
        return responses


def fetch_all_players(clan):
    logger.info(f"Fetching all player stats for {clan['tag']}.")
    tags = [member['tag'] for member in clan['memberList']]
    urls = ['https://api.clashofclans.com/v1/players/' + quote(tag) for tag in tags]
    return __get_all(urls)
Your environment
I am running Python 3.6.4 in Docker on DigitalOcean with 2GB of RAM.
I am using the aiohttp client. My Dockerfile is here.
I don't know what else to try. I saw some memory-related issues regarding HTTPS, but it seems those have been fixed.
How should I go about debugging this?
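One possible starting point, sketched under assumptions rather than offered as a diagnosis: Python's built-in tracemalloc module can compare two snapshots and report which source lines account for the growth in between. In the sketch below, `process_batch` and `top_growth` are hypothetical names; `process_batch` deliberately retains objects to simulate a leak and stands in for one round of clan updates, not for the real code.

```python
import tracemalloc

_retained = []  # objects kept alive on purpose, to simulate a leak


def process_batch(n):
    """Hypothetical stand-in for one round of clan updates."""
    _retained.extend(bytearray(1024) for _ in range(n))


def top_growth(before, after, limit=5):
    """Return the source lines whose allocations changed most between snapshots."""
    return after.compare_to(before, "lineno")[:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()
process_batch(1000)
after = tracemalloc.take_snapshot()

stats = top_growth(before, after)
for stat in stats:
    # Each entry shows file, line number, and the size difference.
    print(stat)

tracemalloc.stop()
```

In a long-running script, the same comparison could be taken every few hundred clans. If the top entries point into a library rather than the script itself, that narrows down where to look next. Note that tracemalloc only sees memory allocated through Python's allocators, so growth inside C extensions may not show up.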