Closed m-appel closed 8 months ago
i had modifiied the code and provided a snipped for fac.py can you review it , is that right or wrong?
import argparse
import json
import logging
import os
import sys
from datetime import timedelta # Added import
import flatdict
import iso3166
import requests_cache
from iyp import BaseCrawler
from iyp.crawlers.peeringdb.ix import handle_social_media
# ... (existing code)
# Add this block to set up CachedSession with expire_after
expire_after_days = 6
cache_expire_time = timedelta(days=expire_after_days)
session = requests_cache.CachedSession(expire_after=cache_expire_time)
class Crawler(BaseCrawler):
def _init_(self, organization, url, name):
"""Initialization for pushing PeeringDB facilities to IYP."""
self.headers = {'Authorization': 'Api-Key ' + API_KEY}
self.requests = session # Use the session with expire_after
super()._init_(organization, url, name)
# ... (existing code)
@m-appel
This looks almost correct, but I would place your code block inside the __init__
function to not create a global variable.
And (personally) I would not create two variables for the number of days and the timedelta since expire_after=timedelta(days=6)
already seems expressive enough for me, but that's just personal taste.
@m-appel thankyou for the review 😊, next will keep in mind
In places where we use
requests_cache.CachedSession
(currently only in the PeeringDB crawlers), we should add theexpire_after
parameter. By default, responses are cached indefinitely (see here), so if we do not properly delete the cache between our weekly iterations, we are actually always recycling the first result.I guess a suitable value would be six days, so that we can rerun the crawler(s) if necessary, but are guaranteed to get fresh data in our weekly interval. This can be achieved with
timedelta
: