brunoamaral / gregory-ai

Artificial Intelligence and Machine Learning to help find scientific research and filter relevant content
https://gregory-ai.com/

naive datetime warning #274

Closed brunoamaral closed 1 year ago

brunoamaral commented 2 years ago

/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py:1043: InsecureRequestWarning: Unverified HTTPS request is being made to host 'www.clinicaltrialsregister.eu'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings

brunoamaral commented 2 years ago

Partially fixed in branch.

The first warning was for the discovery_date field. The second warning refers to the published_date value that we get from the RSS feed.

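import feedparser
import requests
from dateutil.parser import parse  # assumed: parse() below is dateutil's parser
from django.utils import timezone

# Sources, Trials and remove_utm come from the project itself.
sources = Sources.objects.filter(method='rss', source_for='trials')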

for i in sources:
    source_name = i.name
    source_for = i.source_for
    link = i.link
    d = None
    if i.ignore_ssl == False:
        d = feedparser.parse(link)
    else:
        response = requests.get(link, verify=False)
        d = feedparser.parse(response.content)
    for entry in d['entries']:
        summary = ''
        if hasattr(entry,'summary_detail'):
            summary = entry['summary_detail']['value']
        if hasattr(entry,'summary'):
            summary = entry['summary']
        published = entry.get('published')
        if published:
            published = parse(entry['published'])
        link = remove_utm(entry['link'])
        try:
            Trials.objects.create(
                discovery_date=timezone.now(),
                title=entry['title'],
                summary=summary,
                link=link,
                published_date=published,
            )
        except Exception:
            # Skip entries that fail to save and continue with the next one.
            pass
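
A minimal sketch of how the remaining published_date warning could be addressed, assuming the feed uses RFC 822 style dates (the usual RSS format) and that dates without an offset can be treated as UTC. The helper name `to_aware_utc` is hypothetical and not part of the project. It uses only the standard library as a stand-in for the `parse` call above; inside Django, `timezone.make_aware` could play the same role.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def to_aware_utc(value: Optional[str]) -> Optional[datetime]:
    """Parse an RSS date string into a timezone-aware datetime, or None."""
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable date: store nothing rather than a wrong value.
        return None
    if parsed.tzinfo is None:
        # Assumption: a date without an offset is in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```

With a helper like this, the loop could pass `to_aware_utc(entry.get('published'))` as published_date, so the model field never receives a naive datetime. Feeds that publish dates in other formats (for example ISO 8601) would need a different parser.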