cohere-ai / sandbox-grounded-qa

A sandbox repo for grounded question answering with Cohere and Google Search
MIT License

Log exception when web page cannot be loaded #8

Closed: michaelwechner closed this issue 1 year ago

michaelwechner commented 1 year ago

I ran into some cases where web pages could not be loaded, either because the server was down or because of errors such as

ERROR: Page 'https://ch.linkedin.com/in/michaelwechner' could not be loaded! Exception message: HTTP Error 999: INKApi Error

or because SSL certificate verification failed.

Therefore I think it would be good to log the exception message in get_paragraphs_text_from_url inside qa/search.py:

def get_paragraphs_text_from_url(k):
    """Extract a list of paragraphs from the contents pointed to by an url."""

    i, search_result_url = k
    try:
        html = open_link(search_result_url)
        return paragraphs_from_html(html)
    except Exception as e:
        pretty_print("FAIL", f"ERROR: Page '{search_result_url}' could not be loaded! Exception: {e}")
        return []
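
To make the logged message point more directly at the cause, the exception type can also be inspected before logging. The following is only a minimal sketch under stated assumptions: `describe_failure`, `safe_load`, and the logger name `qa.search` are hypothetical and not part of the repository, and the standard `logging` module stands in for the repo's `pretty_print` helper.

```python
import logging
import ssl
import urllib.error

# Hypothetical logger name; the repository itself uses pretty_print instead.
logger = logging.getLogger("qa.search")


def describe_failure(url, exc):
    """Build a log message that names the likely cause of a failed page load."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        cause = f"HTTP status {exc.code}"
    elif isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, ssl.SSLError):
        cause = "SSL certificate problem"
    elif isinstance(exc, ssl.SSLError):
        cause = "SSL certificate problem"
    else:
        cause = type(exc).__name__
    return f"Page '{url}' could not be loaded ({cause}): {exc}"


def safe_load(url, loader):
    """Call loader(url); on any exception, log the cause and return an empty list."""
    try:
        return loader(url)
    except Exception as exc:
        logger.error(describe_failure(url, exc))
        return []
```

Here `loader` is any callable that takes a URL and returns a list of paragraphs, so the same wrapper could be tried with a stub before wiring it to real page fetching.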

michaelwechner commented 1 year ago

I fixed the SSL error by running /Applications/Python\ 3.10/Install\ Certificates.command. Logging the exception is what helped me identify the problem.
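
When an SSL verification error shows up in the log, it can help to check where the running Python looks for CA certificates. This is a small diagnostic sketch using only the standard `ssl` module; the helper name `ca_bundle_report` is hypothetical.

```python
import ssl


def ca_bundle_report():
    """Return a dict describing where this Python looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,  # None when the default bundle file does not exist
        "capath": paths.capath,  # None when the default directory does not exist
        "openssl_cafile": paths.openssl_cafile,  # path OpenSSL was built to look for
    }


print(ca_bundle_report())
```

If `cafile` and `capath` are both `None`, the interpreter may have no default CA bundle available, which would be consistent with certificate verification failing.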

nickfrosst commented 1 year ago

oh yeah both of those would be useful. want to make a pr for it?

michaelwechner commented 1 year ago

> oh yeah both of those would be useful. want to make a pr for it?

done :-)

https://github.com/cohere-ai/sandbox-grounded-qa/pull/9

michaelwechner commented 1 year ago

See PR at https://github.com/cohere-ai/sandbox-grounded-qa/pull/9