google / corpuscrawler

Crawler for linguistic corpora

Undefined names #87

Open cclauss opened 3 years ago

cclauss commented 3 years ago

% flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics

./corpuscrawler/Lib/corpuscrawler/ F821 undefined name 'sitemap'
        if pubdate is None: pubdate = sitemap[url]
./corpuscrawler/Lib/corpuscrawler/ F821 undefined name 'url'
        assert doc.status == 200, (doc.status, url)
./corpuscrawler/Lib/corpuscrawler/ F821 undefined name 'url'
        assert doc.status == 200, (doc.status, url)
./corpuscrawler/Lib/corpuscrawler/ F821 undefined name 'striptags'
                p = ' '.join(striptags(replace_html_entities(p)).split())
./corpuscrawler/Lib/corpuscrawler/ F821 undefined name 'replace_html_entities'
                p = ' '.join(striptags(replace_html_entities(p)).split())
./corpuscrawler/Lib/corpuscrawler/ F821 undefined name 'fetchresult'
        if pubdate is None: pubdate = fetchresult.headers.get('Last-Modified')
./corpuscrawler/Lib/corpuscrawler/ F821 undefined name 'crawl_bibleis'
    crawl_bibleis(crawler, out, bible='THATSV')
./corpuscrawler/Lib/corpuscrawler/ F821 undefined name 'start_url'
        assert doc.status == 200, (doc.status, start_url)
8     F821 undefined name 'fetchresult'
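
To illustrate why F821 is a runtime concern rather than a style one, here is a minimal sketch. The function `pick_pubdate` is hypothetical and only loosely modeled on the flagged `fetchresult` line above; it is not code from this repository. It shows that Python resolves a name only when the line executes, so the module imports cleanly and the bug stays hidden until the affected branch is reached.

```python
def pick_pubdate(pubdate):
    # Hypothetical helper for illustration; `fetchresult` is deliberately
    # never defined, mirroring the kind of name flake8 reports as F821.
    if pubdate is None:
        pubdate = fetchresult.headers.get('Last-Modified')  # noqa: F821
    return pubdate


# This call never enters the branch, so the undefined name goes unnoticed.
ok = pick_pubdate('2019-01-01')

# This call enters the branch and fails only now, at runtime.
try:
    pick_pubdate(None)
    failed = False
    message = ''
except NameError as exc:
    failed = True
    message = str(exc)

print(ok, failed, message)
```

A static check such as flake8 reports the undefined name without having to exercise that branch, which is why these error codes are worth running even when style checks are skipped.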

On the flake8 test selection, this PR does not focus on "style violations" (the majority of flake8 error codes that psf/black can autocorrect). Instead, these tests focus on runtime safety and correctness: