webrecorder / warcio

Streaming WARC/ARC library for fast web archive IO
https://pypi.python.org/pypi/warcio
Apache License 2.0

Offline tests #133

Open Luflosi opened 3 years ago

Luflosi commented 3 years ago

I'm packaging this library in Nixpkgs. To keep builds reproducible, the sandbox in which all packages are built has no internet connectivity, which causes some tests to fail. It would be nice to have a --offline flag for pytest that skips all tests that require internet connectivity. The unrelated project https://github.com/FPGAwars/apio, for example, has such a flag (for inspiration). At the moment I'm patching the source to manually skip the failing tests, which is a bit ugly and will break if the surrounding code changes in the future. Other packaging efforts may also benefit from such a flag.

Patch

```patch
diff --git a/test/test_capture_http.py b/test/test_capture_http.py
index 41274d8..cd40075 100644
--- a/test/test_capture_http.py
+++ b/test/test_capture_http.py
@@ -5,7 +5,7 @@ import time
 
 # must be imported before 'requests'
 from warcio.capture_http import capture_http
-from pytest import raises
+from pytest import raises, skip
 import requests
 
 import json
@@ -148,6 +148,8 @@ class TestCaptureHttpBin(object):
         assert data == 'somedatatopost'
 
     def test_post_chunked(self):
+        skip('requires internet connection')
+
         warc_writer = BufferWARCWriter(gzip=False)
 
         def nop_filter(request, response, recorder):
@@ -269,6 +271,8 @@ class TestCaptureHttpBin(object):
         os.remove(full_path)
 
     def test_remote(self):
+        skip('requires internet connection')
+
         with capture_http(warc_version='1.1', gzip=True) as writer:
             requests.get('http://example.com/')
             requests.get('https://google.com/')
diff --git a/test/test_capture_http_proxy.py b/test/test_capture_http_proxy.py
index fba301d..a64e096 100644
--- a/test/test_capture_http_proxy.py
+++ b/test/test_capture_http_proxy.py
@@ -7,7 +7,7 @@ import time
 import requests
 from warcio.archiveiterator import ArchiveIterator
 
-from pytest import raises
+from pytest import raises, skip
 
 
 # ==================================================================
@@ -45,6 +45,8 @@ class TestCaptureHttpProxy():
         time.sleep(0.1)
 
     def test_capture_http_proxy(self):
+        skip('requires internet connection')
+
         with capture_http() as warc_writer:
             res = requests.get("http://example.com/test", proxies=self.proxies, verify=False)
 
@@ -64,6 +66,8 @@ class TestCaptureHttpProxy():
         assert next(ai)
 
     def test_capture_https_proxy(self):
+        skip('requires internet connection')
+
         with capture_http() as warc_writer:
             res = requests.get("https://example.com/test", proxies=self.proxies, verify=False)
             res = requests.get("https://example.com/foo", proxies=self.proxies, verify=False)
@@ -110,6 +114,8 @@ class TestCaptureHttpProxy():
         assert next(ai)
 
     def test_capture_https_proxy_same_session(self):
+        skip('requires internet connection')
+
         sesh = requests.session()
         with capture_http() as warc_writer:
             res = sesh.get("https://example.com/test", proxies=self.proxies, verify=False)
```

ikreymer commented 3 years ago

Would you mind submitting this patch as a PR based on the solution you suggest? :)

It seems like that's fairly close to what you have already, since you've identified the tests that need to be excluded. It just needs an offline flag added (https://github.com/FPGAwars/apio/blob/develop/test/conftest.py#L32) and the skip changed to check the flag, like https://github.com/FPGAwars/apio/blob/develop/test/packages/test_api.py#L7

Since you've already written most of it, hopefully this can be quick? Thanks, and we can merge it into a new release.
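
For readers unfamiliar with pytest, here is a minimal sketch of how an `--offline` flag could be wired up in a `test/conftest.py`. It follows the general pattern from the pytest documentation for skipping tests based on a command-line option, and it is not necessarily what the linked apio files do or what warcio ended up merging. The marker name `network` and the skip message are assumptions chosen for illustration.

```python
# Hypothetical test/conftest.py sketch: skip network tests when --offline is given.
import pytest


def pytest_addoption(parser):
    # Register the custom command-line flag with pytest.
    parser.addoption(
        "--offline",
        action="store_true",
        default=False,
        help="skip tests that require an internet connection",
    )


def pytest_configure(config):
    # Declare the (assumed) 'network' marker so pytest does not warn about it.
    config.addinivalue_line(
        "markers", "network: test requires an internet connection"
    )


def pytest_collection_modifyitems(config, items):
    # Without --offline, leave the collected tests untouched.
    if not config.getoption("--offline"):
        return
    skip_network = pytest.mark.skip(reason="requires internet connection")
    for item in items:
        # Only tests explicitly marked as 'network' are skipped.
        if "network" in item.keywords:
            item.add_marker(skip_network)
```

Under these assumptions, each test from the patch above would be decorated with `@pytest.mark.network` instead of calling `skip(...)` unconditionally, and a sandboxed build would run `pytest --offline`. Running plain `pytest` would still execute every test.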

Luflosi commented 3 years ago

To be honest, I don't know anything about pytest but I'll try.