WordPress / openverse-api

The Openverse API allows programmatic access to search for CC-licensed and public domain digital media.
https://api.openverse.engineering/v1
MIT License

Integration tests fail #5

Closed obulat closed 3 years ago

obulat commented 3 years ago

Trying to run integration tests, I get several errors:

  1. On Mac, pipenv install fails because there is no compatible wheel for the Pillow package. pipenv then tries to build Pillow from source, but the build fails because the dependencies required for compilation (zlib) are missing.

  2. When running a shell inside Docker, I first get an error that the importlib_metadata module is not found. After I install it with pipenv install importlib_metadata, I get an error that no compatible version of Django can be found, because drf-yasg requires Django >= 2.2.16 and our Pipfile pins it to 2.2.13. So I installed drf-yasg 1.17 instead of 1.20.
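
The conflict in item 2 comes down to a version comparison between the pinned Django release and the lower bound that drf-yasg asks for. A minimal sketch of that check, assuming plain dotted numeric versions (real resolvers follow PEP 440 and handle pre-releases, epochs, and so on); `parse_version` and `satisfies_minimum` are hypothetical helpers, not part of pipenv:

```python
def parse_version(text):
    """Turn a dotted numeric version such as '2.2.13' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(pinned, minimum):
    """Return True if the pinned version is at least the required minimum."""
    return parse_version(pinned) >= parse_version(minimum)


pinned_django = "2.2.13"     # pin reported from the Pipfile
required_minimum = "2.2.16"  # lower bound reported for drf-yasg 1.20

# Prints False: the pin is below the bound, so no compatible Django exists.
print(satisfies_minimum(pinned_django, required_minimum))
```

Comparing tuples of integers rather than raw strings matters here, because a string comparison would order "2.2.9" after "2.2.16".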

Then I am able to run the tests, but get several failures:

Testing LOCAL environment
E....F...ss...F.s....F.FsFF.

============================================================================ ERRORS ============================================================================
________________________________________________________ ERROR at setup of test_auth_email_verification ________________________________________________________

self = <django.db.backends.postgresql.base.DatabaseWrapper object at 0xffffbb1d5f50>

    def ensure_connection(self):
        """Guarantee that a connection to the database is established."""
        if self.connection is None:
            with self.wrap_database_errors:
>               self.connect()

/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/backends/base/base.py:217:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <django.db.backends.postgresql.base.DatabaseWrapper object at 0xffffbb1d5f50>

    def connect(self):
        """Connect to the database. Assume that the connection is closed."""
        # Check for invalid configurations.
        self.check_settings()
        # In case the previous connection was closed while in an atomic block
        self.in_atomic_block = False
        self.savepoint_ids = []
        self.needs_rollback = False
        # Reset parameters defining when to close the connection
        max_age = self.settings_dict['CONN_MAX_AGE']
        self.close_at = None if max_age is None else time.time() + max_age
        self.closed_in_transaction = False
        self.errors_occurred = False
        # Establish the connection
        conn_params = self.get_connection_params()
>       self.connection = self.get_new_connection(conn_params)

/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/backends/base/base.py:195:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <django.db.backends.postgresql.base.DatabaseWrapper object at 0xffffbb1d5f50>
conn_params = {'database': 'openledger', 'host': 'localhost', 'password': 'deploy', 'user': 'deploy'}

    def get_new_connection(self, conn_params):
>       connection = Database.connect(**conn_params)

/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/backends/postgresql/base.py:178:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

dsn = 'user=deploy password=deploy host=localhost dbname=openledger', connection_factory = None, cursor_factory = None
kwargs = {'database': 'openledger', 'host': 'localhost', 'password': 'deploy', 'user': 'deploy'}, kwasync = {}

    def connect(dsn=None, connection_factory=None, cursor_factory=None, **kwargs):
        """
        Create a new database connection.

        The connection parameters can be specified as a string:

            conn = psycopg2.connect("dbname=test user=postgres password=secret")

        or using a set of keyword arguments:

            conn = psycopg2.connect(database="test", user="postgres", password="secret")

        Or as a mix of both. The basic connection parameters are:

        - *dbname*: the database name
        - *database*: the database name (only as keyword argument)
        - *user*: user name used to authenticate
        - *password*: password used to authenticate
        - *host*: database host address (defaults to UNIX socket if not provided)
        - *port*: connection port number (defaults to 5432 if not provided)

        Using the *connection_factory* parameter a different class or connections
        factory can be specified. It should be a callable object taking a dsn
        argument.

        Using the *cursor_factory* parameter, a new default cursor factory will be
        used by cursor().

        Using *async*=True an asynchronous connection will be created. *async_* is
        a valid alias (for Python versions where ``async`` is a keyword).

        Any other keyword parameter will be passed to the underlying client
        library: the list of supported parameters depends on the library version.

        """
        kwasync = {}
        if 'async' in kwargs:
            kwasync['async'] = kwargs.pop('async')
        if 'async_' in kwargs:
            kwasync['async_'] = kwargs.pop('async_')

        if dsn is None and not kwargs:
            raise TypeError('missing dsn and no parameters')

        dsn = _ext.make_dsn(dsn, **kwargs)
>       conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
E       psycopg2.OperationalError: could not connect to server: Connection refused
E           Is the server running on host "localhost" (127.0.0.1) and accepting
E           TCP/IP connections on port 5432?
E       could not connect to server: Cannot assign requested address
E           Is the server running on host "localhost" (::1) and accepting
E           TCP/IP connections on port 5432?

/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/psycopg2/__init__.py:127: OperationalError

The above exception was the direct cause of the following exception:

request = <SubRequest '_django_db_marker' for <Function test_auth_email_verification>>

    @pytest.fixture(autouse=True)
    def _django_db_marker(request):
        """Implement the django_db marker, internal to pytest-django.

        This will dynamically request the ``db``, ``transactional_db`` or
        ``django_db_reset_sequences`` fixtures as required by the django_db marker.
        """
        marker = request.node.get_closest_marker("django_db")
        if marker:
            transaction, reset_sequences = validate_django_db(marker)
            if reset_sequences:
                request.getfixturevalue("django_db_reset_sequences")
            elif transaction:
                request.getfixturevalue("transactional_db")
            else:
>               request.getfixturevalue("db")

/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/pytest_django/plugin.py:439:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/pytest_django/fixtures.py:227: in db
    _django_db_fixture_helper(request, django_db_blocker, transactional=False)
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/pytest_django/fixtures.py:158: in _django_db_fixture_helper
    test_case._pre_setup()
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/test/testcases.py:938: in _pre_setup
    self._fixture_setup()
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/test/testcases.py:1169: in _fixture_setup
    self.atomics = self._enter_atomics()
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/test/testcases.py:1107: in _enter_atomics
    atomics[db_name].__enter__()
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/transaction.py:175: in __enter__
    if not connection.get_autocommit():
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/backends/base/base.py:379: in get_autocommit
    self.ensure_connection()
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/backends/base/base.py:217: in ensure_connection
    self.connect()
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/utils.py:89: in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/backends/base/base.py:217: in ensure_connection
    self.connect()
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/backends/base/base.py:195: in connect
    self.connection = self.get_new_connection(conn_params)
/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/django/db/backends/postgresql/base.py:178: in get_new_connection
    connection = Database.connect(**conn_params)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

dsn = 'user=deploy password=deploy host=localhost dbname=openledger', connection_factory = None, cursor_factory = None
kwargs = {'database': 'openledger', 'host': 'localhost', 'password': 'deploy', 'user': 'deploy'}, kwasync = {}

    def connect(dsn=None, connection_factory=None, cursor_factory=None, **kwargs):
        """
        Create a new database connection.

        The connection parameters can be specified as a string:

            conn = psycopg2.connect("dbname=test user=postgres password=secret")

        or using a set of keyword arguments:

            conn = psycopg2.connect(database="test", user="postgres", password="secret")

        Or as a mix of both. The basic connection parameters are:

        - *dbname*: the database name
        - *database*: the database name (only as keyword argument)
        - *user*: user name used to authenticate
        - *password*: password used to authenticate
        - *host*: database host address (defaults to UNIX socket if not provided)
        - *port*: connection port number (defaults to 5432 if not provided)

        Using the *connection_factory* parameter a different class or connections
        factory can be specified. It should be a callable object taking a dsn
        argument.

        Using the *cursor_factory* parameter, a new default cursor factory will be
        used by cursor().

        Using *async*=True an asynchronous connection will be created. *async_* is
        a valid alias (for Python versions where ``async`` is a keyword).

        Any other keyword parameter will be passed to the underlying client
        library: the list of supported parameters depends on the library version.

        """
        kwasync = {}
        if 'async' in kwargs:
            kwasync['async'] = kwargs.pop('async')
        if 'async_' in kwargs:
            kwasync['async_'] = kwargs.pop('async_')

        if dsn is None and not kwargs:
            raise TypeError('missing dsn and no parameters')

        dsn = _ext.make_dsn(dsn, **kwargs)
>       conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
E       django.db.utils.OperationalError: could not connect to server: Connection refused
E           Is the server running on host "localhost" (127.0.0.1) and accepting
E           TCP/IP connections on port 5432?
E       could not connect to server: Cannot assign requested address
E           Is the server running on host "localhost" (::1) and accepting
E           TCP/IP connections on port 5432?

/root/.local/share/virtualenvs/cccatalog-api-ouCQYms7/lib/python3.7/site-packages/psycopg2/__init__.py:127: OperationalError
=========================================================================== FAILURES ===========================================================================
______________________________________________________________________ test_image_detail _______________________________________________________________________

search_fixture = {'page_count': 0, 'page_size': 1, 'result_count': 1, 'results': [{'creator': 'Liam', 'creator_url': 'https://example.com/', 'detail_url': 'http://localhost:8000/v1/images/3', 'fields_matched': ['tags.name'], ...}]}

    def test_image_detail(search_fixture):
        test_id = search_fixture['results'][0]['id']
        response = requests.get(API_URL + '/v1/images/{}'.format(test_id), verify=False)
>       assert response.status_code == 200
E       assert 404 == 200
E        +  where 404 = <Response [404]>.status_code

test/v1_integration_test.py:88: AssertionError
_______________________________________________________________ test_creator_quotation_grouping ________________________________________________________________

    def test_creator_quotation_grouping():
        """
        Users should be able to group terms together with quotation marks to narrow
        down their searches more effectively.
        """
        no_quotes = json.loads(
            requests.get(
                API_URL + '/v1/images?creator=claude%20monet',
                verify=False
            ).text
        )
        quotes = json.loads(
            requests.get(
                API_URL + '/v1/images?creator="claude%20monet"',
                verify=False
            ).text
        )
        # Did quotation marks actually narrow down the search?
>       assert len(no_quotes['results']) > len(quotes['results'])
E       assert 0 > 0
E        +  where 0 = len([])
E        +  and   0 = len([])

test/v1_integration_test.py:221: AssertionError
______________________________________________________________ test_page_size_removing_dead_links ______________________________________________________________

search_without_dead_links = <function search_without_dead_links.<locals>._search_without_dead_links at 0xffffbab14cb0>

    def test_page_size_removing_dead_links(search_without_dead_links):
        """
        We have about 500 dead links in the sample data and should have around
        8 dead links in the first 100 results on a query composed of a single
        wildcard operator.

        Test whether the number of results returned is equal to the requested
        page_size of 100.
        """
        data = search_without_dead_links(q='*', page_size=100)
>       assert len(data['results']) == 100
E       AssertionError: assert 4 == 100
E        +  where 4 = len([{'creator': 'Alice Foo', 'creator_url': 'https://example.com/', 'detail_url': 'http://localhost:8000/v1/images/2', 'f.../example.com/', 'detail_url': 'http://localhost:8000/v1/images/1', 'foreign_landing_url': 'https://example.com/', ...}])

test/v1_integration_test.py:457: AssertionError
__________________________________________________________ test_page_consistency_removing_dead_links ___________________________________________________________

search_without_dead_links = <function search_without_dead_links.<locals>._search_without_dead_links at 0xffffbb957b00>

    def test_page_consistency_removing_dead_links(search_without_dead_links):
        """
        Test the results returned in consecutive pages are never repeated when
        filtering out dead links.
        """
        total_pages = 30
        page_size = 5

        page_results = []
        for page in range(1, total_pages + 1):
            page_data = search_without_dead_links(
                q='*',
                page_size=page_size,
                page=page
            )
            page_results += page_data['results']

        def no_duplicates(l):
            s = set()
            for x in l:
                if x in s:
                    return False
                s.add(x)
            return True

        ids = list(map(lambda x: x['id'], page_results))
        # No results should be repeated so we should have no duplicate ids
>       assert no_duplicates(ids)
E       AssertionError: assert False
E        +  where False = <function test_page_consistency_removing_dead_links.<locals>.no_duplicates at 0xffffb8bc9200>(['2', '2', '3', '1'])

test/v1_integration_test.py:508: AssertionError
________________________________________________________________ test_oembed_endpoint_for_json _________________________________________________________________

    def test_oembed_endpoint_for_json():
        response = requests.get(
            API_URL + '/v1/oembed?url=https%3A//search.creativecommons.org/photos/dac5f6b0-e07a-44a0-a444-7f43d71f9beb'
        )
>       assert response.status_code == 200
E       assert 500 == 200
E        +  where 500 = <Response [500]>.status_code

test/v1_integration_test.py:542: AssertionError
_________________________________________________________________ test_oembed_endpoint_for_xml _________________________________________________________________

    def test_oembed_endpoint_for_xml():
        response = requests.get(
            API_URL + '/v1/oembed?url=https%3A//search.creativecommons.org/photos/dac5f6b0-e07a-44a0-a444-7f43d71f9beb&format=xml'
        )
>       assert response.status_code == 200
E       assert 500 == 200
E        +  where 500 = <Response [500]>.status_code

test/v1_integration_test.py:554: AssertionError
=================================================================== short test summary info ====================================================================
FAILED test/v1_integration_test.py::test_image_detail - assert 404 == 200
FAILED test/v1_integration_test.py::test_creator_quotation_grouping - assert 0 > 0
FAILED test/v1_integration_test.py::test_page_size_removing_dead_links - AssertionError: assert 4 == 100
FAILED test/v1_integration_test.py::test_page_consistency_removing_dead_links - AssertionError: assert False
FAILED test/v1_integration_test.py::test_oembed_endpoint_for_json - assert 500 == 200
FAILED test/v1_integration_test.py::test_oembed_endpoint_for_xml - assert 500 == 200
ERROR test/v1_integration_test.py::test_auth_email_verification - django.db.utils.OperationalError: could not connect to server: Connection refused

Sorry for including so much.

obulat commented 3 years ago

The Docker dashboard says that the ccsearch-api_db_1 container is running, but I can't connect to localhost on port 5432. Am I supposed to be able to reach the DB on that port?
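
One way to tell "nothing is listening on that port" apart from other problems is to attempt a plain TCP connection. The sketch below uses only the standard library; `port_is_open` is a hypothetical helper name, and it only shows that something accepts connections on the port, not that it is Postgres:

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Port taken from the error message in the test output.
    print(5432, port_is_open("localhost", 5432))
```

Running this both on the host and inside the container should show whether the port is reachable from each side.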

dhruvkb commented 3 years ago

Yes @obulat, if the DB server is running you should be able to access Postgres using psql:

psql -U deploy -d openledger -p 5433 -h localhost

dhruvkb commented 3 years ago

To reproduce, I opened a shell in the container with docker compose exec and ran the tests. I got just one error, and only because Django tries to connect to Postgres at localhost:5432 and Redis at localhost:6379, which I suppose are not accessible from within the container.

I didn't face the dependency issue that you mentioned.

Testing LOCAL environment
- E....F...ss...F.s....F.FsFF.
+ E........ss.....s.......s...

To fix the DB and Redis connections, I updated the hosts in the run_tests.sh file, replacing localhost with the names of the respective services.

- DJANGO_DATABASE_HOST='localhost' REDIS_HOST='localhost'
+ DJANGO_DATABASE_HOST='db' REDIS_HOST='cache' 

Then all the tests passed.

Testing LOCAL environment
.........ss.....s.......s...
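
The reason the change works can be sketched as follows, assuming the settings read both hosts from the environment. How the project's actual Django settings consume these variables is not shown here, so `service_hosts` and the `localhost` defaults are hypothetical:

```python
import os


def service_hosts(environ=None):
    """Read the database and Redis hosts from the environment.

    The 'localhost' defaults are an assumption for illustration only.
    """
    env = os.environ if environ is None else environ
    return {
        "database": env.get("DJANGO_DATABASE_HOST", "localhost"),
        "redis": env.get("REDIS_HOST", "localhost"),
    }


# Values from the updated run_tests.sh line.
inside_container = service_hosts(
    {"DJANGO_DATABASE_HOST": "db", "REDIS_HOST": "cache"}
)
print(inside_container)  # {'database': 'db', 'redis': 'cache'}
```

Inside a container, localhost refers to that container itself, so the other services have to be addressed by their service names.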

I also ran pipenv install locally (using the Rosetta terminal), and that worked without dependency issues too.