WordPress / openverse-api

The Openverse API allows programmatic access to search for CC-licensed and public domain digital media.
https://api.openverse.engineering/v1
MIT License

Tally provider occurrences in results #1088

Closed sarayourfriend closed 1 year ago

sarayourfriend commented 1 year ago

Fixes

Fixes #1084 by @sarayourfriend
Also fixes #1072 by @AetherUnbound

Description

Tallies provider occurrences in Redis. Disambiguates by week (via a Monday date-stamp shared for all days of a given week), index, and provider.

Only tallies the first four pages.
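A minimal sketch of the tallying scheme described above, not the code from this PR. The key layout, the function names, and the provider names in the usage note are all hypothetical, and a plain `Counter` stands in for Redis so the example runs without a server. Against a real Redis instance, an atomic increment command such as `HINCRBY` would be one way to get the same effect.

```python
from collections import Counter
from datetime import date, timedelta

# From the PR description: only the first four pages are tallied.
MAX_TALLIED_PAGE = 4


def week_stamp(day: date) -> str:
    """Return the ISO date of the Monday that starts the week containing `day`."""
    monday = day - timedelta(days=day.weekday())  # weekday() is 0 for Monday
    return monday.isoformat()


def tally_key(index: str, day: date) -> str:
    """Build a key shared by every day of the same week (hypothetical layout)."""
    return f"provider_occurrences:{index}:{week_stamp(day)}"


def tally_results(store: Counter, index: str, providers: list, page: int, day: date) -> None:
    """Count how often each provider appears in one page of results."""
    if page > MAX_TALLIED_PAGE:
        return  # pages beyond the fourth are ignored
    key = tally_key(index, day)
    for provider, count in Counter(providers).items():
        # `store` is an in-memory stand-in for Redis in this sketch.
        store[(key, provider)] += count
```

For example, calling `tally_results(store, "image", ["flickr", "flickr", "wikimedia"], page=1, day=date(2023, 1, 18))` adds 2 to the `flickr` tally and 1 to the `wikimedia` tally under the key stamped `2023-01-16`, the Monday of that week.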

Testing Instructions

Check out the new unit tests.

For manual testing: run the application with the instructions from the README. If you have already built web, you will need to rebuild it to pick up the new test dependency.

Then, make a search. Open up a redis connection (redis-cli or use a Redis connection from django_redis in just ipython) and execute SCAN 0 'provider*'. You should see the new tallies appear. You can use the keys that you get back from SCAN to view the individual results.


github-actions[bot] commented 1 year ago

This PR has migrations. Please rebase it before merging to ensure that conflicting migrations are not introduced.

stacimc commented 1 year ago

@sarayourfriend Can you expand on your instructions for manual testing?

sarayourfriend commented 1 year ago

@stacimc I'm happy to, but can you share which part of the manual testing instructions you're stuck on? I'm not sure what to expand on 😕

sarayourfriend commented 1 year ago

Sorry, I left out the MATCH keyword for the SCAN argument (I just wrote it from memory). The Redis documentation is pretty good about these things though, if you run into similar issues in the future: https://redis.io/commands/scan/.

FWIW, we have Redis mapped to the default Redis port out of docker-compose (kind of annoyingly, actually, because it conflicts with my actual personal Redis on my machine). If you run redis-cli on your local machine, outside of Docker, with the API running, it will connect to Redis directly, no need to exec into any containers. I didn't realise this wasn't well-known on the team, my apologies for not including the detailed instructions.

Testing with ipython is easier and, as you did, you can follow the code as an example.
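As a rough illustration of the ipython route, here is a sketch of listing the tally keys with redis-py's `scan_iter`, which issues SCAN with MATCH and follows the cursor for you. The helper name `find_tally_keys` is hypothetical, and the cache alias in the commented usage is an assumption: as the rest of this thread shows, the tallies may live in a different cache than the one you connect to first.

```python
def find_tally_keys(client, pattern="provider*"):
    """Return every key matching `pattern`, sorted, as plain strings.

    `client` is any object exposing redis-py's `scan_iter(match=...)`.
    """
    return sorted(
        # redis-py returns bytes unless decode_responses is enabled.
        key.decode() if isinstance(key, bytes) else key
        for key in client.scan_iter(match=pattern)
    )


# Possible usage from `just ipython` (the alias "default" is an assumption):
#   from django_redis import get_redis_connection
#   client = get_redis_connection("default")
#   find_tally_keys(client)
```

Going by the correction above and the follow-up below, the equivalent redis-cli command is `SCAN 0 MATCH provider*`.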

stacimc commented 1 year ago

No worries. I looked at that exact documentation when debugging the syntax error 😅 I needed to add match and remove the quotation marks around the pattern. However even once I did that, I wasn't able to see the tallies. The only key I could see at all was filtered_providers. I was not sure if maybe I was missing a step to connect to the right cache?

Before that FWIW I also initially tried running redis-cli outside of Docker with the API running and it did not work for me -- I got connection errors when trying to connect. I did not spend much time trying to debug that part in particular, though 🤷‍♀️

sarayourfriend commented 1 year ago

> I was not sure if maybe I was missing a step to connect to the right cache?

Ah, yes, that would be the issue.

Later I will open an issue to add more documentation for how to read from Redis in development 👍