Closed frhino closed 3 years ago
Per discussion earlier this evening and in Slack, it looks like the problem is that the Marin module is not properly paging through the full set of data. It uses data.utils.SocrataApi.resource(), which is somewhat simplistic and just makes a single HTTP request, without trying to determine if it needs to page through more results.
Instead, it should act like our other, slightly more complex API clients and automatically page through the full result set.
For examples of how we do this elsewhere, see:
Socrata pagination docs: https://dev.socrata.com/docs/paging.html
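To make the paging idea concrete, here is a minimal sketch of what an auto-paging helper could look like. It is not the code from our existing clients or from #207. The names iter_all_rows and make_socrata_fetcher are hypothetical, the page size of 1000 is an assumption, and the resource URL is whatever the caller passes in. It relies on the $limit, $offset, and $order query parameters described in the Socrata paging docs.

```python
import json
from typing import Callable, Dict, Iterator, List
from urllib.parse import urlencode
from urllib.request import urlopen

Page = List[Dict]


def iter_all_rows(
    fetch_page: Callable[[int, int], Page], page_size: int = 1000
) -> Iterator[Dict]:
    """Yield every row by requesting pages until a short page comes back.

    fetch_page(limit, offset) is a hypothetical callable that returns one
    page of rows as a list of dicts.
    """
    offset = 0
    while True:
        rows = fetch_page(page_size, offset)
        yield from rows
        # A page smaller than page_size means there is nothing left to fetch.
        if len(rows) < page_size:
            break
        offset += page_size


def make_socrata_fetcher(resource_url: str) -> Callable[[int, int], Page]:
    """Build a fetch_page callable for a SODA JSON endpoint (assumed URL)."""

    def fetch_page(limit: int, offset: int) -> Page:
        # $order=:id keeps row order stable between requests while paging.
        query = urlencode({"$limit": limit, "$offset": offset, "$order": ":id"})
        with urlopen(f"{resource_url}?{query}") as response:
            return json.load(response)

    return fetch_page
```

Keeping the paging loop separate from the HTTP call means the loop can be unit tested with a fake fetcher, and the same loop would still work if the request code changes later.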
It looks like there's a Python package available for interacting with the SODA API, called sodapy, which has a get_all() method to handle the pagination for a given data set. Thoughts on the pros and cons of using that, as opposed to our home-grown wrapper?
As a short-term fix, I think it’s good to update our minimal client like you’re doing in #207, since that avoids changing the rest of the codebase to fit sodapy’s API. Switching to a more robust and maintained package would probably be a good follow-on!
Describe the bug: This appears to be a pagination issue where the data exists but we're just not accessing all of it from our data file.