Closed: daikman closed this pull request 2 years ago
I don't think we should set the query limit to 100 if a filtering query has a larger response. As a user, I would find it tedious to get a warning message instructing me to repeat the query and specify the number of rows, rather than getting the full response in the first place; that would be less clunky and frustrating from the user's perspective. Did you have a specific reason for including the limit?
Also, I looked at the failed checks. I think these will be resolved by updating the tests to expect the new error messages, but if not they may need a bit more attention.
> I don't think we should set the query limit to 100 if a filtering query has a larger response. As a user, I would find it tedious to get a warning message instructing me to repeat the query and specify the number of rows, rather than getting the full response in the first place; that would be less clunky and frustrating from the user's perspective. Did you have a specific reason for including the limit?
If there is no limit set in a `datastore_search` query, it returns 100 results by default. I thought it would be good to make that explicit, but you are right that it would be annoying for users!
Instead, I've changed it so that if the user doesn't set a limit themselves, it defaults to 99999. If a query matches more rows than that, the warning tells them to download the whole resource and apply filters afterwards. If their query matches 99999 rows or fewer, the user won't notice that this limit has been set.
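As a rough sketch of that logic (the function and argument names here are only illustrative, not the package's actual internals):

```r
# Illustrative only: how a default limit might be applied when the user
# does not supply one, and when to warn that the query was capped.
apply_default_limit <- function(limit = NULL, max_limit = 99999) {
  if (is.null(limit)) max_limit else limit
}

warn_if_capped <- function(n_matched, limit, max_limit = 99999) {
  # If the query matches more rows than the cap, suggest downloading the
  # whole resource and filtering it locally instead.
  if (limit == max_limit && n_matched > max_limit) {
    warning(
      "Your query matches more than ", max_limit, " rows. ",
      "Consider downloading the whole resource and filtering it afterwards."
    )
  }
  invisible(NULL)
}
```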
> Also, I looked at the failed checks. I think these will be resolved by updating the tests to expect the new error messages, but if not they may need a bit more attention.
This is a good point about tests! Even if this isn't the reason for the failed checks, I should update them anyway.
> If there is no limit set in a `datastore_search` query, it returns 100 results by default. I thought it would be good to make that explicit, but you are right that it would be annoying for users!
>
> Instead, I've changed it so that if the user doesn't set a limit themselves, it defaults to 99999. If a query matches more rows than that, the warning tells them to download the whole resource and apply filters afterwards. If their query matches 99999 rows or fewer, the user won't notice that this limit has been set.
Good point - that sounds like a sensible solution. I think previously we used a single-row call first to determine the exact number of rows to return.
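For anyone reading along, a rough sketch of that single-row approach, assuming the CKAN `datastore_search` response reports a `total` field; the helper name and URL construction here are illustrative:

```r
# Rough sketch: request one row, read the reported total number of
# matching rows, then request exactly that many rows in a second call.
library(httr)
library(jsonlite)

get_total_matches <- function(base_url, resource_id, q = NULL) {
  resp <- GET(
    url = paste0(base_url, "/api/3/action/datastore_search"),
    query = list(resource_id = resource_id, limit = 1, q = q)
  )
  stop_for_status(resp)
  parsed <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
  parsed$result$total
}
```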
I've made three major changes to this pull request:
1. Refactored the way the package makes API calls, in order to better handle errors and reduce unnecessary API calls.
2. Used {cli} to throw and format error messages (see the sketch below).
3. Separated all functions into their own files and added tests for each.
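A minimal sketch of the kind of {cli} error formatting mentioned above; the condition, function name, and wording are made up for illustration and are not the package's actual errors:

```r
# Illustrative only: throwing a formatted error with {cli} when an API
# call fails.
check_response <- function(status_code, url) {
  if (status_code != 200L) {
    cli::cli_abort(c(
      "Can't access the CKAN API.",
      "x" = "The request to {.url {url}} returned status {status_code}.",
      "i" = "Check your internet connection and try again later."
    ))
  }
  invisible(TRUE)
}
```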
I forgot to add any of your suggested changes, @Moohan, before I made these changes! But thanks for your comments; they were very helpful.
Would it also be helpful to have a message when dump is being used for requests of more than 99999 rows but there is no query/filtering? Just to avoid any confusion when the returned number of rows doesn't match the number of rows requested.
> Would it also be helpful to have a message when dump is being used for requests of more than 99999 rows but there is no query/filtering? Just to avoid any confusion when the returned number of rows doesn't match the number of rows requested.
Good point! I've added another if statement to `use_dump_check()` that will give this warning.
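Something along these lines, as a sketch; `use_dump_check()`'s real arguments aren't shown in this thread, so they are assumptions here:

```r
# Illustrative sketch: warn when the dump endpoint is used for a large,
# unfiltered request, since the number of rows returned may not match the
# number of rows requested.
use_dump_check <- function(rows, query = NULL, max_limit = 99999) {
  use_dump <- is.null(query) && rows > max_limit
  if (use_dump) {
    warning(
      "Requests for more than ", max_limit, " rows without a query use the ",
      "dump endpoint, so the number of rows returned may not match the ",
      "number of rows requested."
    )
  }
  use_dump
}
```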
I've added the ability to filter rows and columns using the `q` and `fields` parameters of the `datastore_search` endpoint, respectively.
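For reference, a hedged sketch of how those parameters might be passed in a raw request; the helper name and base URL handling are placeholders, not the package's exported interface:

```r
# Illustrative only: passing q (row filter) and fields (column selection)
# to the CKAN datastore_search endpoint.
search_datastore <- function(base_url, resource_id,
                             q = NULL, fields = NULL, limit = 99999) {
  resp <- httr::GET(
    url = paste0(base_url, "/api/3/action/datastore_search"),
    query = list(
      resource_id = resource_id,
      q = q,  # full-text row filter; dropped from the request if NULL
      fields = if (!is.null(fields)) paste(fields, collapse = ","),
      limit = limit
    )
  )
  httr::stop_for_status(resp)
  httr::content(resp)
}
```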
I also added a function that checks the connection to CKAN, in order to give better error messages from `get_resource` and `get_dataset`. And for `get_resource` I added some extra error handling. I made small changes to the documentation of the functions too.
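A rough sketch of what such a connection check could look like; the function name and the bare GET of the base URL are assumptions, not necessarily what the package actually does:

```r
# Illustrative only: a simple reachability check so get_resource() and
# get_dataset() can fail early with a clear message.
check_ckan_connection <- function(base_url) {
  resp <- tryCatch(
    httr::GET(base_url, httr::timeout(10)),
    error = function(e) NULL
  )
  if (is.null(resp) || httr::http_error(resp)) {
    cli::cli_abort(c(
      "Can't connect to CKAN at {.url {base_url}}.",
      "i" = "Check your internet connection, or try again later."
    ))
  }
  invisible(TRUE)
}
```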