Closed IanHopkinson closed 4 days ago
@turnerm I added a `list` action to `scan` which replicates the `list` command but applies by default to all of HDX. It is more performant than I expected: <10s to show the values of 2 keys for all datasets (based on a cached version of the data). I expect it will run in >1s for you ;-)
Purpose
Version for this PR: 2024.8.2
The aim of this PR is to provide functionality for actions on all datasets in HDX. This uses the `package_search` endpoint in CKAN directly to download the whole catalogue. The actions supported are `survey`, which counts occurrences of a key; `distribution`, which calculates the distribution of values in a key; `list`, which lists the values of a key for each dataset (like the existing `list` command); and `delete_key`, which deletes a selected key. This final action was the original purpose of the update: to remove `resources._csrf_token` keys. It is limited to accepting only the `_csrf_token` and `extras` keys as arguments.

Under this PR some investigation was done into a false report of an "extras" key error. The check for the `extras` key when parsing the traceback is not specific enough and can mask other errors, and the test does not actually test for the error on the live site.
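The scan approach described above can be sketched roughly as follows. This is a minimal illustration and not the code in `ckan_utilities.py`: the function names, the dotted-key handling, the page size, and the injected `fetch_page` callable are all assumptions made for the example. It relies only on the documented shape of a CKAN `package_search` response (`result.count` and `result.results`) and its `start` and `rows` parameters. The `delete_key` sketch only edits a local dictionary; writing changes back to HDX is out of scope here.

```python
from collections import Counter
from typing import Any, Callable

# Assumption: mirrors the restriction described in the PR text.
ALLOWED_DELETE_KEYS = {"_csrf_token", "extras"}


def fetch_all_datasets(fetch_page: Callable[[int, int], dict], rows: int = 1000) -> list[dict]:
    """Page through package_search until every dataset has been collected.

    fetch_page(start, rows) is a hypothetical callable returning the decoded
    JSON of one package_search response, e.g. a thin wrapper around an HTTP
    GET to <ckan-site>/api/3/action/package_search.
    """
    datasets: list[dict] = []
    start = 0
    while True:
        result = fetch_page(start, rows)["result"]
        page = result["results"]
        datasets.extend(page)
        start += len(page)
        # Stop on an empty page or once the reported total has been reached.
        if not page or start >= result["count"]:
            return datasets


def get_values(dataset: dict, key: str) -> list[Any]:
    """Resolve a key such as "title" or "resources._csrf_token" to its values."""
    head, _, tail = key.partition(".")
    if head not in dataset:
        return []
    value = dataset[head]
    if not tail:
        return [value]
    children = value if isinstance(value, list) else [value]
    return [v for child in children if isinstance(child, dict) for v in get_values(child, tail)]


def survey(datasets: list[dict], key: str) -> int:
    """Count occurrences of a key across all datasets."""
    return sum(len(get_values(dataset, key)) for dataset in datasets)


def distribution(datasets: list[dict], key: str) -> Counter:
    """Count how often each value of a key occurs (values compared as strings)."""
    return Counter(str(v) for dataset in datasets for v in get_values(dataset, key))


def delete_key(dataset: dict, key: str) -> int:
    """Remove a key from a local dataset dict in place; return the number removed."""
    head, _, tail = key.partition(".")
    if key.rsplit(".", 1)[-1] not in ALLOWED_DELETE_KEYS:
        raise ValueError(f"delete_key only accepts {sorted(ALLOWED_DELETE_KEYS)}")
    if not tail:
        if head in dataset:
            del dataset[head]
            return 1
        return 0
    children = dataset.get(head, [])
    children = children if isinstance(children, list) else [children]
    return sum(delete_key(child, tail) for child in children if isinstance(child, dict))
```

Passing the page fetcher in as a callable keeps the paging logic testable against a cached copy of the catalogue without touching the live site.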
Major file changes
This PR adds the `ckan_utilities.py` and `test_ckan_utilities.py` files, which implement the scan functionality. The `DEMO.md` file is renamed to `USERGUIDE.md`, updated to include the new command, and reorganised to reflect the more mature status of the project.
Minor file changes
`tests/test_hdx_utilities_integration.py` appears to have changed a lot, but this is just the result of a change in line endings configuration. The new content in this file is the test function `test_get_hdx_url_and_key`, which tests the new `get_hdx_url_and_key` function.
Versioning
`hdx-cli-toolkit` uses the CalVer versioning scheme with format YYYY.MM.Micro, e.g. 2022.12.1, which is updated manually in `pyproject.toml`. The "Micro" component is simply an integer increased by 1 at each version, starting from 0.
`pyproject.toml` and PR description
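As an illustration of the YYYY.MM.Micro scheme described under Versioning, here is a small sketch that parses a version string and proposes the next one. The helper names `parse_version` and `next_version` are hypothetical, and the sketch assumes that "Micro" restarts at 0 when the year or month changes and that the month is written without zero padding (as in 2024.8.2). Neither assumption is confirmed by the project, where the version is updated manually.

```python
import datetime
import re

# YYYY.MM.Micro, month written without mandatory zero padding (e.g. 2024.8.2).
VERSION_PATTERN = re.compile(r"^(\d{4})\.(\d{1,2})\.(\d+)$")


def parse_version(version: str) -> tuple[int, int, int]:
    """Split a CalVer string into (year, month, micro), validating the month."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a YYYY.MM.Micro version: {version!r}")
    year, month, micro = (int(part) for part in match.groups())
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in version: {version!r}")
    return year, month, micro


def next_version(current: str, today: datetime.date) -> str:
    """Propose the next version for a release made on `today`.

    Assumption: Micro is incremented within the same year and month,
    and restarts at 0 otherwise.
    """
    year, month, micro = parse_version(current)
    if (year, month) == (today.year, today.month):
        return f"{year}.{month}.{micro + 1}"
    return f"{today.year}.{today.month}.0"
```

For example, `next_version("2024.8.2", datetime.date(2024, 8, 20))` gives `"2024.8.3"` under these assumptions.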