Open pdelboca opened 7 months ago
@wardi have you ever used ckanapi to do a dump of a portal? I'm trying to do a dump of https://datos.gob.ar/ but it is extremely slow and it also gets "blocked" after 250 datasets. (Blocked = doesn't write any output, no progress, nothing is happening.)
I'm trying to do:
ckanapi dump datasets --all --datapackages=./output_directory/ -r https://datos.gob.ar
@pdelboca we use it daily to create a history of our metadata for ~30k datasets. It's possible you're being throttled on the server side. dump datasets makes a separate package_show query for every dataset. You could try using search datasets instead, which paginates over package_search for fewer requests.
At the moment it's possible to resume an interrupted load but not an interrupted dump. Maybe that's needed if you are being throttled.
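For anyone who wants to try the paginated approach from Python rather than the CLI, here is a rough sketch of paging over package_search. The helper name iter_datasets and the page size of 100 are assumptions made for illustration and are not part of ckanapi. It relies only on the rows and start parameters and the count and results keys that the CKAN package_search action returns.

```python
from typing import Any, Callable, Dict, Iterator


def iter_datasets(
    package_search: Callable[..., Dict[str, Any]],
    page_size: int = 100,  # assumed page size; tune to what the server tolerates
) -> Iterator[Dict[str, Any]]:
    """Yield every dataset by paging through package_search results.

    `package_search` is any callable that accepts `rows` and `start`
    keyword arguments and returns a dict with `count` and `results`.
    """
    start = 0
    while True:
        page = package_search(rows=page_size, start=start)
        results = page["results"]
        if not results:
            break  # nothing more to read
        yield from results
        start += len(results)
        if start >= page["count"]:
            break  # all matching datasets have been seen


if __name__ == "__main__":
    # Requires `pip install ckanapi`; imported here so the helper above
    # stays usable without the package installed.
    from ckanapi import RemoteCKAN

    ckan = RemoteCKAN("https://datos.gob.ar")
    for dataset in iter_datasets(ckan.action.package_search):
        print(dataset["name"])
```

This issues one request per page instead of one per dataset, so a portal with a few thousand datasets needs only a few dozen calls. If the server is still throttling, a short sleep between pages may help.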
Hello!
I'm trying to do a dump of an instance but the package is throwing some errors. This PR is to fix whatever is appearing.
Problems when logging errors
TODO: See #209
KeyError: 'format'
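A KeyError like this is what Python raises when a dict is indexed with a key it does not contain. As a minimal sketch, assuming the error comes from a resource dict that has no format field (the thread does not confirm where it is raised), a defensive lookup could look like the following. The helper name resource_format is hypothetical and not part of ckanapi.

```python
from typing import Any, Dict


def resource_format(resource: Dict[str, Any], default: str = "") -> str:
    """Return the resource's format, or `default` if it is missing or empty.

    Hypothetical helper: uses dict.get so a resource without a
    'format' key does not raise KeyError.
    """
    return resource.get("format") or default
```

For example, resource_format({"url": "https://example.org/data.csv"}) returns an empty string instead of raising.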