IATI / IATI-Datastore

An open-source datastore for IATI data with RESTful web API providing XML, JSON, CSV plus ETL tools
http://datastore.iatistandard.org/
Other
1 stars 0 forks source link

Update base URLs for the Registry #307

Closed dalepotter closed 6 years ago

dalepotter commented 6 years ago

As part of the migration of the IATI Registry to a new supplier, the Registry is now hosted at https://iatiregistry.org with 301 redirects from URLs prepended with http and/or www.

However, the ckanapi package cannot handle 301 redirects, which caused ValidationError exceptions on the 301 Moved Permanently content that was returned. This commit updates the URLs so that expected content is returned.

coveralls commented 6 years ago

Coverage Status

Coverage remained the same at 49.309% when pulling b823c28a0ba265226ec8f1fa204adbe9ea922f95 on update-registry-base-urls into 841472a7702a790d3f6e7e211b723e0ed97a0719 on master.

andylolz commented 6 years ago

Right yes, I see…

Fix looks good to me.

There’s an instance in the README, too, though of course that’s not part of the bug.

dalepotter commented 6 years ago

There’s an instance in the README, too, though of course that’s not part of the bug.

Good spot @andylolz - I'm less worried about that one, though a browser will follow the redirect so it's not really causing any loss of functionality/page error. Over time, the link should probably be fixed for consistency though!

andylolz commented 6 years ago

Looks like the same issue has come up at least twice before: #244; #272. @dalepotter would you accept a PR to fix the underlying problem?

Also, I suppose this means that forcing no-www on iatiregistry.org was a breaking change. While it’s likely that most will follow 301s, I’m not totally sure if it’s safe to assume that other products (e.g. publishing tools) won’t fail in a similar way. In particular, POST requests are liable to fail.

dalepotter commented 6 years ago

@dalepotter would you accept a PR to fix the underlying problem?

What sort of thing are you thinking?

andylolz commented 6 years ago

What sort of thing are you thinking?

Upgrading to a more recent ckanapi version, and ensuring GET requests are always used.

ckanapi >=3.2 uses requests, and includes a get_only initialisation parameter that forces GET requests: https://github.com/ckan/ckanapi/blob/e7977ae43e5b8d4cb4f292020afb0794a46690dc/ckanapi/remoteckan.py#L31

Obviously it’s still preferable to avoid hitting lots and lots of redirects (which will ultimately slow things down!) so the fix in this PR is good. But adding this should prevent redirects from breaking everything.

It’s interesting that 5 years on from ckanapi 1.5, it still breaks by default when hitting a redirect (by design). I’m quite surprised about that. It does make me wonder if it’s worth mentioning on the IATI discuss post about the registry upgrade that the registry now strips the www, so API tools may need to be updated accordingly.

andylolz commented 6 years ago

It does make me wonder if it’s worth mentioning on the IATI discuss post about the registry upgrade that the registry now strips the www, so API tools may need to be updated accordingly.

I’ve just done this, because it seems like good practice to mention it (regardless of the fact it’s a bit of an edgecase.)