IATI / ckanext-iati

CKAN extension for the IATI Registry
http://iatiregistry.org

Decouple archiver from CKAN #313

Open andylolz opened 3 years ago

andylolz commented 3 years ago

This was discussed at the IATI technical audit in 2018. IIRC there was a firm recommendation from @davidmegginson (based on the experiences of HDX) that the registry archiver ought to be decoupled from CKAN, and interact via the CKAN API. My recollection is that there was unanimous agreement on this. It featured in the recommendations here: [Screenshot 2021-01-20 at 10 05 27]

Since then, however, work has continued on the registry archiver inside of CKAN. There are ongoing issues with this.

I’ve raised this before on IATI Discuss [1] [2] [3]. I’m not sure why this recommendation from the technical audit was never taken forward, but I think it would still be a worthwhile pursuit. This would open up the possibility of expanding the role of the archiver to capture and present more metadata, e.g. from publishers’ org files.

andylolz commented 3 years ago

Adding a reference to #322 here, as it also appears related.

PetyaKangalova commented 3 years ago

@andylolz apologies for the late response.

  1. On the specific list of issues you have flagged related to the archiver: at the moment there are 3 open issues. We have provided updates on each, covering what Derilinx have done to fix them or are in the process of fixing. If you have any questions, please comment directly on the issue.

Could you let us know what your specific concerns are and what isn't working that would be addressed by modularising the extension?

  2. To your point on “Focus on core CKAN capabilities scope. Build additional scripts to manage metadata as a service external to the Registry”

Following the tech audit we had discussions with David Megginson from HDX and their developers. They explained that HDX keeps the CKAN core intact and makes all changes in extensions, using forks of other extensions for customisations, or building in-house. They have added an interface on top of CKAN and extra metadata fields (with in-house extensions). The key conclusion from our discussion was that, in their experience, a lot of developer time was required for the build and maintenance of the CKAN product. The HDX team had three full-time developers working on this over a year, so the decision was made to monitor whether this work was really going to be necessary for IATI. Since the Tech Audit in 2018, IATI has worked closely with Derilinx, delivery of the Registry product has significantly improved, and issues around downtime and response times have been solved.

As you are aware, the 2020 Technical Stocktake recommended that IATI tools should be part of an integrated system design and sit behind a unified API gateway. This process will happen over time. The Stocktake blog post in December outlines this work. The work to improve the documentation of the Registry’s API has been completed, and we will make further improvements as needed.

andylolz commented 3 years ago

> Could you let us know what your specific concerns are and what isn't working that would be addressed by modularising the extension?

The registry archiver has certainly improved since the tech audit in 2018, thanks to Derilinx and you and your team. But the issues raised about the archiver still stand. Its job is to update registry metadata, and it does not do that reliably. This means the metadata it outputs is also unreliable. #309 is a perennial issue to that effect.

I’ve just run some tests, and have raised tickets for the metadata issues I’ve found. I suspect there may be other issues.

Again, this has massively improved since Derilinx took over. But there was a very clear and very good recommendation from the 2018 tech audit to decouple this one component into a microservice that interacts with the registry via the registry API, and it wasn’t followed. That’s unfortunate.

> The key conclusion from our discussion was that from their experience a lot of developer time was required for the build and maintenance of the CKAN product. The HDX team had three full time developers working on this over a year, so the decision was made to monitor if this work was really going to be necessary for IATI.

The tech audit reached exactly the same conclusion. I.e. don’t build a CKAN extension – build an external microservice that interacts via the CKAN API.
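
For illustration only, here is a minimal sketch of what an external archiver talking to the registry over the CKAN Action API might look like. It is not the existing archiver and not a proposed implementation. The function names (`call_action`, `download`, `archive_dataset`) are hypothetical, and it assumes the registry exposes CKAN's standard `resource_patch` action and that updating the core `hash` and `size` resource fields is enough to show the idea. SHA-1 is an arbitrary choice here, and the IATI-specific metadata the real archiver maintains is deliberately left out.

```python
import hashlib
import json
import urllib.request


def call_action(base_url, action, payload, api_key=None,
                opener=urllib.request.urlopen):
    """POST a JSON payload to a CKAN Action API endpoint and return its result."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    if api_key:
        # CKAN reads the API key or token from the Authorization header.
        request.add_header("Authorization", api_key)
    with opener(request) as response:
        body = json.load(response)
    if not body.get("success"):
        raise RuntimeError(f"CKAN action {action} failed: {body.get('error')}")
    return body["result"]


def download(url):
    """Fetch the raw bytes of a published file (hypothetical default fetcher)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def archive_dataset(base_url, api_key, dataset, fetch=download,
                    opener=urllib.request.urlopen):
    """Download each resource of a dataset and patch its hash and size."""
    results = []
    for resource in dataset.get("resources", []):
        if not resource.get("url"):
            continue  # nothing to download for this resource
        content = fetch(resource["url"])
        patch = {
            "id": resource["id"],
            "hash": hashlib.sha1(content).hexdigest(),
            "size": len(content),
        }
        results.append(
            call_action(base_url, "resource_patch", patch, api_key, opener)
        )
    return results
```

A service like this would run on its own schedule, page through datasets (for example with CKAN's `package_search` action), and call `archive_dataset` for each one, so a failure in the archiver would not touch the CKAN process itself. The `fetch` and `opener` parameters exist only so the sketch can be exercised without network access.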

> We have provided updates on the issues and what Derilinx have done to fix the issues or in the process of fixing the issues

Absolutely, yes – these updates are great and much appreciated.