open-contracting-archive / implementation-and-extensions

Proposed core changes and new extensions

Paraguay: Releases and records - publishing approaches #9

Closed timgdavies closed 5 years ago

timgdavies commented 9 years ago

We should work towards a short document describing the way in which data in Paraguay will be published.

Below is my understanding, based on conversations with @juanpane:

(1) Currently there is an API which provides endpoints for plans, tenders, awards, contracts and changes to contracts

(2) Yearly CSV data dumps are also provided

(3) The database behind the scenes maintains an audit log of all changes, but the API endpoints only expose the current state of each phase of a contracting process

There are a number of ways users could be given access to the data:

Current state releases and summary record

In this scenario, the API endpoints would return OCDS JSON in response to requests relating to a particular OCID. The JSON would be a single release, wrapped in a release package.

The endpoints would provide only the current information they hold (i.e. if a tender has been updated, only the latest state of the tender would be available at the tender endpoint for any given OCID).
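As a rough illustration of what "a single release, wrapped in a release package" could look like on the server side, here is a minimal Python sketch. The function name `wrap_release`, the sample tender fields and the release ID are hypothetical; the envelope keys `uri` and `publishedDate` mirror the record example further down, and `releases` is assumed to be the list that carries the release.

```python
from datetime import datetime, timezone


def wrap_release(release: dict, package_uri: str) -> dict:
    """Wrap one current-state release in a minimal package envelope."""
    return {
        "uri": package_uri,
        # Timestamp of when this package was generated, in UTC.
        "publishedDate": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        # Only the latest state is exposed, so the list holds one release.
        "releases": [release],
    }


# Hypothetical current state of a tender as held by the endpoint.
tender_release = {
    "ocid": "ocds-123456-abcdefg",
    "id": "ocds-123456-abcdefg-call",
    "tag": ["tender"],
    "tender": {"status": "active"},
}

package = wrap_release(
    tender_release, "http://example.com/api/call/ocds-123456-abcdefg"
)
```

Because the endpoint only knows the current state, every request for the same OCID yields a package with exactly one release, whatever the number of earlier updates.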

To allow users to discover all the release API endpoints they should hit, an endpoint returning a summary record package for each OCID could be provided. At a minimum, this would need to list the URIs of each relevant release for that OCID.

For example, a call to:

/api/record/ocds-123456-abcdefg

might return a package along the lines of:

{
  "uri": "http://example.com/api/record/ocds-123456-abcdefg",
  "publishedDate": "2015-09-14T17:51:31+00:00",
  "packages": [
    "http://example.com/api/planning/ocds-123456-abcdefg",
    "http://example.com/api/call/ocds-123456-abcdefg",
    "http://example.com/api/award/ocds-123456-abcdefg",
    "http://example.com/api/contract/ocds-123456-abcdefg"
  ],
  "...": "..."
}

This at least lets users know where to find the relevant data.
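From the consumer side, using such a summary record might look like the sketch below. It only parses a response body and lists the release endpoints to call next; the helper names `release_uris` and `phase_of` are hypothetical, and the assumption that the phase is the second-to-last path segment comes purely from the example URIs above.

```python
import json

# Body of a hypothetical response from /api/record/ocds-123456-abcdefg.
record_json = """
{
  "uri": "http://example.com/api/record/ocds-123456-abcdefg",
  "publishedDate": "2015-09-14T17:51:31+00:00",
  "packages": [
    "http://example.com/api/planning/ocds-123456-abcdefg",
    "http://example.com/api/call/ocds-123456-abcdefg",
    "http://example.com/api/award/ocds-123456-abcdefg",
    "http://example.com/api/contract/ocds-123456-abcdefg"
  ]
}
"""


def release_uris(record_package: dict) -> list:
    """Return the release package URIs listed in a summary record."""
    return list(record_package.get("packages", []))


def phase_of(uri: str) -> str:
    """Guess the phase from the URI path (assumes /api/<phase>/<ocid>)."""
    return uri.rstrip("/").split("/")[-2]


record = json.loads(record_json)
to_fetch = {phase_of(uri): uri for uri in release_uris(record)}
```

A client would then request each URI in `to_fetch` to obtain the current release for that phase.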

With this approach, it would be up to third parties to check the releases regularly and, if they wanted, to keep historical copies from which they could compile full records and versioned releases.
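One way a third party could keep those historical copies is to store a new snapshot only when the content of a release has actually changed since the last check. The following is a minimal sketch under that assumption; `fingerprint` and `archive_if_changed` are hypothetical names, and change detection by hashing canonical JSON is an illustrative choice, not something the API prescribes.

```python
import copy
import hashlib
import json


def fingerprint(release: dict) -> str:
    """Hash a release in a key-order-independent way."""
    canonical = json.dumps(release, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def archive_if_changed(history: list, release: dict) -> bool:
    """Append a snapshot to history unless it matches the latest one."""
    digest = fingerprint(release)
    if history and history[-1]["digest"] == digest:
        return False  # Nothing changed since the last check.
    # Deep-copy so later mutation of `release` cannot alter the archive.
    history.append({"digest": digest, "release": copy.deepcopy(release)})
    return True


# Simulated results of three successive polls of the same endpoint.
history = []
first = archive_if_changed(history, {"tender": {"status": "planned"}})
second = archive_if_changed(history, {"tender": {"status": "planned"}})
third = archive_if_changed(history, {"tender": {"status": "active"}})
```

Any change that happens and is then overwritten between two polls is still lost, so the polling interval bounds how complete the reconstructed history can be.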

Improvements on this model would include:

Changing the release ID whenever the underlying data changes. I.e. each time the underlying database changes the information for a call, the release ID at the call endpoint should update, perhaps by incrementing a / suffix on the end of the ID.
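The suffix idea could be sketched as follows. This assumes the suffix is a decimal counter placed after the final "/" of the release ID, which is only one possible reading of the suggestion; `next_release_id` and the sample ID are hypothetical.

```python
def next_release_id(release_id: str) -> str:
    """Return the ID to use after the underlying data has changed."""
    base, sep, suffix = release_id.rpartition("/")
    if sep and suffix.isdecimal():
        # The ID already carries a counter: bump it.
        return f"{base}/{int(suffix) + 1}"
    # No counter yet: start one.
    return f"{release_id}/1"


release_id = "ocds-123456-abcdefg-call"
after_first_change = next_release_id(release_id)
after_second_change = next_release_id(after_first_change)
```

With this scheme, a consumer comparing the ID it last saw against the one currently served can tell that something changed without diffing the release body.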

Bulk release and record packages

All the releases can be periodically dumped out into a large release package, representing all those processes with changes in a given year, month or week.

There is an implementation issue with this: if a process changes multiple times within the period covered by the export, only the release reflecting its state at the point the data dump is created will be captured.

Alternative approaches being explored in some contexts are to trigger the write-out of a release file whenever the underlying database changes, and then to compile this collection of releases periodically.
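The difference between the two approaches can be shown with a small in-memory sketch. `ReleaseLog` is a hypothetical stand-in for whatever mechanism writes out a release on each database change; the dates and data are made up, and changes are assumed to arrive in chronological order.

```python
from datetime import datetime


class ReleaseLog:
    """Hypothetical append-only store: one release per database change."""

    def __init__(self):
        self._releases = []

    def on_change(self, ocid: str, date: str, data: dict) -> None:
        # Would be called by a database trigger or hook on every change.
        self._releases.append(
            {"ocid": ocid, "date": datetime.fromisoformat(date), "data": data}
        )

    def compile_package(self, start: str, end: str) -> list:
        """Every release dated in [start, end): the write-on-change approach."""
        lo, hi = datetime.fromisoformat(start), datetime.fromisoformat(end)
        return [r for r in self._releases if lo <= r["date"] < hi]

    def snapshot(self, end: str) -> list:
        """Latest state per OCID before `end`: what a periodic dump captures."""
        hi = datetime.fromisoformat(end)
        latest = {}
        for r in self._releases:
            if r["date"] < hi:
                latest[r["ocid"]] = r  # Later changes overwrite earlier ones.
        return list(latest.values())


log = ReleaseLog()
log.on_change("ocds-123456-abcdefg", "2015-09-03T10:00:00+00:00", {"status": "planned"})
log.on_change("ocds-123456-abcdefg", "2015-09-20T10:00:00+00:00", {"status": "active"})
log.on_change("ocds-123456-abcdefg", "2015-10-02T10:00:00+00:00", {"status": "complete"})

september = log.compile_package("2015-09-01T00:00:00+00:00", "2015-10-01T00:00:00+00:00")
dump = log.snapshot("2015-10-01T00:00:00+00:00")
```

Here `september` holds both September changes, while `dump` holds only the state as of the end of the month, which is the loss described above.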

User experience

ToDo: write out brief user stories on how different kinds of users would access the data.

@juanpane Can you check my descriptions above - and let me know if I've captured what was discussed or not?