rufuspollock closed 11 years ago
This needs to be discussed further according to call today. Will review within IATI group and decide on business logic.
The main use case envisaged was consuming tools which wanted to fetch any activities which had changed since they last refreshed their data.
For example:
The last_updated_datetime property of individual activities is neither universally applied nor reliable. The last_modified of parent files would not work for the above use cases. Therefore this is a property that the store will need to work out for itself, and its semantics should be "when did the store first see this activity; or when did the store last see a change in this activity".
The last_changed date is primarily required as an index for querying (essential). However, it would also be useful to include it in output (desirable). When storing it in an XML blob, or including it in output, it could be placed within a custom namespace, e.g. 'store:last_changed', to separate it from the raw XML.
Note that because some producing applications set last_updated_datetime as the time the IATI file was generated by an API, this is not a reliable measure at all of when the activity was last changed.
Other applications have addressed this by performing a string comparison of a stored iati-activity element and an incoming iati-activity element with the last_updated_datetime field removed (using a regex / xml dom).
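The comparison described above could be sketched roughly as follows, using an XML DOM rather than a regex. This is only an illustration, not the code any of those applications use: the function names are hypothetical, and it assumes the timestamp is carried as an attribute on the iati-activity element (the attribute name in `VOLATILE_ATTR` is an assumption).

```python
import xml.etree.ElementTree as ET

# Assumption: the generated timestamp is an attribute on <iati-activity>.
VOLATILE_ATTR = "last-updated-datetime"


def normalised(activity_xml: str) -> str:
    """Serialise an activity with the volatile timestamp attribute removed."""
    root = ET.fromstring(activity_xml)
    root.attrib.pop(VOLATILE_ATTR, None)  # ignore if the attribute is absent
    return ET.tostring(root, encoding="unicode")


def has_changed(stored_xml: str, incoming_xml: str) -> bool:
    """True if the two activities differ in anything other than the timestamp."""
    return normalised(stored_xml) != normalised(incoming_xml)
```

Note this is still a string comparison after re-serialisation, so differences in attribute order or whitespace between two otherwise identical activities would count as a change.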
Use case is:
User wants to fetch all activities that have changed since they last looked for data.
A user may use this in combination with a call to look for deleted activities to get their own system in sync with IATI data without doing a full refresh from the registry / the data store.
@practicalparticipation, We're storing the XML blobs, so the code just takes a hash of the existing and new activity for each resource and compares the hash string and keeps the old 'last_change' date if they match. I've added a filter 'last_change' and added it to the json output for now.
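For anyone following along, the hash comparison described in that comment might look something like the sketch below. It is not the datastore's actual code: the function names, the choice of SHA-256, and the way the stored blob and date are passed in are all assumptions made for illustration.

```python
import hashlib
from datetime import datetime, timezone


def activity_hash(xml_blob: str) -> str:
    """Hash of the raw activity XML (SHA-256 chosen here as an assumption)."""
    return hashlib.sha256(xml_blob.encode("utf-8")).hexdigest()


def next_last_change(stored_blob, stored_last_change, incoming_blob, now=None):
    """Return the last_change value to store for the incoming activity.

    - activity not seen before: use the current time
    - hash unchanged: keep the old last_change date
    - hash differs: use the current time
    """
    now = now or datetime.now(timezone.utc)
    if stored_blob is None:
        return now
    if activity_hash(stored_blob) == activity_hash(incoming_blob):
        return stored_last_change
    return now
```

Because the hash is taken over the whole blob, a producer that regenerates a timestamp inside the activity on every export would still register as a change each time.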
Great. And as we're doing this at the activity level I think it avoids the need to remove any generated date-time information (found in the
I've just checked and assume this isn't live yet. Will look out for when tagged for test.
It's currently live, but most of the activities will probably have an inaccurate last changed date as they were parsed prior to this being added.
Trying with http://iati-datastore.herokuapp.com/api/1/access/activity?last-change__gt=2013-06-01 I get a 'bad filter' - and then, bizarrely, in Chrome I get told that the page contains elements common on phishing sites...
Can you post example URL to test with?
@practicalparticipation, I didn't add the last-change filters to the validation stage before the filtering. I've added it in the latest commit and the URL you posted should be working.
The Chrome phishing warning is strange indeed. I'll have a poke around to see if I can find out what is triggering it.
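A consuming tool could build the "changed since" query along these lines. This is a minimal sketch: the base URL and the `last-change__gt` parameter are taken from the URL posted above, while the helper name is hypothetical and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint as posted earlier in this thread.
BASE = "http://iati-datastore.herokuapp.com/api/1/access/activity"


def changed_since_url(since_date: str) -> str:
    """Build a query URL for activities whose last-change is after since_date."""
    return BASE + "?" + urlencode({"last-change__gt": since_date})


print(changed_since_url("2013-06-01"))
```

A client would store the date of its last successful fetch and pass it as `since_date` on the next run.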
This appears to be working - great.
Ditto
last_changed = last_updated_datetime || last_modified of parent file (to be discussed further)