codeforIATI / iatikit

🐨 A toolkit for using IATI data
https://iatikit.readthedocs.io
MIT License
6 stars 0 forks source link

Given an IATI (activity) identifier, find the activity (efficiently!) #3

Open andylolz opened 5 years ago

andylolz commented 5 years ago

Doing this efficiently means leaning on the rule that all activity IDs must be prefixed by the publisher’s org ID.

E.g.:

import iatikit

activity_id = 'GB-COH-07676886-999999'

activity = iatikit.activities().find(iati_identifier=activity_id)

if activity:
    print('success! Found activity with IATI identifier: "{}"'.format(activity_id))
else:
    print('failed! Could not find an activity with IATI identifier: "{}"'.format(activity_id))

# failed! Could not find an activity with IATI identifier: "GB-COH-07676886-999999"
andylolz commented 5 years ago

Obviously this sort of thing is already possible:

for p in iatikit.publishers():
    activity = p.activities.find(iati_identifier=activity_id)
    if activity:
        break

This would be the way to do it inefficiently. I’m looking for the quick way!

andylolz commented 5 years ago

Done in d32bc5210063939694f223223245b1afdea0fbd0.

andylolz commented 5 years ago

So, this now works, and is quite fast:

import iatikit

activity_id = 'GB-COH-07676886-999999'

try:
    activity = iatikit.data().activities.find(iati_identifier=activity_id, fast=True)
    print('success! Found activity with IATI identifier: "{}"'.format(activity_id))
except IndexError:
    print('failed! Could not find an activity with IATI identifier: "{}"'.format(activity_id))

# failed! Could not find an activity with IATI identifier: "GB-COH-07676886-999999"
andylolz commented 5 years ago

Reopening because I removed this feature. It was experimental and had some problems.

andylolz commented 5 years ago

One of the problems here is: publishers can change their org ID, but they don’t declare this at org level (either in their org metadata or in their org file). So it’s non-trivial to do a fast lookup for activities that use old org IDs.