NRGI / resourcedata.org


User story: Provide update to users when new reporting years have been added at country level #139

Closed anderspeders closed 6 years ago

anderspeders commented 7 years ago

User story:

As a data user for Guinea and Ghana, I would like to know when new data has been added for these specific countries so that I can rerun my analysis or conduct new analysis with the more recent figures.

An alert should follow once a new [year] has been added for a specific [country]. The alert could be sent by email, RSS feed or another channel.

Now that we have the metadata added for additions to rows (e.g. year 2014 added for Afghanistan), it should be possible to generate some type of alert for this.

Maybe worth pinging the CKAN developer list for this?

What

Notes

mattfullerton commented 7 years ago

To summarize, this would be our plan for implementation:

mattfullerton commented 7 years ago

De-milestoning while @anderspeders clarifies demand for this feature

mattfullerton commented 7 years ago

Comment from Anders on Slack:

Thinking about a country-focused monthly email: existing and new extractives data for, for example, Colombia

As a user I would like to know:

t-morrison commented 7 years ago

I would like to explore the simplest implementation of this request. I am thinking that would be:

Is this the easiest way to start this? I can envision a fairly simple R solution. What would be involved for you to implement as part of the server @mattfullerton (in your own way, not something I did in R)? Or do you have a better idea?

mattfullerton commented 7 years ago

If we go at it this way (which is also what we were originally thinking), the service can be any script, even R (there's Rscript for running things like this). Your idea with the timestamps has the big advantage that we don't need to track changes, but just ask the data.
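As a rough illustration of "just ask the data", the sketch below picks out country-year pairs whose earliest row was added after a given date. It is only a sketch under assumptions: the field names `country`, `year` and `added`, and the sample rows, are hypothetical and not taken from the actual EITI tables.

```python
from datetime import datetime

def new_country_years(rows, since):
    """Return sorted (country, year) pairs first added after `since`."""
    first_seen = {}
    for row in rows:
        key = (row["country"], row["year"])
        added = datetime.fromisoformat(row["added"])  # hypothetical timestamp column
        # Keep the earliest timestamp seen for each country-year pair.
        if key not in first_seen or added < first_seen[key]:
            first_seen[key] = added
    return sorted(key for key, added in first_seen.items() if added > since)

# Hypothetical sample rows; real data would come from the complete dataset.
rows = [
    {"country": "Afghanistan", "year": 2013, "added": "2017-01-10T00:00:00"},
    {"country": "Afghanistan", "year": 2014, "added": "2017-03-06T00:00:00"},
    {"country": "Ghana", "year": 2014, "added": "2017-03-09T00:00:00"},
    {"country": "Ghana", "year": 2014, "added": "2017-02-01T00:00:00"},
]
since = datetime(2017, 3, 1)
result = new_country_years(rows, since)
print(result)
```

Because a pair only counts as new when its earliest row is newer than the cutoff, later edits to an existing country-year would not trigger an alert in this sketch.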

However, if we stay within CKAN, what I'm wondering is do we achieve exactly the same functionality if we let people know any time the resource has been updated; given we only update it when rows have actually changed? I'm not sure that functionality is in there, but it is conformant with the idea of CKAN notifications - when something changes, subscribers get notified, and it would save us a pile of overhead running scripts, managing email subscriptions etc. We can pair this with the idea of datastore views, where we could even create "virtual" resources for countries and years that don't exist yet.

The middle road is to use ckanext-hooks which will trigger some external service when something happens; i.e. resource gets updated, external service (again, could be almost anything but preferably something that is good for writing APIs/Web services) checks whether what happened is relevant for a subscriber, logs it and includes it in e.g. a weekly or monthly update email for that person.
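A minimal sketch of what the external service's relevance check might look like follows. The event payload shape, the subscription store and the function name are all hypothetical; this does not reflect the actual payload sent by any CKAN extension.

```python
from collections import defaultdict

def record_if_relevant(event, subscriptions, pending):
    """Queue `event` for each subscriber who follows the event's country."""
    for email, countries in subscriptions.items():
        if event["country"] in countries:
            pending[email].append(event)
    return pending

# Hypothetical subscribers and the countries they care about.
subscriptions = {
    "analyst@example.org": {"Ghana", "Guinea"},
    "other@example.org": {"Colombia"},
}
pending = defaultdict(list)  # email -> events to include in the next digest

# Hypothetical "resource updated" events.
record_if_relevant({"resource": "complete-table", "country": "Ghana", "year": 2014},
                   subscriptions, pending)
record_if_relevant({"resource": "complete-table", "country": "Norway", "year": 2016},
                   subscriptions, pending)
print(dict(pending))
```

A weekly or monthly job would then send one email per key in `pending` and clear the queue.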

t-morrison commented 7 years ago

I don't quite follow the second option there. To clarify, I'm not suggesting we have an option for someone to only get an update for, say, the Nigeria dataset; it would be all or nothing (essentially tracking only the Complete dataset). We could script "XXX new rows for Nigeria since MM/DD/YY", for example. This would reduce overhead, yes? Or were you thinking the same already?

One concern on notifying on any changes: looking at the data, there were changes March 3rd, 6th, 9th, 11th and 13th this year, as one example. It would be too much to send a notification each time.

Can you briefly describe the process for managing subscribers for my suggestion and yours above? What is easier with your option ("would save us a pile of overhead running scripts, managing email subscriptions etc")?

mattfullerton commented 6 years ago

@moman822 To (finally) answer your question, the work/maintenance saved would be through using CKAN's existing "follow" functionality: CKAN manages subscriptions and sends the email.

I will look at how far we get with notifications on resource updates in the EITI data

mattfullerton commented 6 years ago

Quick update: I've tested the standard CKAN alerting with EITI resources. The result was that I needed to add an explicit field to the resources to mark when they've been updated. In my opinion CKAN should be doing this, but it neither updates the right field (https://github.com/ckan/ckan/issues/3907) nor triggers an activity when the file is changed, which may well be because no field is updated.

We have the flexibility to set up how often alerts are sent by email (e.g. once a week) and for how long back the alert email looks for events of interest to the user. Obviously these two should be coordinated.

The alert email should(?) also be extended to show what has changed. Otherwise the user first has to go to their list of activities, which may also be a little overwhelming. What I would consider doing is parsing the list of things that have changed (i.e. 6 EITI datasets changed 3 times) and consolidating them to say what has changed during the entire period.
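The consolidation step could be sketched roughly as below. The activity records and their `dataset` and `when` fields are hypothetical stand-ins, not CKAN's actual activity stream format.

```python
from collections import Counter

def consolidate(activities):
    """Collapse repeated change events into one summary line per dataset."""
    counts = Counter(activity["dataset"] for activity in activities)
    return [
        f"{dataset}: {n} change{'s' if n != 1 else ''}"
        for dataset, n in sorted(counts.items())
    ]

# Hypothetical activity records collected over one alert period.
activities = [
    {"dataset": "eiti-summary-data-table-for-norway", "when": "2018-03-03"},
    {"dataset": "eiti-complete-summary-table", "when": "2018-03-06"},
    {"dataset": "eiti-complete-summary-table", "when": "2018-03-09"},
]
summary = consolidate(activities)
print("\n".join(summary))
```

The email body would then carry these summary lines rather than a bare link to the activity list.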

We could test this for a while with just subscribing to the complete dataset. There we ought to get fairly frequent changes.

A next step would then be to extend CKAN to allow "following" searches - i.e. anything and everything about Nigeria.

Thoughts welcome

t-morrison commented 6 years ago

Implementing on the complete dataset to start would be great. For the timeframe, weekly alerts would be fine. I agree that it should show what has changed.

From your example "(i.e. 6 EITI datasets changed 3 times)" - this would be more than following a single dataset? So more of an entire site follow for any and all changes?

mattfullerton commented 6 years ago

The example was based on the idea that anyone could follow any dataset (in fact, that is standard). What is coming in the email alert by default is just a message saying that something changed and the user can look at the activity list - I think including some content on what changed is important for this use case.

t-morrison commented 6 years ago

@mattfullerton can you give some detail on how this alert will be implemented from a user perspective?

Will they need to have registered an account or can they just be prompted to enter an email address? Will this happen using the "follow" button on the resource page (e.g. here https://www.resourcedata.org/dataset/eiti-complete-summary-table)?

mattfullerton commented 6 years ago

@moman822 Users need to register and enable the sending of email alerts in their profile. They need to follow a dataset or organisation. What's particularly new is that they can also click to follow a [faceted] search.

mattfullerton commented 6 years ago

The major work on this is now complete.

It works as follows and can be tested on staging:

Note: users need to be real CKAN users, and need to allow emails to be sent under the "Manage" button under their profile (https://staging.resourcedata.org/user/edit)

Remaining TODOs:

Minor TODOs:

t-morrison commented 6 years ago

@mattfullerton @deirdrelee I'm following up on Deirdre's email and the above here with some questions/comments:

mattfullerton commented 6 years ago

@moman822

If I follow multiple datasets, and they both change, will I receive separate emails?

No. All dataset/group/organization activity is summarized in one email, and the email only provides a link to the site where the activity is listed. You also won't get emailed if you've looked at the activity list. The saved search email is a separate email.

I have followed the EITI complete dataset and have received emails today and yesterday- are these legitimate changes to the data or just some test process that is set up?

They are legitimate in that the EITI harvester runs every day, and for testing purposes we also have search-checking/email-sending running every hour (the idea would be every week). We may need to take a closer look, though, at the contents of the file from one harvest to the next, to check that it's not just a line-ordering change or similar.

What is the possibility for alerting to specific changes in the EITI complete dataset, e.g. new country-year added?

You can either follow the individual dataset (e.g. https://staging.resourcedata.org/dataset/eiti-summary-data-table-for-norway) or create a search and follow that (e.g. https://staging.resourcedata.org/dataset?_country_limit=0&country=Norway), or if you're a pro user you could create a saved search that expresses the data you want to be there but isn't yet (e.g. https://staging.resourcedata.org/dataset?_country_limit=0&country=Norway&year=2016).
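To illustrate how a followed search for data that "isn't there yet" could produce an alert, here is a rough sketch: it reads the facet filters out of a saved search URL and reports datasets that match now but were not seen on the previous check. The function names and the dataset records are hypothetical, and this is not the implementation running on staging.

```python
from urllib.parse import urlparse, parse_qs

def search_filters(url):
    """Extract facet filters from a search URL, skipping '_'-prefixed UI parameters."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items() if not key.startswith("_")}

def new_matches(datasets, filters, seen_ids):
    """Return ids of datasets that match every filter and were not seen before."""
    return [
        d["id"] for d in datasets
        if d["id"] not in seen_ids
        and all(str(d.get(key)) == value for key, value in filters.items())
    ]

url = "https://staging.resourcedata.org/dataset?_country_limit=0&country=Norway&year=2016"
filters = search_filters(url)

# Hypothetical dataset records standing in for real search results.
datasets = [
    {"id": "norway-2015", "country": "Norway", "year": 2015},
    {"id": "norway-2016", "country": "Norway", "year": 2016},
    {"id": "ghana-2016", "country": "Ghana", "year": 2016},
]
print(new_matches(datasets, filters, seen_ids=set()))
```

Storing the matched ids after each check means a dataset only triggers an alert the first time it appears in the search results.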

Can we provide an initial email when following, something like: "You have subscribed to receive updates on the following [dataset/facet search]: XXXXX..."

Definitely possible, but I'm a bit unsure of the value if we also provide a listing of what searches have been saved (which I think we should do, along with a delete button). And then there is the question of when to send it. If it's triggered by every saved search, a user might end up with 3 or 4 emails from one browsing session. But maybe that's OK?

Can we add a brief explainer popup/hover for the follow button?

Yes, have added to TODO list above

mattfullerton commented 6 years ago

This is more or less done from our side; new version including the UI (on the user page beside Activity Stream) is being pushed to staging now.

anderspeders commented 6 years ago

@moman822 Are we at a stage where we could send a test round to a few colleagues from the staging environment - or does it need a full deployment in order to be tested?

t-morrison commented 6 years ago

Following the discussion today:

anderspeders commented 6 years ago

@EricSoroos Could you please provide an update here?

EricSoroos commented 6 years ago

Closing based on the merge of #216