IFRCGo / go-api

MIT License

Time Machine - Historical database snapshots #1003

Open tovari opened 3 years ago

tovari commented 3 years ago

This functionality would take a daily snapshot of the important operational data in the database. The data would be available to users through an API endpoint.

The following attributes should be saved:

  1. Appeal coverage
  2. Deployments active by day
  3. 3W (see projects' data over time)
  4. Daily user logins (frontend, admin)
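
A minimal sketch of what the daily snapshot could look like, assuming each attribute above can be computed by a small collector function. `SnapshotStore` and the collector names are hypothetical and stand in for whatever table and queries go-api would actually use:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Dict, Optional, Tuple

# Hypothetical: each collector returns the current value of one metric.
Collectors = Dict[str, Callable[[], float]]


@dataclass
class SnapshotStore:
    """In-memory stand-in for a snapshot table keyed by (day, metric)."""

    rows: Dict[Tuple[date, str], float] = field(default_factory=dict)

    def take_snapshot(self, day: date, collectors: Collectors) -> None:
        # Writing by key means a re-run on the same day overwrites
        # the earlier value instead of creating a duplicate row.
        for name, collect in collectors.items():
            self.rows[(day, name)] = collect()

    def get(self, day: date, metric: str) -> Optional[float]:
        # Returns None when no snapshot exists for that day and metric.
        return self.rows.get((day, metric))
```

Keying on `(day, metric)` keeps the job safe to re-run if a scheduled run fails halfway through.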

JonathanGarro commented 3 years ago

@tovari just to flesh this list out, here are the attributes I'd like to see over time:

  1. Appeal coverage
  2. Deployments active by day
  3. 3W (see projects' data over time)

jhenshall commented 3 years ago

Sounds good @tovari. Just to flag that I assume this will also capture imperfections in the data (e.g. appeals not linked to emergencies)? Stresses the importance of timely gardening and data cleaning - so that the 'look back' is to the best data possible.

tovari commented 3 years ago

Good point @jhenshall, we should consider this when planning the database relations and I hope we can avoid these problems.

tovari commented 3 years ago

I tried to detail what we would need. Would you mind checking these? I'm not sure about a few of the points in particular.

• Appeal coverage
    ○ Number of beneficiaries
    ○ Amount requested
    ○ Amount funded
    ○ Last modified
• Global and regional key metrics:
    ○ Active DREF Operations
    ○ Active Emergency Appeals
    ○ Funding requirements
    ○ Funding coverage
    ○ Targeted population

• Deployments
    Aggregated active deployments:
    ○ Deployed ERUs by type
    ○ Deployed RR by NS
    ○ Deployed Heops
Does it make sense to save the complete list of personnel deployments? Probably not, since it can be taken from the deployments data. The goal here is rather to have easy-to-use statistics, right?

• 3W data:
    ○ Do we need details for each running project, like # of people reached or budget?
    ○ Or we only need aggregated data per country or by region?
• User logins:
    ○ Daily frontend logins: username, NS (might change over time)
    ○ Daily admin site logins: same details
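
For the aggregated deployment figures in the list above, something along these lines might be enough. This is only a sketch: the `start`, `end` and `eru_type` keys are assumptions for illustration, not the actual field names in the deployments models.

```python
from collections import Counter
from datetime import date


def active_on(deployments, day):
    """Return the deployments that were active on the given day.

    A deployment with end=None is treated as still ongoing.
    """
    return [
        d for d in deployments
        if d["start"] <= day and (d["end"] is None or d["end"] >= day)
    ]


def erus_by_type(deployments, day):
    """Count the ERUs active on the given day, grouped by ERU type."""
    return Counter(d["eru_type"] for d in active_on(deployments, day))
```

Storing only these counts per day would give the easy-to-use statistics, while the full personnel list stays in the deployments data.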

JonathanGarro commented 3 years ago

Appeal coverage

Global and regional key metrics

Deployments

> Does it make sense to save the complete list of personnel deployment? - probably not, it can be taken from the deployments data. The goal here is rather having easy to use statistics, right?

3W data

> Do we need details for each running project like # of people reached, or budget? Or we only need aggregated data per country or by region?

User logins

  • Daily frontend logins: username, NS (might change over time)
  • Daily admin site logins: same details

I think these both make sense to track, though I want to highlight that we're having discussions about tracking user engagement on a deeper level than just logins. Depending on how those play out, we might want to add more points to this list. Side question: we don't have a good way of aggregating and reporting against user roles, right? I think that when you register, the role is just a plain-text field, but we might want to consider adding a dropdown with role type (e.g. sector advisor, finance, IM, operations, etc).

nanometrenat commented 3 years ago

Definitely worth thinking carefully as to how many of the above can/should be covered by audit trail/history vs 'snapshot' approach. e.g. changed appeal end dates should presumably come from audit trail for that appeal rather than from a snapshot?
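
To illustrate the audit-trail side of this, here is a rough sketch of recovering an appeal's end date as of a given day by replaying its change log. The `(changed_on, new_value)` shape of the log is an assumption for illustration only:

```python
from datetime import date


def value_at(initial, changes, when):
    """Replay a change log to find the value in effect on `when`.

    `changes` is a list of (changed_on, new_value) pairs in any order.
    A change made on `when` itself counts as already applied.
    """
    value = initial
    for changed_on, new_value in sorted(changes):
        if changed_on > when:
            break
        value = new_value
    return value
```

With a log like this no daily copy is needed for the attribute, but every change to it has to be recorded reliably.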

nanometrenat commented 3 years ago

Also, when talking about user logins (to either front office or back office) it's worth considering whether it's the 'log in' process you want to track, or the fact that that user has accessed GO. Lots of users (like me) save their credentials so don't visit the login screen very often. xref https://github.com/IFRCGo/go-frontend/issues/956#issuecomment-670468239 from conversations with Gergely etc. Either way, suspect a 'snapshot' isn't as useful for this as saving the full history of the user...
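
As a sketch of the 'has accessed GO' option, activity could be recorded on every authenticated request rather than at the login screen. `ActivityLog` and its methods are hypothetical names, and where the hook would live in the real stack is left open:

```python
from collections import defaultdict
from datetime import datetime


class ActivityLog:
    """Tracks per-user last activity and the set of users seen each day."""

    def __init__(self):
        self.last_seen = {}
        self.active_by_day = defaultdict(set)

    def record_request(self, username, when):
        # Keep the most recent timestamp even if requests arrive out of order.
        previous = self.last_seen.get(username, when)
        self.last_seen[username] = max(previous, when)
        self.active_by_day[when.date()].add(username)

    def daily_active(self, day):
        """Number of distinct users seen on the given day."""
        return len(self.active_by_day.get(day, ()))
```

This counts users with saved credentials, who rarely pass through the login form.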

tovari commented 3 years ago

The business request was for functionality that lets users check the GO data as of a given date in the past (e.g. funding coverage on 31.12.2020). For this purpose, I think the 'snapshot' approach would work well. If we are also interested in when a certain attribute was changed, then we need the audit trail/history as well. So I'm wondering: which of the above attributes need history? We already have data history for the 3W data (and 3W projects can be reverted to previous states), but where else could that be important? History data would need API endpoints as well if we want to analyze it. Regarding user logins, I agree: we should save the date of the last user activity on the frontend rather than the login date.
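
A minimal sketch of the 'data on a given date' lookup, assuming the snapshots for one metric are kept as a date-sorted list of `(day, value)` pairs. The function name and data shape are illustrative only:

```python
from bisect import bisect_right
from datetime import date


def snapshot_as_of(snapshots, day):
    """Return the latest (day, value) snapshot taken on or before `day`.

    `snapshots` must be sorted by date. Returns None if `day` is
    earlier than the first snapshot.
    """
    days = [snap_day for snap_day, _ in snapshots]
    index = bisect_right(days, day)
    return snapshots[index - 1] if index else None
```

Falling back to the most recent earlier snapshot means a missed daily run leaves the lookup working, at the cost of returning a slightly older value.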

nanometrenat commented 3 years ago

@JonathanGarro one thing re Appeal Coverage - are you meaning Appeals only or Appeals and DREFs? In particular I'm not sure how GO handles (or whether in fact it differentiates) between DREF grants and DREF loans. DREFs always show as 100% funded as they come out of the DREF pot, but it's just that some need to be repaid to the DREF pot and some don't.

JonathanGarro commented 3 years ago

> @JonathanGarro one thing re Appeal Coverage - are you meaning Appeals only or Appeals and DREFs? In particular I'm not sure how GO handles (or whether in fact it differentiates) between DREF grants and DREF loans. DREFs always show as 100% funded as they come out of the DREF pot, but it's just that some need to be repaid to the DREF pot and some don't.

@nanometrenat just EAs
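
Roughly, the coverage figure restricted to Emergency Appeals could be computed like this. It is a sketch only: the `type`, `requested` and `funded` keys and the `"EA"` / `"DREF"` labels are assumptions, not the real Appeal model.

```python
def appeal_coverage(appeals):
    """Funded / requested across Emergency Appeals only.

    DREFs are skipped because they always show as 100% funded.
    Returns None when nothing has been requested.
    """
    emergency_appeals = [a for a in appeals if a["type"] == "EA"]
    requested = sum(a["requested"] for a in emergency_appeals)
    funded = sum(a["funded"] for a in emergency_appeals)
    return funded / requested if requested else None
```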