digital-land / submit


Spike: Ensure issues are gathered from all active resources #600

Closed GeorgeGoodall-GovUk closed 3 days ago

GeorgeGoodall-GovUk commented 3 weeks ago

Problem statement

At the moment we only look at the latest resource when getting our issues. We need to ensure we look at all active resources instead.

Background

The Issues table now logs entities against each issue! As entities are constructed from only the most recent facts across all active resources, we can use this to better obtain and display the relevant issues. We will likely still need to query the endpoints to obtain endpoint errors and update dates.

Spike Goals

Findings

Issue to entity link

Though this is extremely helpful, on the face of it there's no way to know whether an issue has been resolved in a later resource. For example, this query shows multiple issues for the same entity and field. How do we know which issue is the most recent one? What if an upload exists that fixes the issue? We need some way of knowing whether an issue is still relevant.

It feels like the end_date column should do this, but it appears to be blank in all rows. Ideally it would be populated when the issue is fixed in a later file. Without that, we will have to do some complex searching of the facts and issues for a set of resources that we pass in, statically prioritized by date.
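Until end_date is populated, the fallback described above could be sketched roughly like this. It is only an illustration, not the project's code: the `Fact` and `Issue` records, their field names, and `relevant_issues` are all hypothetical. It assumes each resource has a comparable start date, and that a resource which raises an issue for a field also records a fact for it (which may not hold for values that are rejected outright).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    resource: str  # hypothetical: resource the fact came from
    entity: int
    field: str


@dataclass(frozen=True)
class Issue:
    resource: str  # hypothetical: resource the issue was raised against
    entity: int
    field: str
    issue_type: str


def relevant_issues(resource_dates, facts, issues):
    """Keep only issues raised by the newest resource that supplies a fact
    for the same (entity, field)."""
    latest = {}  # (entity, field) -> newest resource supplying a fact
    for fact in facts:
        key = (fact.entity, fact.field)
        current = latest.get(key)
        if current is None or resource_dates[fact.resource] > resource_dates[current]:
            latest[key] = fact.resource
    return [i for i in issues if latest.get((i.entity, i.field)) == i.resource]


# Made-up data: entity 1 is re-supplied by a newer resource, entity 2 is not.
resource_dates = {"res-old": date(2024, 1, 1), "res-new": date(2024, 6, 1)}
facts = [
    Fact("res-old", 1, "start-date"),
    Fact("res-new", 1, "start-date"),
    Fact("res-old", 2, "name"),
]
issues = [
    Issue("res-old", 1, "start-date", "invalid date"),
    Issue("res-old", 2, "name", "missing value"),
]
still_relevant = relevant_issues(resource_dates, facts, issues)
```

With this made-up data, the issue on entity 1 is dropped because a newer resource supplies that field without an issue, while the issue on entity 2 is kept. A populated end_date column would make all of this unnecessary.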

Below I outline how we can currently retrieve data from all active resources without the database directly telling us which issues are relevant to the live data.

Overview page

Currently only fetching data from the latest resource behind the latest endpoint instead of the latest resources behind all active endpoints
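As a rough sketch of the change in selection, the snippet below picks the latest resource behind every active endpoint rather than a single latest resource. The row shape and column names (`endpoint`, `endpoint_end_date`, `resource`, `resource_start_date`) are assumptions for illustration, as is the rule that an endpoint is active when it has no end date.

```python
from datetime import date


def latest_resources_for_active_endpoints(rows):
    """Return {endpoint: resource}, holding the newest resource for each
    endpoint that has no end date (assumed here to mean 'active')."""
    latest = {}
    for row in rows:
        if row["endpoint_end_date"] is not None:
            continue  # endpoint has been retired, skip its resources
        best = latest.get(row["endpoint"])
        if best is None or row["resource_start_date"] > best["resource_start_date"]:
            latest[row["endpoint"]] = row
    return {endpoint: row["resource"] for endpoint, row in latest.items()}


# Made-up rows: two active endpoints and one retired endpoint.
rows = [
    {"endpoint": "ep-a", "endpoint_end_date": None,
     "resource": "a1", "resource_start_date": date(2024, 1, 1)},
    {"endpoint": "ep-a", "endpoint_end_date": None,
     "resource": "a2", "resource_start_date": date(2024, 5, 1)},
    {"endpoint": "ep-b", "endpoint_end_date": None,
     "resource": "b1", "resource_start_date": date(2024, 3, 1)},
    {"endpoint": "ep-c", "endpoint_end_date": date(2024, 2, 1),
     "resource": "c1", "resource_start_date": date(2024, 1, 15)},
]
selected = latest_resources_for_active_endpoints(rows)
```

Issues would then be fetched for every resource in `selected`, not just the most recent one overall.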

Dataset task list

Currently only fetching data from the latest resource behind the latest endpoint instead of the latest resources behind all active endpoints

Dataset details

This looks to be mostly OK! We currently get all active endpoints when displaying the endpoints and statuses. However, it seems that the value used for the task list tab's task count only considers issues taken from the latest endpoint. We should modify this query in a similar way to what is set out in the dataset task list todo above so that this value is correct.

Issue details

Here we currently fetch all issues for a specific type and field, but only against the latest endpoint.

New Tickets

GeorgeGoodall-GovUk commented 3 weeks ago

@CharliePatterson, @alextea So for this ticket, we need to look at the design of the overview page. For an organization, a specific dataset can have more than one resource. One could be erroring, another could be fine, and a third might have issues in it. What status do we show if that's the case? I'd imagine all three, but the design needs to be able to facilitate this.

GeorgeGoodall-GovUk commented 3 weeks ago

So just to add a little more context: a dataset could have one endpoint with issues and one endpoint with errors. Perhaps we should be giving them more info about each individual file/endpoint they have supplied instead of their dataset as a whole?

Image Image
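To make the per-endpoint idea concrete, here is a minimal sketch of reporting a status for each endpoint rather than one status for the whole dataset. The status labels, the `has_error` and `issue_count` fields, and both function names are hypothetical and not taken from the submit codebase.

```python
def endpoint_status(has_error, issue_count):
    """Classify one endpoint; an endpoint error takes priority over issues."""
    if has_error:
        return "error"
    if issue_count > 0:
        return "issues"
    return "ok"


def dataset_statuses(endpoints):
    """Return one status per endpoint instead of a single dataset status."""
    return {
        e["endpoint"]: endpoint_status(e["has_error"], e["issue_count"])
        for e in endpoints
    }


# Made-up example: one erroring endpoint, one with issues, one that is fine.
example_endpoints = [
    {"endpoint": "endpoint-a", "has_error": True, "issue_count": 0},
    {"endpoint": "endpoint-b", "has_error": False, "issue_count": 4},
    {"endpoint": "endpoint-c", "has_error": False, "issue_count": 0},
]
statuses = dataset_statuses(example_endpoints)
```

How these per-endpoint statuses are then presented on the dataset card is the design question raised above.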

In addition to that, the tips on each dataset card should probably be changed for those with multiple endpoints. For example, look at those with errors here:

Image

A preview link is also available for this one, as I've started adding it into this ticket: https://submit-pr-603.herokuapp.com/organisations/local-authority:LBH

GeorgeGoodall-GovUk commented 3 weeks ago

The dataset overview page could be rethought to consider each endpoint more. Here's a little play I had in the browser that adds more information for each endpoint.

Note that perhaps we should have a table view for each file/endpoint they have supplied?

The task list should also really be split up by each file/endpoint supplied, as the check tool only allows for one endpoint/file at a time.

Image

GeorgeGoodall-GovUk commented 3 weeks ago

Blocking this for now: I spoke with Charlie and this needs some more thought. Charlie is going to update us on Monday.

GeorgeGoodall-GovUk commented 1 week ago

This can stay blocked until issues have entities tracked against them.