Proposal: Store meta information about catalogue entities.

KlapTrap commented 5 years ago

Currently we store catalogue entities in the following format:

requestData: {
  entityKey: {
    entityguid1: entity1,
    entityguid2: entity2,
    ...
  },
  ...
}

Proposed change:

requestData: {
  endpointtypeEntitytype: {
    metadata: {
      endpointType: 'endpointtype',
      .... other metadata
    },
    entities: {
      entityguid1: {
        metadata: {
          endpointType: 'endpointtype',
          entityKey: 'endpointtypeEntitytype',
          entityType: 'entitytype',
          schemaKey: 'schemaKey',  // This might be hard to pin down as the entity may be built up of different API calls. Can we just make it 'unknown' if it's modified?
          endpointGuid: 'endpointGuid',
          guid: 'entityguid1' // This would make it easier for lists, maybe?!
        },
        entity: entity1
      },
      entityguid2: {
        metadata: {...},
        entity: entity2
      },
      ...
    }
  },
  ...
}
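
For illustration, the proposed shape could be typed roughly as below. This is only a sketch: the type names (EntityMetadata, StoredEntity, EntitySection, RequestData) and the wrapEntity helper are hypothetical, and only the field names are taken from the structure above.

```typescript
// Type names are hypothetical; only the field names come from the proposal.
interface EntityMetadata {
  endpointType: string;
  entityKey: string;
  entityType: string;
  schemaKey: string; // could be 'unknown' when the entity is built from several API calls
  endpointGuid: string;
  guid: string;
}

interface StoredEntity<T> {
  metadata: EntityMetadata;
  entity: T;
}

interface EntitySection<T> {
  metadata: { endpointType: string };
  entities: { [guid: string]: StoredEntity<T> };
}

interface RequestData {
  [entityKey: string]: EntitySection<unknown>;
}

// Wrap a raw entity in the proposed { metadata, entity } envelope.
function wrapEntity<T>(entity: T, metadata: EntityMetadata): StoredEntity<T> {
  return { metadata, entity };
}

const exampleRequestData: RequestData = {
  endpointtypeEntitytype: {
    metadata: { endpointType: 'endpointtype' },
    entities: {
      entityguid1: wrapEntity(
        { name: 'entity1' },
        {
          endpointType: 'endpointtype',
          entityKey: 'endpointtypeEntitytype',
          entityType: 'entitytype',
          schemaKey: 'schemaKey',
          endpointGuid: 'endpointGuid',
          guid: 'entityguid1',
        },
      ),
    },
  },
};
```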

This change would allow us to, for example, automatically clean up the store if an endpoint is disconnected.

We could extend this further and add other 1:1 entity relations that would let us clean up the store further. Extensions would provide this extra entity relation information.

Please read below for more thoughts on this work.

KlapTrap commented 5 years ago

Example:

Clean up after endpoint disconnect

Using the example data above:

The user disconnects an endpoint.
A reducer receives the ENDPOINT_DISCONNECTED action with the endpointType and guid.
The reducer will go through the store finding all entities that have the matching endpointType.
The requestData and request sections of the store for found entities can be removed.
The store is free of orphaned entities.
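
The steps above could be sketched as a plain reducer function. Everything here is an assumption made for illustration: the trimmed-down types, the action shape, and the name requestDataCleanupReducer are hypothetical. The sketch also compares the endpoint guid, not just the endpointType, so that other connected endpoints of the same type keep their entities. The request section of the store would need the same treatment.

```typescript
// Hypothetical, trimmed-down types: only the fields this sketch needs.
interface StoredEntity {
  metadata: { endpointType: string; endpointGuid: string };
  entity: unknown;
}
interface EntitySection {
  metadata: { endpointType: string };
  entities: { [guid: string]: StoredEntity };
}
interface RequestData {
  [entityKey: string]: EntitySection;
}

interface EndpointDisconnected {
  type: 'ENDPOINT_DISCONNECTED';
  endpointType: string;
  guid: string;
}
type StoreAction = EndpointDisconnected | { type: 'SOME_OTHER_ACTION' };

function requestDataCleanupReducer(state: RequestData, action: StoreAction): RequestData {
  if (action.type !== 'ENDPOINT_DISCONNECTED') {
    return state;
  }
  const next: RequestData = {};
  for (const entityKey of Object.keys(state)) {
    const section = state[entityKey];
    if (section.metadata.endpointType !== action.endpointType) {
      next[entityKey] = section; // other endpoint types keep their reference
      continue;
    }
    const entities: { [guid: string]: StoredEntity } = {};
    for (const guid of Object.keys(section.entities)) {
      const stored = section.entities[guid];
      // Assumption: also compare the endpoint guid so that other connected
      // endpoints of the same type keep their entities.
      if (stored.metadata.endpointGuid !== action.guid) {
        entities[guid] = stored;
      }
    }
    next[entityKey] = { metadata: section.metadata, entities };
  }
  return next;
}
```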

KlapTrap commented 5 years ago

We could apply this to pagination sections too. As part of the pagination data, an array of endpointTypes and Ids could be stored.

To extend the Clean up after endpoint disconnect example above:

The pagination reducer receives the ENDPOINT_DISCONNECTED action with the endpointType and guid.
The reducer finds all pagination sections in the store that have the matching endpointType and guid.
The reducer clears those pagination sections.
Only the pagination sections without entities from the disconnected endpoint will be left.
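
A minimal sketch of that pagination clean-up, assuming each pagination section carries an array of endpoint references. The shapes (EndpointRef, PaginationSection, PaginationState) and the reducer name are hypothetical. "Clearing" is interpreted here as removing the section from the state.

```typescript
// Hypothetical shapes: the thread only says an array of endpoint types and ids
// could be stored with each pagination section.
interface EndpointRef {
  endpointType: string;
  guid: string;
}
interface PaginationSection {
  endpoints: EndpointRef[];
  ids: string[];
}
interface PaginationState {
  [entityKey: string]: { [paginationKey: string]: PaginationSection };
}
interface EndpointDisconnected {
  type: 'ENDPOINT_DISCONNECTED';
  endpointType: string;
  guid: string;
}

function paginationCleanupReducer(
  state: PaginationState,
  action: EndpointDisconnected,
): PaginationState {
  const next: PaginationState = {};
  for (const entityKey of Object.keys(state)) {
    const kept: { [paginationKey: string]: PaginationSection } = {};
    for (const paginationKey of Object.keys(state[entityKey])) {
      const section = state[entityKey][paginationKey];
      // A section is affected if any of its endpoint references matches the action.
      const usesEndpoint = section.endpoints.some(
        (ref) => ref.endpointType === action.endpointType && ref.guid === action.guid,
      );
      if (!usesEndpoint) {
        kept[paginationKey] = section;
      }
    }
    next[entityKey] = kept;
  }
  return next;
}
```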

KlapTrap commented 5 years ago

We would update all current requestData selectors to return the same data as they currently do to minimise component code changes.
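
As a rough sketch of what such selectors might look like, the functions below unwrap the proposed { metadata, entity } envelope and hand back data in the current format. They are plain functions rather than real store selectors, and the names selectEntity and selectEntities, as well as the trimmed types, are hypothetical.

```typescript
// Hypothetical, trimmed-down types for the proposed store shape.
interface StoredEntity<T> {
  metadata: { guid: string };
  entity: T;
}
interface RequestData {
  [entityKey: string]: {
    metadata: { endpointType: string };
    entities: { [guid: string]: StoredEntity<unknown> };
  };
}

// Returns the bare entity, as a pre-change selector would, or undefined if absent.
function selectEntity<T>(requestData: RequestData, entityKey: string, guid: string): T | undefined {
  const section = requestData[entityKey];
  const stored = section ? section.entities[guid] : undefined;
  return stored ? (stored.entity as T) : undefined;
}

// Returns a guid -> entity map in the current (pre-change) format.
function selectEntities<T>(requestData: RequestData, entityKey: string): { [guid: string]: T } {
  const section = requestData[entityKey];
  const result: { [guid: string]: T } = {};
  if (section) {
    for (const guid of Object.keys(section.entities)) {
      result[guid] = section.entities[guid].entity as T;
    }
  }
  return result;
}
```

Components calling these would keep receiving bare entities and would not need to know about the metadata envelope.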

KlapTrap commented 5 years ago

@richard-cox What do you think?

richard-cox commented 5 years ago

I think it makes sense. Ideally it would be nice to target entities that are associated with the deleted endpoint by including an endpointId as well. Pagination sections could work with an array of endpointIds (given that we have that data in the request). Unless really needed for the current catalogue work though I think we should do this later on.

richard-cox commented 5 years ago

See also src/frontend/packages/store/src/reducers/endpoint-disconnect-application.reducer.ts

KlapTrap commented 4 years ago

The more I think about it the more I think we should have a separate "requestDataMetaData" section of the store to reduce the impact of this work.
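
To illustrate the idea, here is a sketch in which requestData keeps its current format and the metadata lives in a parallel section keyed the same way. The Store and EntityMetadata shapes and the getEntityMetadata helper are hypothetical.

```typescript
// Hypothetical shapes; only the 'requestDataMetaData' section name comes from the comment above.
interface EntityMetadata {
  endpointType: string;
  endpointGuid: string;
}
interface Store {
  // Unchanged: entityKey -> guid -> entity
  requestData: { [entityKey: string]: { [guid: string]: unknown } };
  // New, parallel section: entityKey -> guid -> metadata
  requestDataMetaData: { [entityKey: string]: { [guid: string]: EntityMetadata } };
}

// Look up the metadata for an entity without touching requestData.
function getEntityMetadata(store: Store, entityKey: string, guid: string): EntityMetadata | undefined {
  const section = store.requestDataMetaData[entityKey];
  return section ? section[guid] : undefined;
}
```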

KlapTrap commented 4 years ago

We could use the "requestDataMetaData" to drive garbage collection of store data. We could store a last updated timestamp and a last active timestamp (when was the last time the user viewed the data). We could look at both the last active and last updated timestamps and work out if we should:

a) Remove the entity from the store if the entity hasn't been active for a while, this will force a re-fetch when the user visits that entity.

b) Mark the entity as out-of-date if the entity is active but hasn't been updated for a while. In this case we could show a message and prompt the user to refresh the entity.

Note: If we remove any entities from the store then we should clear pagination sections for the entity type removed.
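
Rules a) and b) could be expressed as a small decision function. This is only a sketch: the names (EntityTimestamps, GcConfig, decideGc) are hypothetical, and the default thresholds are arbitrary placeholders rather than suggested values.

```typescript
// All names and default thresholds below are hypothetical.
interface EntityTimestamps {
  lastUpdated: number; // epoch ms of the last time the entity was fetched
  lastActive: number; // epoch ms of the last time the user viewed the entity
}
interface GcConfig {
  inactiveAfterMs: number; // not viewed for this long: remove from the store
  staleAfterMs: number; // not updated for this long: mark as out-of-date
}
const defaultGcConfig: GcConfig = {
  inactiveAfterMs: 60 * 60 * 1000, // 1 hour, arbitrary placeholder
  staleAfterMs: 10 * 60 * 1000, // 10 minutes, arbitrary placeholder
};
type GcDecision = 'remove' | 'mark-stale' | 'keep';

function decideGc(
  meta: EntityTimestamps,
  now: number,
  config: GcConfig = defaultGcConfig,
): GcDecision {
  // a) Not active for a while: remove, forcing a re-fetch on the next visit.
  if (now - meta.lastActive > config.inactiveAfterMs) {
    return 'remove';
  }
  // b) Active, but not updated for a while: mark as out-of-date.
  if (now - meta.lastUpdated > config.staleAfterMs) {
    return 'mark-stale';
  }
  return 'keep';
}
```

Passing the current time in as a parameter keeps the function pure, which makes it easy to call from a reducer or a periodic effect.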

KlapTrap commented 4 years ago

I'm also wondering if we can open up the metadata to extensions and use it to simplify the CF validation. We could create a simple fingerprint string that's a combination of what inline entities have been fetched as part of the entity stored in the store. The CF extension would parse this fingerprint and work out if the application should refetch the entity or not.
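
One possible form for such a fingerprint, purely as a sketch: the sorted, de-duplicated names of the inline relations joined with commas. The format and the function names (createFingerprint, parseFingerprint, missingRelations) are assumptions, not an existing API.

```typescript
// Hypothetical format: sorted, de-duplicated relation names joined by ','.
function createFingerprint(inlineRelations: string[]): string {
  const unique = inlineRelations.filter((rel, i) => inlineRelations.indexOf(rel) === i);
  return unique.sort().join(',');
}

function parseFingerprint(fingerprint: string): string[] {
  return fingerprint === '' ? [] : fingerprint.split(',');
}

// What an extension might run: which of the relations it requires are not
// covered by the stored fingerprint? A non-empty result suggests a refetch.
function missingRelations(fingerprint: string, required: string[]): string[] {
  const present = parseFingerprint(fingerprint);
  return required.filter((rel) => present.indexOf(rel) === -1);
}
```

For example, an application stored with only its space inline would have the fingerprint 'space', and a view that also needs the organization would see 'organization' reported as missing.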

@richard-cox I haven't given this too much thought but do you think this could work?

richard-cox commented 4 years ago

I like it, but there's a lot of implementation there and it would push a lot of work + data into the client (there could be thousands of entities).

Garbage Collection - Managing Metadata

From my understanding we would need to update each entity's meta data part when...

core entity/entities and inline data pushed to store
data is emitted from the entity service or pagination object's observable

Garbage Collection - GC Run

How old/stale would the entity need to be before being classed as old/stale (hours, days, months)? This would be tricky to determine and would be best if it were configurable.
Clearing out old entities
- We should only clear out pagination sections containing the old element, or at least those lists which pulled from the entity's CF
Marking entities as stale
- There's a fair few places this would have to be shown in order to avoid being confusing (stale in one place but not another)

Validation

Not sure we need to change this wheel, it's working ok at the moment
The fingerprint would need updating when we independently fetch entities as part of other calls (app is fingerprinted without its org, org is fetched independently, app fingerprint would then need updating to include org)
The validation process will also only fetch parts that are missing rather than the whole entity. It also does things on top like fetching entities that are too 'deep' into the object to be returned by cf, this would require keeping a lot of the current code.

KlapTrap commented 4 years ago

(there could be thousands of entities).

This is the reason we've talked about garbage collection in the past and would be why we'd need it at all. If we do it right, GC would help us keep the

How old/stale would the entity need to be before being classed as old/stale (hours, days, months)? This would be tricky to determine and would be best if it were configurable.

I'm sure we could come up with a reasonable default, but adding a config option would be fine.

We should only clear out pagination sections containing the old element, or at least those lists which pulled from the entity's CF

We clear all related pagination sections when we create/delete entities so we should re-use that mechanism. It keeps things simple and ensures we cover all cases.

There's a fair few places this would have to be shown in order to avoid being confusing (stale in one place but not another)

I'm not sure what you mean.

Not sure we need to change this wheel, it's working ok at the moment

Agreed, it is working but it feels like a good time to review it again, much like we're doing with the API effect. I often see a lot of validation cycles that end up with no additional requests being made, this feels like we're being too cautious and putting a lot of effort into making the perfect API call. I am not saying what I proposed is any better but it would be good if we could simplify the validation.

The validation process will also only fetch parts that are missing rather than the whole entity. It also does things on top like fetching entities that are too 'deep' into the object to be returned by cf, this would require keeping a lot of the current code.

How often do we use the validation to fetch entities that are too deep?

My general thoughts around the validation are:

1) It works well when we need it to work but it often becomes a no-op.
2) Can we come up with something that does almost everything the current validation does while being a little simpler?

The answer to 2) may be no, but at least we've given it some more thought.

cloudfoundry / stratos

Proposal: Store meta information about catalogue entities. #3704
