oh yes, I know this topic very well since an advanced listing endpoint was proposed in the following issue:
https://bitbucket.org/openid/connect/issues/1382/proposal-of-an-improved-federation-api
I’m aware that web (RESTful) developers are very sensitive to these aspects.
I think there are some clear benefits to keeping responses a bit smaller and controlled but also there are some (theoretical) limits to the current design where some technologies (such as cloud services) limit the response size. This limitation is, of course, mostly conceptual as you’d have to be serving an ungodly number of entity IDs to hit them…
I do like the format proposed in the issue you link
The SCIM experience is that pagination is a source of complexity. Handling the case where the data set changed between paginated calls is non-trivial.
Yes, we’ve had this issue come up in one implementation, with an endpoint similar in functionality to the federation list. In that particular scenario the ecosystem operator simply accepted it as a known problem, especially as the risk of missing a change in data is also present with a one-time grab. A potential counter to that could be cursor-based pagination. Given that each entity identifier must be globally unique anyway, this would mitigate the problem of duplicate records.
The example below is based on https://jsonapi.org/profiles/ethanresnick/cursor-pagination/:
200 OK
Content-Type: application/json
{
"links": {
"prev": "/list?page[before]=https%3A%2F%2Fntnu.andreas.labs.uninett.no%2F&page[size]=2",
"next": "/list?page[after]=https%3A%2F%2Fblackboard.ntnu.no%2Fopenid%2Fcallback&page[size]=2"
},
"content": [
"https://ntnu.andreas.labs.uninett.no/",
"https://blackboard.ntnu.no/openid/callback"
]
}
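A minimal sketch of how a server might build a cursor-paginated response of this shape, assuming the entity identifiers are held in a lexicographically sorted list. The function name `build_cursor_page` and its parameters are hypothetical and not taken from any specification; only the `links` and `content` members mirror the example above.

```python
from bisect import bisect_right
from urllib.parse import quote


def build_cursor_page(entity_ids, after=None, size=2, base_path="/list"):
    """Return one cursor-paginated page of entity identifiers.

    entity_ids must be sorted; `after` is the last identifier the client saw.
    """
    # Start just past the cursor position in sort order.
    start = bisect_right(entity_ids, after) if after is not None else 0
    content = entity_ids[start:start + size]

    links = {}
    if content and start + size < len(entity_ids):
        # The cursor for the next page is the last identifier on this page.
        cursor = quote(content[-1], safe="")
        links["next"] = f"{base_path}?page[after]={cursor}&page[size]={size}"
    if content and start > 0:
        # The cursor for the previous page is the first identifier on this page.
        cursor = quote(content[0], safe="")
        links["prev"] = f"{base_path}?page[before]={cursor}&page[size]={size}"
    return {"links": links, "content": content}
```

Because the cursor is an entity identifier rather than a page number, an entity inserted or removed on an earlier page does not shift the contents of later pages.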
I do believe this is a problem worth attempting to solve (or at least allowing for implementations to optionally solve) given the sheer size this list can grow to
Hey Michael,
we have two options:
1. create an optional endpoint in the current federation specs
2. create a separate draft to extend the current federation specs with this additional endpoint
let’s see what other authors and you think about this.
Or we have a third option:
3. Wait for feedback from actual deployments about whether pagination is needed in practice.
This was discussed by the editors on Feb 1, 2024. We agreed to wait for feedback from actual deployments about whether pagination is needed in practice.
If it's needed, this capability can be added later in a non-breaking way.
Following some discussion we had at IETF 119, and to add more context: this issue was raised because of challenges that the federations we are implementing in Australia and Brazil will face. In their respective models the federation is a very flat structure, with a single trust anchor / intermediate issuing statements for a few thousand entities. The large number of entries this produces in the list endpoint has been the driver behind this issue.
We also discussed the challenges that pagination poses with regard to data updates mid-retrieval. It was suggested that the fetch endpoint could be used to filter out inconsistencies here
Following our discussion in Rome at the OSW, I'd like to present some insights regarding the proposal to introduce an optional advanced listing endpoint featuring pagination among other enhancements.
In federations adopting a star topology without intermediaries, the subordinate entity listing endpoint may need to accommodate more than 16K entities. This scenario necessitates pagination.
The primary challenge with pagination is the potential for inconsistent results in non-transactional datasets. For instance, if Giuseppe requests Page 2 while an entity from Page 1 is removed, the results may not align. A proposed workaround involves tracking the total_entries count to detect changes in the number of entries. However, this approach falls short in scenarios where one entity is removed as another is added, keeping the total_entries count unchanged despite a variation in the dataset.
To address this, a top-level claim indicating changes in the dataset is necessary. If this claim alters while navigating through pages, it would signal a change in the dataset, prompting the requester to restart the pages fetching process from the beginning.
This endpoint is designed to complement, not replace, the existing subordinate listing endpoint, which remains mandatory. Features of the advanced listing endpoint include:
With the inclusion of optional claims per entity, implementers seeking to provide comprehensive data can do so efficiently. For example, they could offer multiple subordinate statements in one go, maintaining consistency in the dataset_iat unless there's a change in the dataset's composition.
{
"iss": "https://trust-anchor.star-federation.example.org",
"dataset_iat": 1713456341,
"immediate_subordinates": [
{
"https://rp1.example.com/oidc/rp": {
"registered": 1704217688,
"updated": 1704217688,
"subordinate_statement": "eyJ0eXAiOiJlbnRpdHktc3RhdGVtZW50K2p3dCIsImFsZyI6IlJTMjU2Iiwia2lkIjoiQlh2ZnJsbmhBTXVIUjA3YWpVbUFjQlJRY1N6bXcwY19SQWdKbnBTLTlXUSJ9.eyJleHAiOjE3MTM2MjkxNDEsImlhdCI6MTcxMzQ1NjM0MSwiaXNzIjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwIiwic3ViIjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwL29pZGMvcnAiLCJqd2tzIjp7ImtleXMiOlt7Imt0eSI6IlJTQSIsImUiOiJBUUFCIiwibiI6InBuX0ljaEM2NlNGUU1oYlRITHRiRDU4aktpWVl2WW83UzR3alBqekVDTXUyN2M2RkpWRk5YdGx1YnRiN3NDNi1XVFExSHY0clNRZFBoYWZKYkl4YTMyUjUxc1JRcGtUcjNKRk1ZUDd4MjJEUlFEX2l4dFFKUmFpSHctbnBuWjhxZ1ZISl90NGdSVGM0SEprZWhCTEd2NC1ySFZBS3pGaVFOVTF1MkFGdzFmV01uTUg0b2JfcHlpc1hWZ2NrdTNkeTE0bDdzWVNBTmxwWHVmWV9xbmtRRlR2MHdNSC1DNkl6bC1ha0VOUzJVSHB2VExoZkNCVktQckZYSnh1bDRYRGJVd1Vidk5aVXhUZXJuRXg4bFY1Z3hDU2dLU0JFZ29IOU1ncEQxWVdGUGJBbndpN3A3ZTdNTkd6NWxIN2VERktrUFFoWExXQUJVOFV2RUlJV3lBOTVTUSIsImtpZCI6Ims1NEhRdERpYnlHY3M5WldWTWZ2aUhmLTJxTGNGVXRwd1kycmd4Qms4OE0ifV19LCJtZXRhZGF0YV9wb2xpY3kiOnsib3BlbmlkX3JlbHlpbmdfcGFydHkiOnsic2NvcGUiOnsic3VwZXJzZXRfb2YiOlsib3BlbmlkIl0sInN1YnNldF9vZiI6WyJvcGVuaWQiLCJvZmZsaW5lX2FjY2VzcyIsInByb2ZpbGUiLCJlbWFpbCJdfSwiY29udGFjdHMiOnsiYWRkIjpbImNpYW9AZW1haWwuaXQiXX19fSwic291cmNlX2VuZHBvaW50IjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwL2ZldGNoIiwidHJ1c3RfbWFya3MiOlt7ImlkIjoiaHR0cHM6Ly93d3cuc3BpZC5nb3YuaXQvb3BlbmlkLWZlZGVyYXRpb24vYWdyZWVtZW50L3NwLXB1YmxpYyIsInRydXN0X21hcmsiOiJleUowZVhBaU9pSjBjblZ6ZEMxdFlYSnJLMnAzZENJc0ltRnNaeUk2SWxKVE1qVTJJaXdpYTJsa0lqb2lRbGgyWm5Kc2JtaEJUWFZJVWpBM1lXcFZiVUZqUWxKUlkxTjZiWGN3WTE5U1FXZEtibkJUTFRsWFVTSjkuZXlKcGMzTWlPaUpvZEhSd09pOHZNVEkzTGpBdU1DNHhPamd3TURBaUxDSnpkV0lpT2lKb2RIUndPaTh2TVRJM0xqQXVNQzR4T2pnd01EQXZiMmxrWXk5eWNDSXNJbWxoZENJNk1UY3hNelExTmpNME1Td2lhV1FpT2lKb2RIUndjem92TDNkM2R5NXpjR2xrTG1kdmRpNXBkQzlqWlhKMGFXWnBZMkYwYVc5dUwzSndJaXdpYldGeWF5STZJbWgwZEhCek9pOHZkM2QzTG1GbmFXUXVaMjkyTG1sMEwzUm9aVzFsY3k5amRYTjBiMjB2WVdkcFpDOXNiMmR2TG5OMlp5SXNJbkpsWmlJNkltaDBkSEJ6T2k4dlpHOWpjeTVwZEdGc2FXRXVhWFF2YVhSaGJHbGhMM053YVdRdmMzQnBaQzF5WldkdmJHVXRkR1ZqYm1samFHVXRiMmxrWXk5cGRDOXpkR0ZpYVd4bEwybHVaR1Y0TG1oMGJXd2lmUS5DdWVNTm53TG9SN
WlqZ1hpUnRWWlkwU1ZCMWFhNGh6Yk5HRWxvR0ZDa1JBaE1zcTZXNVVxMXFidHFRcHVzczBLWE1EX254WEthandIT3BfT2x6a0ctWWNMdjRSeTUwbTROYW1GVUpRckQzYWlxVHFCR09BNXkyUVhJUFhwa2lzNUN3OVhyTko2ZUcyUXN5MFFhc1FfazZ1N05rTGFUUTgwYUJqcHdVX0YtaUdzV3dpLS1Yc1g5Q1Z0VC1yRHJuWUdYbnFwUnNWRlQzUHU1blNJZzhzVEU2bWRTS3lZN0F2MjBUNU5SVlRKcnBrVzZ5UDhBMktpR1JCeUFiYVVickZtQ0c1NGlpUlNPQVRFMmxMbTV1RW16bUJyVzcwTVlhTWpQUmRGemJlNGhPbzV2UTJSZHlwUXNWLUFtNzI0bWNHaHl1R0N6MWk4emMxMFVrLXVpbkkyOFEifSx7ImlkIjoiaHR0cHM6Ly93d3cuc3BpZC5nb3YuaXQvb3BlbmlkLWZlZGVyYXRpb24vYWdyZWVtZW50L3NwLXByaXZhdGUiLCJ0cnVzdF9tYXJrIjoiZXlKMGVYQWlPaUowY25WemRDMXRZWEpySzJwM2RDSXNJbUZzWnlJNklsSlRNalUySWl3aWEybGtJam9pUWxoMlpuSnNibWhCVFhWSVVqQTNZV3BWYlVGalFsSlJZMU42Ylhjd1kxOVNRV2RLYm5CVExUbFhVU0o5LmV5SnBjM01pT2lKb2RIUndPaTh2TVRJM0xqQXVNQzR4T2pnd01EQWlMQ0p6ZFdJaU9pSm9kSFJ3T2k4dk1USTNMakF1TUM0eE9qZ3dNREF2YjJsa1l5OXljQ0lzSW1saGRDSTZNVGN4TXpRMU5qTTBNU3dpYVdRaU9pSm9kSFJ3Y3pvdkwzZDNkeTV6Y0dsa0xtZHZkaTVwZEM5alpYSjBhV1pwWTJGMGFXOXVMM0p3TDNCeWFYWmhkR1VpTENKc2IyZHZYM1Z5YVNJNkltaDBkSEJ6T2k4dmQzZDNMbUZuYVdRdVoyOTJMbWwwTDNSb1pXMWxjeTlqZFhOMGIyMHZZV2RwWkM5c2IyZHZMbk4yWnlJc0luSmxaaUk2SW1oMGRIQnpPaTh2Wkc5amN5NXBkR0ZzYVdFdWFYUXZhWFJoYkdsaEwzTndhV1F2YzNCcFpDMXlaV2R2YkdVdGRHVmpibWxqYUdVdGIybGtZeTlwZEM5emRHRmlhV3hsTDJsdVpHVjRMbWgwYld3aWZRLkxNbnBhcTRubWJVbkpQYllhNHNrU25OUk5DV0VHSi1xbUhpUDR6cVoxcW4tWmNtaXVjb0ZIR1VVMU44RDQyd3RiRXN0TEttMTJPY0xaMk43N1NRMHRMMnQ3NFF0ZF8xV3Y2VzFaaEVoUlZ3dWVLMVZCS0F0SXR1YXM1a2RwR1oxcHRHRUJDQklBSWVGaGQwS3BlOXRIMGpZRnFBbEQ5b0k5cFdrR2xIcEp1SFoweEI5LU03dHRuRl9HSGUwSFZNcmZoOUNZTkxhRHFXdDRsQko2bDBMOWU2eDl6T3YzRllMSUJTdTdTWmE5VTJReDBtdEtWQ3A4VnhKSEMyN3Rfa0dZX1FMaGcxRFFMUTB3SGpON2o1MDZHeEJ3TEVlTVlDVERwYlZkWWN1ZG5ZVzBNRkViaVNRdnFPMGZiX2RVVTNZM2tWOGJQdnNCSnhfQ2xCalphenY3QSJ9XX0.ZB2ClwQ9zbGzwoXebHyzpd9yVGjTV_mk-183q31SY6sI47iHNMNApgz_a2TvfR2U6qzvfysP412reBUDYp1P5c4KG4eVAH-LBlE9tDq9iZc4kNi2AT_GX83APGHh10IF2_HVF6kr7c0scwcObn7rCmv4dF_ca49UCtRhqjDnxltDfcMSOx-M5zriKJycqpURJ28pVX0ZX1Jzu_MM3iwen4xzPfkJG_U2Tk-JjqQnpsAtIYiaqdAsIldvz3AX77GRVIVX1UuAMu_mW607FELOzRn_-rH4X
LWdCL2gl9dXfda4yMpweOpKbiIto30xLhH0oyCXkqlfYlkfDuoYFqo5TQ"
}
},
{
"https://rp2.example.edu/rp": {
"registered": 1704215688,
"updated": 1704216688,
"revoked": 1704217688,
"revocation_reason": "..."
}
}
],
"trust_marked_entities": { ... },
"page": 1,
"total_pages": 1,
"total_entries": 2,
"next_page_path": "",
"prev_page_path": ""
}
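A minimal client-side sketch of the restart behaviour described above, assuming each page carries the `dataset_iat`, `total_pages` and `immediate_subordinates` members shown in the example. `fetch_all_entries`, `fetch_page` and `max_restarts` are hypothetical names; the HTTP transport is left to the caller.

```python
def fetch_all_entries(fetch_page, max_restarts=5):
    """Collect every page, restarting if `dataset_iat` changes mid-way.

    fetch_page(page_number) is a caller-supplied function returning a dict
    shaped like the example response (hypothetical transport, not shown).
    """
    for _ in range(max_restarts + 1):
        first = fetch_page(1)
        marker = first["dataset_iat"]
        entries = list(first["immediate_subordinates"])
        consistent = True
        for page in range(2, first["total_pages"] + 1):
            response = fetch_page(page)
            if response["dataset_iat"] != marker:
                consistent = False  # dataset changed: discard and start over
                break
            entries.extend(response["immediate_subordinates"])
        if consistent:
            return entries
    raise RuntimeError("dataset kept changing while paging")
```

Unlike comparing `total_entries`, this check also catches the case where one entity is removed while another is added, provided the server updates `dataset_iat` on every change to the dataset's composition.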
WDYT?
I very much like this solution - it would both enable the solving of this issue here as well as https://bitbucket.org/openid/connect/issues/2145/additional-filtering-options-in-the
As discussed at OSW I think we’d have to bring some guidance as to what flavour of pagination to bring for interop purposes (thinking cursor vs page). What's your vision for the trust_marked_entities in the above? A minor issue but I’d be hesitant to have additional non-metadata style data outside of the paged response itself
When it comes to the fields available to be added under each entity, my gut is we should define a list of allowed values… whether that forms the list of metadata options already defined in the spec anyway plus some additional such as “registered”, “updated”, etc
I’ll have a crack at implementing this as a test in our own implementation and see how it goes
yes, let’s try to address https://bitbucket.org/openid/connect/issues/2145/additional-filtering-options-in-the in this adv listing endpoint.
Regarding trust_marked_entities: it is an optional member, and it can be considered completely out of scope, or not. It’s up to our discussion. Probably this would make more sense:
{
"https://rp1.example.com/oidc/rp": {
"registered": 1704217688,
"updated": 1704217688,
"trust_marks": [{...},{...}],
"subordinate_statement": "eyJ0eXAiOiJlbnRpdHktc3RhdGVtZW50K2p3dCIsImFsZyI6IlJTMjU2Iiwia2lkIjoiQlh2ZnJsbmhBTXVIUjA3YWpVbUFjQlJRY1N6bXcwY19SQWdKbnBTLTlXUSJ9.eyJleHAiOjE3MTM2MjkxNDEsImlhdCI6MTcxMzQ1NjM0MSwiaXNzIjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwIiwic3ViIjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwL29pZGMvcnAiLCJqd2tzIjp7ImtleXMiOlt7Imt0eSI6IlJTQSIsImUiOiJBUUFCIiwibiI6InBuX0ljaEM2NlNGUU1oYlRITHRiRDU4aktpWVl2WW83UzR3alBqekVDTXUyN2M2RkpWRk5YdGx1YnRiN3NDNi1XVFExSHY0clNRZFBoYWZKYkl4YTMyUjUxc1JRcGtUcjNKRk1ZUDd4MjJEUlFEX2l4dFFKUmFpSHctbnBuWjhxZ1ZISl90NGdSVGM0SEprZWhCTEd2NC1ySFZBS3pGaVFOVTF1MkFGdzFmV01uTUg0b2JfcHlpc1hWZ2NrdTNkeTE0bDdzWVNBTmxwWHVmWV9xbmtRRlR2MHdNSC1DNkl6bC1ha0VOUzJVSHB2VExoZkNCVktQckZYSnh1bDRYRGJVd1Vidk5aVXhUZXJuRXg4bFY1Z3hDU2dLU0JFZ29IOU1ncEQxWVdGUGJBbndpN3A3ZTdNTkd6NWxIN2VERktrUFFoWExXQUJVOFV2RUlJV3lBOTVTUSIsImtpZCI6Ims1NEhRdERpYnlHY3M5WldWTWZ2aUhmLTJxTGNGVXRwd1kycmd4Qms4OE0ifV19LCJtZXRhZGF0YV9wb2xpY3kiOnsib3BlbmlkX3JlbHlpbmdfcGFydHkiOnsic2NvcGUiOnsic3VwZXJzZXRfb2YiOlsib3BlbmlkIl0sInN1YnNldF9vZiI6WyJvcGVuaWQiLCJvZmZsaW5lX2FjY2VzcyIsInByb2ZpbGUiLCJlbWFpbCJdfSwiY29udGFjdHMiOnsiYWRkIjpbImNpYW9AZW1haWwuaXQiXX19fSwic291cmNlX2VuZHBvaW50IjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwL2ZldGNoIiwidHJ1c3RfbWFya3MiOlt7ImlkIjoiaHR0cHM6Ly93d3cuc3BpZC5nb3YuaXQvb3BlbmlkLWZlZGVyYXRpb24vYWdyZWVtZW50L3NwLXB1YmxpYyIsInRydXN0X21hcmsiOiJleUowZVhBaU9pSjBjblZ6ZEMxdFlYSnJLMnAzZENJc0ltRnNaeUk2SWxKVE1qVTJJaXdpYTJsa0lqb2lRbGgyWm5Kc2JtaEJUWFZJVWpBM1lXcFZiVUZqUWxKUlkxTjZiWGN3WTE5U1FXZEtibkJUTFRsWFVTSjkuZXlKcGMzTWlPaUpvZEhSd09pOHZNVEkzTGpBdU1DNHhPamd3TURBaUxDSnpkV0lpT2lKb2RIUndPaTh2TVRJM0xqQXVNQzR4T2pnd01EQXZiMmxrWXk5eWNDSXNJbWxoZENJNk1UY3hNelExTmpNME1Td2lhV1FpT2lKb2RIUndjem92TDNkM2R5NXpjR2xrTG1kdmRpNXBkQzlqWlhKMGFXWnBZMkYwYVc5dUwzSndJaXdpYldGeWF5STZJbWgwZEhCek9pOHZkM2QzTG1GbmFXUXVaMjkyTG1sMEwzUm9aVzFsY3k5amRYTjBiMjB2WVdkcFpDOXNiMmR2TG5OMlp5SXNJbkpsWmlJNkltaDBkSEJ6T2k4dlpHOWpjeTVwZEdGc2FXRXVhWFF2YVhSaGJHbGhMM053YVdRdmMzQnBaQzF5WldkdmJHVXRkR1ZqYm1samFHVXRiMmxrWXk5cGRDOXpkR0ZpYVd4bEwybHVaR1Y0TG1oMGJXd2lmUS5DdWVNTm53TG9SN
WlqZ1hpUnRWWlkwU1ZCMWFhNGh6Yk5HRWxvR0ZDa1JBaE1zcTZXNVVxMXFidHFRcHVzczBLWE1EX254WEthandIT3BfT2x6a0ctWWNMdjRSeTUwbTROYW1GVUpRckQzYWlxVHFCR09BNXkyUVhJUFhwa2lzNUN3OVhyTko2ZUcyUXN5MFFhc1FfazZ1N05rTGFUUTgwYUJqcHdVX0YtaUdzV3dpLS1Yc1g5Q1Z0VC1yRHJuWUdYbnFwUnNWRlQzUHU1blNJZzhzVEU2bWRTS3lZN0F2MjBUNU5SVlRKcnBrVzZ5UDhBMktpR1JCeUFiYVVickZtQ0c1NGlpUlNPQVRFMmxMbTV1RW16bUJyVzcwTVlhTWpQUmRGemJlNGhPbzV2UTJSZHlwUXNWLUFtNzI0bWNHaHl1R0N6MWk4emMxMFVrLXVpbkkyOFEifSx7ImlkIjoiaHR0cHM6Ly93d3cuc3BpZC5nb3YuaXQvb3BlbmlkLWZlZGVyYXRpb24vYWdyZWVtZW50L3NwLXByaXZhdGUiLCJ0cnVzdF9tYXJrIjoiZXlKMGVYQWlPaUowY25WemRDMXRZWEpySzJwM2RDSXNJbUZzWnlJNklsSlRNalUySWl3aWEybGtJam9pUWxoMlpuSnNibWhCVFhWSVVqQTNZV3BWYlVGalFsSlJZMU42Ylhjd1kxOVNRV2RLYm5CVExUbFhVU0o5LmV5SnBjM01pT2lKb2RIUndPaTh2TVRJM0xqQXVNQzR4T2pnd01EQWlMQ0p6ZFdJaU9pSm9kSFJ3T2k4dk1USTNMakF1TUM0eE9qZ3dNREF2YjJsa1l5OXljQ0lzSW1saGRDSTZNVGN4TXpRMU5qTTBNU3dpYVdRaU9pSm9kSFJ3Y3pvdkwzZDNkeTV6Y0dsa0xtZHZkaTVwZEM5alpYSjBhV1pwWTJGMGFXOXVMM0p3TDNCeWFYWmhkR1VpTENKc2IyZHZYM1Z5YVNJNkltaDBkSEJ6T2k4dmQzZDNMbUZuYVdRdVoyOTJMbWwwTDNSb1pXMWxjeTlqZFhOMGIyMHZZV2RwWkM5c2IyZHZMbk4yWnlJc0luSmxaaUk2SW1oMGRIQnpPaTh2Wkc5amN5NXBkR0ZzYVdFdWFYUXZhWFJoYkdsaEwzTndhV1F2YzNCcFpDMXlaV2R2YkdVdGRHVmpibWxqYUdVdGIybGtZeTlwZEM5emRHRmlhV3hsTDJsdVpHVjRMbWgwYld3aWZRLkxNbnBhcTRubWJVbkpQYllhNHNrU25OUk5DV0VHSi1xbUhpUDR6cVoxcW4tWmNtaXVjb0ZIR1VVMU44RDQyd3RiRXN0TEttMTJPY0xaMk43N1NRMHRMMnQ3NFF0ZF8xV3Y2VzFaaEVoUlZ3dWVLMVZCS0F0SXR1YXM1a2RwR1oxcHRHRUJDQklBSWVGaGQwS3BlOXRIMGpZRnFBbEQ5b0k5cFdrR2xIcEp1SFoweEI5LU03dHRuRl9HSGUwSFZNcmZoOUNZTkxhRHFXdDRsQko2bDBMOWU2eDl6T3YzRllMSUJTdTdTWmE5VTJReDBtdEtWQ3A4VnhKSEMyN3Rfa0dZX1FMaGcxRFFMUTB3SGpON2o1MDZHeEJ3TEVlTVlDVERwYlZkWWN1ZG5ZVzBNRkViaVNRdnFPMGZiX2RVVTNZM2tWOGJQdnNCSnhfQ2xCalphenY3QSJ9XX0.ZB2ClwQ9zbGzwoXebHyzpd9yVGjTV_mk-183q31SY6sI47iHNMNApgz_a2TvfR2U6qzvfysP412reBUDYp1P5c4KG4eVAH-LBlE9tDq9iZc4kNi2AT_GX83APGHh10IF2_HVF6kr7c0scwcObn7rCmv4dF_ca49UCtRhqjDnxltDfcMSOx-M5zriKJycqpURJ28pVX0ZX1Jzu_MM3iwen4xzPfkJG_U2Tk-JjqQnpsAtIYiaqdAsIldvz3AX77GRVIVX1UuAMu_mW607FELOzRn_-rH4X
LWdCL2gl9dXfda4yMpweOpKbiIto30xLhH0oyCXkqlfYlkfDuoYFqo5TQ"
}
}
When it comes to the fields available to be added under each entity, my gut is we should define a list of allowed values… whether that forms the list of metadata options already defined in the spec anyway plus some additional such as “registered”, “updated”, etc
I agree with you, we should define a known set of members and leave it up to the implementers to add any other claim, as we have already done with the trust mark schema.
I’d also add the top level member entries_per_page
Since this is being actively discussed: I realized that "on hold" removes it from the issues list, effectively hiding it.
I’ve implemented a PoC of the above on our Federation implementation internally and so far it's been good to work with. I just did a basic set of query parameters:
I also included all of the parameters from the original list endpoint (entity_type, trust_marked, trust_mark_id). The only thing that jumped out at me is that entity_type should really become mandatory when using the format above. It not being mandatory combined with the fact that the type of additional “keys” one can expect in the response maps changes depending on the subject entity type did add complexity. This would be solved with trust_marked becoming mandatory thus you know what sort of format to expect
Hi, registering my support for, and need of, paginated responses due to the number of entities within our federations (1000+). I’d also like to flag that in some ecosystems where the list response contains very large datasets, we would want to deny use of the ‘list’ endpoint and require appropriate use of the advanced filtering endpoint / API. I don’t want federation participants downloading MBs of information when a more advanced, smaller payload is possible, perhaps one with a default filter such as ‘last 24 hours’ worth of changes’.
To be clear, the listing endpoint returns Entity Identifiers (URLs) - not Entity Configurations (JWTs). 1000 Entity Identifiers is likely to be about 20-30K of data - not megabytes. It’s only once people start retrieving the corresponding Entity Configurations that you’ll reach megabytes or possibly tens of megabytes.
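A rough way to check the order of magnitude is to serialize a synthetic list and measure it. The identifiers below are hypothetical; real Entity Identifiers may be shorter or longer, so the result is only indicative.

```python
import json


def listing_size_bytes(entity_ids):
    """Size in bytes of the JSON array a list endpoint would return."""
    return len(json.dumps(entity_ids, separators=(",", ":")).encode("utf-8"))


# Hypothetical identifiers, roughly 30 characters each.
sample_ids = [f"https://rp{i}.example.com/oidc/rp" for i in range(1000)]
print(listing_size_bytes(sample_ids))
```

For identifiers of this length the result is in the tens of kilobytes, not megabytes.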
Don’t get me wrong - I’m in favor of the ability to list useful subsets of immediate subordinate entities. But I’m trying to have us be precise about what the operations being discussed do.
My ask is for those who have implemented prototypes and/or who have these ecosystems needs to say exactly what query parameters they want to use on the listing endpoint and what their meanings are.
For instance, if queries select based on “changed since” information, what kinds of changes qualify? Key changes? Joining the federation as an immediate subordinate? What else? Do you want responses for former subordinates that are no longer part of the federation? What kinds of state are you asking that immediate superiors track about their immediate subordinates, and how stale is that information allowed to be in query responses?
Ralph, I appreciate your participation. The more specific you can be about what you need when and why, the more actionable the information will be. Thanks.
We’ve been in discussion recently with a party whose ecosystem would have upwards of 50k participants. In such a scenario the list endpoint would be between 5 and 10 MB
To your point above Mike in the Australian context for the current proprietary API that we’re hoping to ditch in favour of the advanced listing endpoint for interop purposes, “changed since” has been taken as anything that would produce a change in the information that an Intermediate or Trust Anchor will issue in their entity statements for their immediate subordinates
I can only echo what Ralph and Michael wrote. I’ll add that even if there’s an optional advanced endpoint that supports limits and filtering, publicly exposing a listAll endpoint with anonymous access that may return a couple of megabytes of data may be asking for trouble.
I think we should consider seek pagination for this endpoint. Not only because of the performance advantage when it comes to large data sets, but also because it has no impact on the response format.
It would require adding two request parameters:
after_entity_id
OPTIONAL. The value of this parameter is an Entity Identifier.
If the after_entity_id parameter is present, the result
list MUST be filtered to include only those Entity Identifiers
that, in the results list, immediately follow the Entity Identifier
that is the value of this parameter. If no Entity Identifier equal to the
value of this parameter exists, the response MUST use the HTTP status code 400
and the content type application/json, with the error code entity_id_not_found (TBD).
limit
OPTIONAL. Positive integer that specifies the maximum number of Entity Identifiers
included in the result list contained in the response. If this parameter
is not present, the result list contained in the response will
include at most 1000 (one thousand) Entity Identifiers.
If the limit parameter is present,
the result list MUST be filtered to include no more than
the specified number of Entity Identifiers.
The endpoint MUST support values less than 1000. It MAY support
values higher than 1000. If it does not support the
limit value provided in the parameter, it MUST use the
HTTP status code 400 and the content type application/json, with the
error code unsupported_limit_value (TBD).
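A minimal server-side sketch of these two parameters, assuming the subordinates are available as a list in a stable order. `list_subordinates`, `ListError` and `max_limit` are hypothetical names; the error codes are the TBD ones proposed above.

```python
DEFAULT_LIMIT = 1000  # default page size from the proposal above


class ListError(Exception):
    """Maps to an HTTP 400 response with a JSON error body."""

    def __init__(self, error):
        super().__init__(error)
        self.error = error


def list_subordinates(ordered_ids, after_entity_id=None, limit=None, max_limit=1000):
    """Return the page of identifiers that follows after_entity_id."""
    if limit is None:
        limit = DEFAULT_LIMIT
    if limit < 1 or limit > max_limit:
        raise ListError("unsupported_limit_value")
    start = 0
    if after_entity_id is not None:
        try:
            # Resume right after the position of the given identifier.
            start = ordered_ids.index(after_entity_id) + 1
        except ValueError:
            raise ListError("entity_id_not_found") from None
    return ordered_ids[start:start + limit]
```

A linear `index` lookup keeps the sketch short; a real deployment would more likely run an indexed query of the form "identifier greater than cursor, ordered, limited", which is where the performance advantage of seek pagination comes from.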
The original response would remain unchanged
GET /list HTTP/1.1
200 OK
Content-Type: application/json
[
"https://0.openid.net/",
"https://1.openid.net/",
"https://2.openid.net/"
]
If there are more than 1000 (the default limit) elements to be returned, the response contains only 1000.
GET /list HTTP/1.1
200 OK
Content-Type: application/json
[
"https://0.openid.net/",
"https://1.openid.net/",
"https://2.openid.net/",
...
"https://999.openid.net/"
]
Since there are 1000 results, the client sends another request with an additional query parameter to check whether there are more results and to fetch them. This approach is a tradeoff and has its pros and cons.
GET /list?after_entity_id=https://999.openid.net
200 OK
Content-Type: application/json
[
"https://1000.openid.net/",
"https://1001.openid.net/",
...
"https://1020.openid.net/"
]
A client may need smaller page sizes, which shall always be supported:
GET /list?limit=20
GET /list?after_entity_id=https://1020.openid.net/&limit=20
Or larger page sizes, which may be supported; if they are not, such a request may end with an error. It would be good for a client to be able to learn the maximum supported size.
GET /list?limit=10000
If the entity identifier provided in the after_entity_id parameter does not exist (e.g. because it was deleted in the meantime), an error is returned. The client needs to start over in this case.
GET /list?after_entity_id=https://23xyz.openid.net/
400 Bad Request
Content-Type: application/json
{
"error": "entity_id_not_found",
"error_description":
"TBD"
}
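The client flow in the examples above can be sketched as a loop that keeps requesting pages until a short page arrives and starts over when the cursor is rejected. `fetch_all`, `list_page` and `max_restarts` are hypothetical names, and mapping the `entity_id_not_found` error to `LookupError` is an assumption of this sketch.

```python
def fetch_all(list_page, limit=1000, max_restarts=3):
    """Page through the list endpoint using after_entity_id and limit.

    list_page(after_entity_id, limit) is a caller-supplied function that
    returns a list of identifiers, or raises LookupError when the server
    answers entity_id_not_found (hypothetical mapping of the 400 error).
    """
    for _ in range(max_restarts + 1):
        collected, cursor = [], None
        try:
            while True:
                page = list_page(cursor, limit)
                collected.extend(page)
                if len(page) < limit:
                    return collected  # short page: nothing more to fetch
                cursor = page[-1]
        except LookupError:
            continue  # cursor vanished between requests: start over
    raise RuntimeError("list kept changing while paging")
```

When the total is an exact multiple of `limit`, the loop issues one extra request that returns an empty list, which is the cost of not having a total count in the response.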
How do we resolve the issue of entities being added or removed during and between requests to the list endpoint?
at time T, entities A, C, D, E and F are found in the listing endpoint response
at time T+1, the list request of verifier Z sets after_entity_id to C and limit to 2, obtaining D, E
at time T+2, entity B is registered as an immediate subordinate and is therefore available in the list response
at time T+3, the list request of verifier Z sets after_entity_id to E and limit to 2, obtaining F
The chunked responses missed entity B.
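The timeline can be reproduced with a small in-memory simulation. `page_after` is a hypothetical helper, and the sketch assumes the endpoint returns entities in sorted order and that verifier Z already fetched A and C at time T.

```python
def page_after(entities, after, limit):
    """Return up to `limit` entities that sort after `after`."""
    ordered = sorted(entities)
    start = ordered.index(after) + 1
    return ordered[start:start + limit]


entities = {"A", "C", "D", "E", "F"}       # time T
seen = sorted(entities)[:2]                # Z's first page: A, C
seen += page_after(entities, "C", 2)       # T+1: D, E
entities.add("B")                          # T+2: B is registered
seen += page_after(entities, "E", 2)       # T+3: F
missed = sorted(entities - set(seen))
print(seen, missed)
```

The verifier ends up with A, C, D, E, F, and B is the only entity it never sees.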
We therefore need a way to hint to the requester that the population of subordinates has changed. For this reason I have proposed the parameter dataset_uid here: https://bitbucket.org/openid/connect/pull-requests/732/diff#Lopenid-federation-1_0.xmlT4356
The list endpoint result doesn’t give this additional level of detail (unless we put it in the HTTP headers of the response …). For this reason I have proposed another endpoint to handle these detailed requests and responses, which also enables the provisioning of multiple subordinate statements within the objects pertaining to each subordinate entity.
Hi Giuseppe. Thank you for looking at this issue.
the chunked responses missed the entity B.
In the example referred to, the result will be identical to the result we obtain using the endpoint without pagination at time T. There should be no significant impact on a client, as it gets the same list as it would get with the original endpoint.
Case 1
T: List:A,C,D,E,F /list?limit=3 result:A,C,D /list result: A,C,D,E,F
T+1 (+B): List:A,B,C,D,E,F /list?limit=3&after_entity_id=D result:E,F
However, there’s also a variation of this case where the modification happens to pages that haven’t been requested yet, such as an add operation (+E1):
Case 2
T: List:A,C,D,E,F /list?limit=3 result:A,C,D /list result: A,C,D,E,F
T+1 (+E1): List:A,C,D,E,E1,F /list?limit=3&after_entity_id=D result:E,E1,F
or a delete operation (-E):
Case 3
T: List:A,C,D,E,F /list?limit=3 result:A,C,D /list result: A,C,D,E,F
T+1 (-E): List:A,C,D,F /list?limit=3&after_entity_id=D result:F
In cases 2 and 3 above, the client gets up-to-date information for these pages.
Case 4
T: List:A,C,D,E,F /list?limit=3 result:A,C,D /list result: A,C,D,E,F
T+1 (-D): List:A,C,E,F /list?limit=3&after_entity_id=D result:E,F OR ERROR
Case 4 above is a corner case where D gets removed. In this case either the following pages are returned as usual, or an error signals that D doesn’t exist.
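The two possible behaviours in case 4 can be sketched side by side. Both function names are hypothetical, and the lenient variant assumes the list is ordered lexicographically, so the position of a deleted cursor can still be found by binary search.

```python
from bisect import bisect_right


def page_after_lenient(ordered_ids, after_entity_id, limit):
    """Resume after the cursor's sort position, even if it was deleted."""
    start = bisect_right(ordered_ids, after_entity_id)
    return ordered_ids[start:start + limit]


def page_after_strict(ordered_ids, after_entity_id, limit):
    """Fail when the cursor no longer exists, as in the error variant."""
    if after_entity_id not in ordered_ids:
        raise LookupError("entity_id_not_found")
    start = ordered_ids.index(after_entity_id) + 1
    return ordered_ids[start:start + limit]
```

The lenient variant spares the client a restart, at the cost of requiring the server to guarantee a sort order that the cursor value can be compared against.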
I see where you are going with the dataset_uid; however, do we really need to aim for such strong consistency?
I imagine that the data returned by this endpoint, even now without pagination, is eventually consistent, so the issues you are describing can occur even without pagination, especially with the use of CQRS or a distributed data store.
I understand from conversations with @peppelinux that full pagination is looking increasingly unlikely. If that's the case, given that the current specification could, in theory, hit the maximum return size for certain implementations, I believe we should at minimum add a one-liner highlighting the risk with large data sets and recommend an error & error_detail response stating as much.
This work is being done in the now-adopted specification https://openid.net/specs/openid-federation-extended-listing-1_0.html .
Imported from AB/Connect bitbucket: https://bitbucket.org/openid/connect/issues/2109
Original Reporter: MichaelFraser99
Under section 8.3 (https://openid.net/specs/openid-federation-1_0.html#name-subordinate-listings) the list endpoint is defined and an issue is flagged inside the spec around the size of the response
I would argue this is something best solved through pagination inside the spec instead of simply just acknowledging it as an issue.
I would propose adding two optional pagination keys to the request (size and page) and adjust the response to the following
I would like to know people’s thoughts on this, thanks