Closed alejandro-perez closed 1 year ago
Logs from the instance with the "select" filter:
pyff_1 | [2023-03-13 12:35:04 +0000] [13] [DEBUG] GET /%7Bentities/%7Bsha1%7D573116c096bd85296da6c0fd921b9f36dc4c3805.json
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.api GET /%7Bentities/%7Bsha1%7D573116c096bd85296da6c0fd921b9f36dc4c3805.json HTTP/1.1
pyff_1 | Accept: application/json
pyff_1 | Content-Length: 0
pyff_1 | Host: localhost:8080
pyff_1 | User-Agent: curl/7.81.0
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.api match=None
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.api handling entry=request, alias={entities, path={sha1}573116c096bd85296da6c0fd921b9f36dc4c3805.json
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.pipes [{'when update': [{'load': ['http://metadata.ukfederation.org.uk/ukfederation-metadata.xml']}, 'break']}, {'when request': [{'select': None}, {'pipe': [{'when accept application/json': [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']}]}]}]: calling 'when' using args: [{'load': ['http://metadata.ukfederation.org.uk/ukfederation-metadata.xml']}, 'break'] and opts: ['update']
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.pipes [{'when update': [{'load': ['http://metadata.ukfederation.org.uk/ukfederation-metadata.xml']}, 'break']}, {'when request': [{'select': None}, {'pipe': [{'when accept application/json': [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']}]}]}]: calling 'when' using args: [{'select': None}, {'pipe': [{'when accept application/json': [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']}]}] and opts: ['request']
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.pipes [{'select': None}, {'pipe': [{'when accept application/json': [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']}]}]: calling 'select' using args: None and opts: []
pyff_1 | 2023-03-13 12:35:04 INFO pyff.builtins selecting using args: ['{sha1}573116c096bd85296da6c0fd921b9f36dc4c3805']
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.store calling store lookup {sha1}573116c096bd85296da6c0fd921b9f36dc4c3805
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.samlmd selecting 1 entities before validation
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.samlmd Filtering invalids from mdx
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.pipes [{'select': None}, {'pipe': [{'when accept application/json': [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']}]}]: calling 'pipe' using args: [{'when accept application/json': [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']}] and opts: []
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.pipes [{'when accept application/json': [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']}]: calling 'when' using args: [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break'] and opts: ['accept', 'application/json']
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.pipes [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']: calling 'select' using args: ['!//md:EntityDescriptor[md:IDPSSODescriptor]'] and opts: []
pyff_1 | 2023-03-13 12:35:04 INFO pyff.builtins selecting using args: ['!//md:EntityDescriptor[md:IDPSSODescriptor]']
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.store calling store lookup entities
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.store filtering 9399 entities using xpath //md:EntityDescriptor[md:IDPSSODescriptor]
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.samlmd selecting 9399 entities before validation
pyff_1 | 2023-03-13 12:35:04 DEBUG pyff.samlmd Filtering invalids from dummy
pyff_1 | 2023-03-13 12:35:05 DEBUG pyff.store got 5283 entities after filtering
pyff_1 | 2023-03-13 12:35:05 DEBUG pyff.samlmd selecting 5283 entities before validation
pyff_1 | 2023-03-13 12:35:05 DEBUG pyff.samlmd Filtering invalids from mdx
pyff_1 | 2023-03-13 12:35:06 DEBUG pyff.pipes [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']: calling 'discojson' using args: None and opts: []
pyff_1 | 2023-03-13 12:35:06 DEBUG pyff.pipes [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']: calling 'emit' using args: None and opts: ['application/json']
pyff_1 | 2023-03-13 12:35:06 DEBUG pyff.pipes [{'select': ['!//md:EntityDescriptor[md:IDPSSODescriptor]']}, 'discojson', {'emit application/json': None}, 'break']: calling 'break' using args: None and opts: []
[.........]
Log output for the instance without the select filter:
pyff_1 | [2023-03-13 12:35:59 +0000] [12] [DEBUG] GET /%7Bentities/%7Bsha1%7D573116c096bd85296da6c0fd921b9f36dc4c3805.json
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.api GET /%7Bentities/%7Bsha1%7D573116c096bd85296da6c0fd921b9f36dc4c3805.json HTTP/1.1
pyff_1 | Accept: application/json
pyff_1 | Content-Length: 0
pyff_1 | Host: localhost:8080
pyff_1 | User-Agent: curl/7.81.0
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.api match=None
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.api handling entry=request, alias={entities, path={sha1}573116c096bd85296da6c0fd921b9f36dc4c3805.json
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'when update': [{'load': ['http://metadata.ukfederation.org.uk/ukfederation-metadata.xml']}, 'break']}, {'when request': [{'select': None}, {'pipe': [{'when accept application/json': [{'select': None}, 'discojson', {'emit application/json': None}, 'break']}]}]}]: calling 'when' using args: [{'load': ['http://metadata.ukfederation.org.uk/ukfederation-metadata.xml']}, 'break'] and opts: ['update']
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'when update': [{'load': ['http://metadata.ukfederation.org.uk/ukfederation-metadata.xml']}, 'break']}, {'when request': [{'select': None}, {'pipe': [{'when accept application/json': [{'select': None}, 'discojson', {'emit application/json': None}, 'break']}]}]}]: calling 'when' using args: [{'select': None}, {'pipe': [{'when accept application/json': [{'select': None}, 'discojson', {'emit application/json': None}, 'break']}]}] and opts: ['request']
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'select': None}, {'pipe': [{'when accept application/json': [{'select': None}, 'discojson', {'emit application/json': None}, 'break']}]}]: calling 'select' using args: None and opts: []
pyff_1 | 2023-03-13 12:35:59 INFO pyff.builtins selecting using args: ['{sha1}573116c096bd85296da6c0fd921b9f36dc4c3805']
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.store calling store lookup {sha1}573116c096bd85296da6c0fd921b9f36dc4c3805
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.samlmd selecting 1 entities before validation
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.samlmd Filtering invalids from mdx
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'select': None}, {'pipe': [{'when accept application/json': [{'select': None}, 'discojson', {'emit application/json': None}, 'break']}]}]: calling 'pipe' using args: [{'when accept application/json': [{'select': None}, 'discojson', {'emit application/json': None}, 'break']}] and opts: []
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'when accept application/json': [{'select': None}, 'discojson', {'emit application/json': None}, 'break']}]: calling 'when' using args: [{'select': None}, 'discojson', {'emit application/json': None}, 'break'] and opts: ['accept', 'application/json']
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'select': None}, 'discojson', {'emit application/json': None}, 'break']: calling 'select' using args: None and opts: []
pyff_1 | 2023-03-13 12:35:59 INFO pyff.builtins selecting using args: ['{sha1}573116c096bd85296da6c0fd921b9f36dc4c3805']
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.store calling store lookup {sha1}573116c096bd85296da6c0fd921b9f36dc4c3805
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.samlmd selecting 1 entities before validation
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.samlmd Filtering invalids from mdx
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'select': None}, 'discojson', {'emit application/json': None}, 'break']: calling 'discojson' using args: None and opts: []
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'select': None}, 'discojson', {'emit application/json': None}, 'break']: calling 'emit' using args: None and opts: ['application/json']
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.pipes [{'select': None}, 'discojson', {'emit application/json': None}, 'break']: calling 'break' using args: None and opts: []
pyff_1 | 2023-03-13 12:35:59 DEBUG pyff.api b'[{"title": "UM - University of Murcia", "descr": "The Identity Provider of University of Murcia", "title_langs": {"es": "UM - Universidad de Murcia", "en": "UM - University of Murcia"}, "descr_langs": {"es": "El proveedor de identidad de la Universidad de Murcia", "en": "The Identity Provider of University of Murcia"}, "auth": "saml", "entity_id": "https://www.rediris.es/sir/umidp", "entityID": "https://www.rediris.es/sir/umidp", "type": "idp", "hidden": "false", "scope": "um.es", "domain": "um.es", "name_tag": "UM", "entity_icon_url": {"url": "https://img.sir2.rediris.es/200px-201a27c316f210f42657d783f2ae8fa0.png", "width": "200", "height": "53"}, "keywords": "um,murcia"}]'
The implicit select is essentially "overwritten" by the explicit select in the when clause, so returning all 5k entities is actually the correct behavior in that case. I think you are trying to filter the response so that it only returns IdPs, yes? You should look into the filter directive for this use case, I think.
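To picture the "overwritten" behaviour, here is a toy model. It is only a sketch and not pyFF's actual code: `Entity`, `STORE`, `select` and `run_pipeline` are hypothetical names, and the sample entities are made up. It assumes, as the logs above suggest, that a bare select uses the identifier from the request path, while a select with an explicit argument queries the whole store again and replaces the current working set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str
    kind: str  # "idp" or "sp"


# Hypothetical stand-in for the loaded metadata feed.
STORE = [
    Entity("https://idp.example.org/idp", "idp"),
    Entity("https://idp.example.net/idp", "idp"),
    Entity("https://sp.example.org/sp", "sp"),
]


def select(requested_id, explicit_kind=None):
    """Toy 'select': always queries the whole store, never the previous result."""
    if explicit_kind is not None:
        # Explicit argument given: the identifier from the request is ignored.
        return [e for e in STORE if e.kind == explicit_kind]
    # No argument: fall back to the identifier taken from the request path.
    return [e for e in STORE if e.entity_id == requested_id]


def run_pipeline(requested_id, inner_kind=None):
    """Outer implicit select followed by an inner select, as in the config."""
    working = select(requested_id)              # outer select: one entity
    working = select(requested_id, inner_kind)  # inner select replaces it
    return working


if __name__ == "__main__":
    wanted = "https://idp.example.org/idp"
    print(len(run_pipeline(wanted, inner_kind="idp")))  # every IdP in the store
    print(len(run_pipeline(wanted)))                    # only the requested one
```

In this model the inner select with an argument returns every IdP in the store regardless of the requested identifier, while the inner select without an argument returns just the requested entity, which mirrors the two log excerpts.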
Thanks @leifj. Is there any way to get the functionality I want, where searches only return IdPs, while fetching a single entity's metadata can return any entity from the loaded feed? (This is how seamlessaccess.org works, but I couldn't find out how to replicate it.)
Sorry for asking here, but I haven't seen anything in the docs or examples that sheds any light on this.
To mimic the example above:
curl "https://md.seamlessaccess.org/entities/%7Bsha1%7Dd6cad1541a6653fa308955d7341b7171bc970f09.json" -H "Accept: application/json"
{"title":"DevTeam Test RPi Box","descr":"DevTeam's Test RPi Box","title_langs":{"en":"DevTeam Test RPi Box"},"descr_langs":{"en":"DevTeam's Test RPi Box"},"auth":"saml","entity_id":"https://rpi.dev.ukfederation.org.uk/shibboleth","entityID":"https://rpi.dev.ukfederation.org.uk/shibboleth","type":"sp","id":"{sha1}d6cad1541a6653fa308955d7341b7171bc970f09"}
But:
curl "https://md.seamlessaccess.org/entities/?query=rpi" -H "Accept: application/json"
[{"title":"Rensselaer Polytechnic Institute","descr":"http://www.rpi.edu/","title_langs":{"en":"Rensselaer Polytechnic Institute"},"descr_langs":{"en":"http://www.rpi.edu/"},"auth":"saml","entity_id":"https://shib-idp.rpi.edu/idp/shibboleth","entityID":"https://shib-idp.rpi.edu/idp/shibboleth","type":"idp","hidden":"false","scope":"rpi.edu","domain":"rpi.edu","name_tag":"RPI","entity_icon_url":{"url":"https://scer.rpi.edu/sites/default/files/logo-without-tag.jpg","width":"673","height":"175"},"privacy_statement_url":"http://scer.rpi.edu/privacypolicy","id":"{sha1}549b3f2200a9f8c5d11808ff74931d68be4d32f8"}]
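For reference, the request path used in the first curl call can be built as in the sketch below. It assumes the usual MDQ convention that the `{sha1}` identifier is the hex SHA-1 digest of the entityID; the helper names `sha1_identifier` and `entity_json_path` are made up for illustration and are not part of pyFF.

```python
import hashlib
from urllib.parse import quote


def sha1_identifier(entity_id: str) -> str:
    """Return '{sha1}' followed by the hex SHA-1 digest of the entityID.

    Assumption: the server derives its identifiers this way (MDQ convention).
    """
    digest = hashlib.sha1(entity_id.encode("utf-8")).hexdigest()
    return "{sha1}" + digest


def entity_json_path(identifier: str) -> str:
    """Percent-encode the identifier and build the /entities/<id>.json path."""
    # safe="" makes sure the braces are encoded as %7B and %7D.
    return "/entities/" + quote(identifier, safe="") + ".json"


if __name__ == "__main__":
    # Identifier copied from the JSON response above.
    known_id = "{sha1}d6cad1541a6653fa308955d7341b7171bc970f09"
    print(entity_json_path(known_id))

    # Deriving the identifier from an entityID instead of copying it.
    print(sha1_identifier("https://rpi.dev.ukfederation.org.uk/shibboleth"))
```

The first call reproduces the path from the curl command above; the second only matches the server's `id` field if the digest assumption holds.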
We don't actually filter in SA, but you can achieve the effect you want by adding a filter step after the select. The API docs have an example, but something like this should work:
- select:
- filter:
    - "!//md:EntityDescriptor[md:IDPSSODescriptor]"
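As a rough illustration of why select followed by filter differs from two selects, here is a toy model. It is a sketch under assumptions, not pyFF code: it assumes filter narrows whatever the preceding select produced instead of querying the store again. `FEED`, `select_by_id` and `filter_idps` are hypothetical names; the two entityIDs and their types are taken from the JSON responses earlier in this thread.

```python
# Two entities from the responses quoted in this thread, reduced to the
# fields the toy model needs.
FEED = [
    {"entityID": "https://www.rediris.es/sir/umidp", "type": "idp"},
    {"entityID": "https://rpi.dev.ukfederation.org.uk/shibboleth", "type": "sp"},
]


def select_by_id(store, requested_id):
    """Toy 'select' without arguments: pick the entity named in the request."""
    return [e for e in store if e["entityID"] == requested_id]


def filter_idps(working):
    """Toy 'filter': narrow the current working set, do not reload the store."""
    return [e for e in working if e["type"] == "idp"]


def handle_request(requested_id):
    working = select_by_id(FEED, requested_id)
    return filter_idps(working)


if __name__ == "__main__":
    print(handle_request("https://www.rediris.es/sir/umidp"))
    print(handle_request("https://rpi.dev.ukfederation.org.uk/shibboleth"))
```

Under this model a request for the IdP returns exactly that one entity, while a request for the SP comes back empty, because the filter removes it from the working set.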
From my example above against SA, you can see that results from the /entities/?query=XXX endpoint are filtered, since otherwise DevTeam's Test RPi Box would have been included in the results.
In any case, I'll try using filter. Where can I find the API docs? https://pyff.readthedocs.io/en/latest/ does not mention anything about filter. Thanks!
Also, if I use filter, the SP details are not returned (because it's an SP). SA does not show this behaviour.
Using the following link, you can see how SP's details are fetched (top left corner shows display name), while still the search does not return any SP (only IDPs).
Search is not handled via pipelines in pyFF. The filter clause is documented in the source but maybe readthedocs isn't exposing all the API docs. I will look into that.
> Also, if I use filter, the SP details are not returned (because it's an SP). SA does not show this behaviour.
As I said above, SA doesn't do filtering: both SPs and IdPs are included in the metadata feed from SA. However, not all SPs are included, because SA currently only looks at eduGAIN and a few other federation feeds, so your test box might not be in any of those feeds. This could be the reason you're not seeing it.
> As I said above, SA doesn't do filtering: both SPs and IdPs are included in the metadata feed from SA. However, not all SPs are included, because SA currently only looks at eduGAIN and a few other federation feeds, so your test box might not be in any of those feeds. This could be the reason you're not seeing it.
I'm not sure about that. The name of the test box is rendered properly in the top-left corner (because it's the requesting SP), but you cannot find it using the search.
Somehow, even though SA includes all entities (you can see that the SP name is displayed correctly), it manages to filter SPs out of the search results. Whether that is done at the pyFF level or in thiss.io, I do not know.
> Search is not handled via pipelines in pyFF. The filter clause is documented in the source but maybe readthedocs isn't exposing all the API docs. I will look into that.
Thanks, that'll be useful.
When filtering by entity type, if I try to get one entity's JSON metadata, I get a response that includes all entities instead.
However, search works fine and returns the matching entities after the filter has been applied.
Code Version
2.0.0 (docker image)
Expected Behavior
Only the entity matching the SHA1 hash is returned.
Current Behavior
All entities are returned (~5k)
Possible Solution
Steps to Reproduce