baszoetekouw closed this 7 months ago
Maybe this is a bug in pyff? It can read the xsd file correctly, of course, so maybe that's what the fail_on_error
is checking?
Curiously, this same call failed on test:
Mar 12 15:00:13 app1-tf1 systemd[1]: pyff-metadata.service: Consumed 13.169s CPU time.
Mar 12 16:00:00 app1-tf1 systemd[1]: Starting pyFF Metadata processing...
Mar 12 16:00:00 app1-tf1 pyff-metadata[2051984]: Processing '/opt/metadata/idps_feed.fd'
Mar 12 16:00:11 app1-tf1 pyff-metadata[2051987]: WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fdc337c4190>, 'Connection to metadata.test.surfconext.nl timed out. (connect timeout=10)')': /signed/2023/idps-metadata.xml
Mar 12 16:00:12 app1-tf1 pyff-metadata[2051987]: WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fdc337c42e0>: Failed to establish a new connection: [Errno 111] Connection refused')': /signed/2023/idps-metadata.xml
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fdc337c4460>: Failed to establish a new connection: [Errno 111] Connection refused')': /signed/2023/idps-metadata.xml
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: WARNING:pyff.fetch:error fetching https://metadata.test.surfconext.nl/signed/2023/idps-metadata.xml
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: WARNING:pyff.fetch:HTTPSConnectionPool(host='metadata.test.surfconext.nl', port=443): Max retries exceeded with url: /signed/2023/idps-metadata.xml (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fdc337c4f10>: Failed to establish a new connection: [Errno 111] Connection refused'))
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: ERROR:pyff.builtins:<string>:0:0:ERROR:SCHEMASV:SCHEMAV_ELEMENT_CONTENT: Element '{urn:oasis:names:tc:SAML:2.0:metadata}EntitiesDescriptor': Missing child element(s). Expected is one of ( {urn:oasis:names:tc:SAML:2.0:metadata}Extensions, {urn:oasis:names:tc:SAML:2.0:metadata}EntityDescriptor, {urn:oasis:names:tc:SAML:2.0:metadata}EntitiesDescriptor ).
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: ERROR:pyff.pipes:Got exception when loading/executing pipe: XML schema validation failed
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: Traceback (most recent call last):
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/lib/python3.9/site-packages/pyff/builtins.py", line 534, in publish
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: validate_document(req.t)
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/lib/python3.9/site-packages/pyff/utils.py", line 279, in validate_document
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: schema().assertValid(t)
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "src/lxml/etree.pyx", line 3643, in lxml.etree._Validator.assertValid
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: lxml.etree.DocumentInvalid: Element '{urn:oasis:names:tc:SAML:2.0:metadata}EntitiesDescriptor': Missing child element(s). Expected is one of ( {urn:oasis:names:tc:SAML:2.0:metadata}Extensions, {urn:oasis:names:tc:SAML:2.0:metadata}EntityDescriptor, {urn:oasis:names:tc:SAML:2.0:metadata}EntitiesDescriptor ).
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: During handling of the above exception, another exception occurred:
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: Traceback (most recent call last):
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/bin/pyff", line 8, in <module>
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: sys.exit(main())
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/lib/python3.9/site-packages/pyff/md.py", line 33, in main
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: plumbing(p).process(md, state={'batch': True, 'stats': {}})
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/lib/python3.9/site-packages/pyff/pipes.py", line 363, in process
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: return Plumbing.Request(
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/lib/python3.9/site-packages/pyff/pipes.py", line 301, in process
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: return pl.iprocess(self)
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/lib/python3.9/site-packages/pyff/pipes.py", line 333, in iprocess
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: raise ex
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/lib/python3.9/site-packages/pyff/pipes.py", line 323, in iprocess
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: ot = pipefn(req, *opts)
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: File "/opt/metadata/pyff-env/lib/python3.9/site-packages/pyff/builtins.py", line 537, in publish
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: raise PipeException("XML schema validation failed")
Mar 12 16:00:14 app1-tf1 pyff-metadata[2051987]: pyff.pipes.PipeException: XML schema validation failed
Mar 12 16:00:14 app1-tf1 systemd[1]: pyff-metadata.service: Main process exited, code=exited, status=1/FAILURE
Mar 12 16:00:14 app1-tf1 systemd[1]: pyff-metadata.service: Failed with result 'exit-code'.
Mar 12 16:00:14 app1-tf1 systemd[1]: Failed to start pyFF Metadata processing.
Mar 12 16:00:14 app1-tf1 systemd[1]: pyff-metadata.service: Consumed 1.052s CPU time.
But apparently for the wrong reason (pyff was still trying to interpret the XML).
Can't reproduce on a --container deploy of pyff locally?
At the moment the upstream server is back up again, so that's to be expected. Does it also fail correctly if the upstream server times out?
I extended the tests and still can't reproduce it with either ConnectTimeoutError or NewConnectionError (refused) exceptions.
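One way to keep such tests deterministic is to inject the fetcher, so both failure modes can be simulated without touching the network. The sketch below is not pyff code: `load_sources`, `SourceError`, `refusing_fetch`, `timing_out_fetch`, and the `fail_on_error` parameter as used here are hypothetical names that only mirror the behaviour under discussion (skip a broken source versus abort the whole run). The builtin `ConnectionRefusedError` and `TimeoutError` stand in for the urllib3 exceptions seen in the log.

```python
from typing import Callable, Dict, List


class SourceError(RuntimeError):
    """Raised when a source cannot be fetched and fail_on_error is set."""


def load_sources(
    urls: List[str],
    fetch: Callable[[str], bytes],
    fail_on_error: bool = True,
) -> Dict[str, bytes]:
    """Fetch every URL; abort on the first failure unless told to skip."""
    loaded: Dict[str, bytes] = {}
    for url in urls:
        try:
            loaded[url] = fetch(url)
        except OSError as exc:  # covers builtin timeouts and refused connections
            if fail_on_error:
                raise SourceError(f"error fetching {url}: {exc}") from exc
            # Lenient mode: the broken source is silently dropped.
    return loaded


def refusing_fetch(url: str) -> bytes:
    """Stand-in fetcher that simulates 'Connection refused'."""
    raise ConnectionRefusedError(111, "Connection refused")


def timing_out_fetch(url: str) -> bytes:
    """Stand-in fetcher that simulates a connect timeout."""
    raise TimeoutError("timed out")
```

With `fail_on_error=True` either stand-in fetcher should make the run abort; with `fail_on_error=False` the result simply lacks the broken source, which is the silent-truncation case described in this issue.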
Ok, let's leave it for now, and see if this error ever resurfaces...
I want to get rid of pyff anyway, as it is extremely wasteful with memory; it should be easily replaceable by a shell script.
In /opt/metadata/idps_feed.fd we explicitly have
However, if upstream is broken, pyff will simply ignore that source and continue instead of exiting, causing our metadata feed to be broken:
(note that 7 IdPs should have been written to idps.xml instead of 2)
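Independent of how the source is loaded, a cheap guard against a truncated feed is to count the entities in the generated file before it is published. This is a minimal sketch using only the Python standard library, not part of pyff; `count_entities`, `check_feed`, and the `minimum` threshold are hypothetical names, and the value 7 in the usage note is simply the number of IdPs expected above.

```python
import xml.etree.ElementTree as ET
from typing import Union

# SAML 2.0 metadata namespace, as it appears in the validation error.
MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"


def count_entities(xml_data: Union[str, bytes]) -> int:
    """Count EntityDescriptor elements anywhere in a metadata aggregate."""
    root = ET.fromstring(xml_data)
    tag = f"{{{MD_NS}}}EntityDescriptor"
    # iter() also visits the root, so a lone EntityDescriptor counts as 1.
    return sum(1 for _ in root.iter(tag))


def check_feed(xml_data: Union[str, bytes], minimum: int) -> None:
    """Raise ValueError when the feed holds fewer entities than expected."""
    found = count_entities(xml_data)
    if found < minimum:
        raise ValueError(
            f"feed has {found} entities, expected at least {minimum}"
        )
```

Running something like `check_feed(open("idps.xml", "rb").read(), minimum=7)` after generation and exiting non-zero on `ValueError` would turn the 2-instead-of-7 case into a hard failure rather than a silently broken feed.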