HEPCloud / decisionengine

HEPCloud Decision Engine framework
Apache License 2.0

Improved error-handling whenever source/channel comes online #669

Open knoepfel opened 1 year ago

@namrathaurs ran into a situation where de-client --status reported sources that remained in the BOOT state indefinitely, alongside a channel reported as STEADY.

source: FactoryEntriesSource   , queue id = FactoryEntriesSource.06F849BC-66CE-42E2-8873-BC3BEFB97F49   , state = BOOT
source: Factory_Entries_AWS    , queue id = Factory_Entries_AWS.F4F33F98-3C42-413E-8462-C24D909AF8E0    , state = BOOT
source: Factory_Entries_GCE    , queue id = Factory_Entries_GCE.F873ED76-C16B-4FC4-8A37-498901FD5AA6    , state = BOOT
source: Factory_Entries_Grid   , queue id = Factory_Entries_Grid.1B59E324-91C6-4A96-BBE1-13FBE04E0227   , state = BOOT
source: Factory_Entries_LCF    , queue id = Factory_Entries_LCF.C848849D-D671-4B14-8F7F-B1E71F6CA1C0    , state = BOOT
source: FigureOfMerit          , queue id = FigureOfMerit.AC21EC20-9031-470D-97E5-D7036699C733          , state = BOOT
source: GceFigureOfMerit       , queue id = GceFigureOfMerit.B38C3F69-049B-45CF-B5B2-D5E71C6215F9   , state = BOOT
source: NerscFigureOfMerit     , queue id = NerscFigureOfMerit.68E0D108-3BBE-4476-9839-C64A820D0BC3     , state = BOOT
source: StartdManifestsSource  , queue id = StartdManifestsSource.48A0AC79-562C-4C46-8C0D-17E8C3DD8470  , state = BOOT
source: factoryglobal_manifests, queue id = factoryglobal_manifests.6E878D20-65B5-4F04-BD84-047B5380CA9E, state = BOOT
source: jobs_manifests         , queue id = jobs_manifests.FC2E0653-D97D-4EDA-A0FC-87FC47FA8288         , state = BOOT
source: source1                , queue id = source1.C9579CC3-2744-4C6B-80BD-A458912DB518                , state = STEADY

channel: test_channel, id = 5F53B496-F097-41AE-8336-CEFF3DD3B8F2, state = STEADY

reaper: state = IDLE

The actual cause of the hang was that a channel transform transitively depended on a library (PyJWT version 2.5.0) that no longer supported Python 3.6:

File: "/.../site-packages/PyJWT-2.5.0-py3.6.egg/jwt/api_jwk.py", line 1
    from __future__ import annotations
                                     ^
SyntaxError: future feature annotations is not defined

In this case, the state of the channel should have been ERROR, and the state of all sources should have been brought to OFFLINE.
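
To make the requested behavior concrete, here is a minimal sketch of the state handling, not the Decision Engine's actual code. The names State, Worker, start_channel, and broken_loader are hypothetical and exist only for illustration. The sketch assumes that module loading for a channel can be wrapped in a single call, and that a failure there should mark the channel ERROR and its sources OFFLINE. It catches Exception rather than ImportError because a SyntaxError raised while importing a module is not an ImportError.

```python
import enum


class State(enum.Enum):
    BOOT = "BOOT"
    STEADY = "STEADY"
    OFFLINE = "OFFLINE"
    ERROR = "ERROR"


class Worker:
    """Hypothetical stand-in for a source or channel with a tracked state."""

    def __init__(self, name):
        self.name = name
        self.state = State.BOOT


def start_channel(channel, sources, load_modules):
    """Call load_modules(); on failure set the channel to ERROR and its sources to OFFLINE.

    Returns the caught exception, or None if loading succeeded.
    """
    try:
        load_modules()
    except Exception as err:  # a SyntaxError at import time is not an ImportError
        channel.state = State.ERROR
        for source in sources:
            source.state = State.OFFLINE
        return err
    channel.state = State.STEADY
    return None


def broken_loader():
    # Simulates importing a module that the running interpreter cannot parse,
    # by requesting a __future__ feature that does not exist.
    compile("from __future__ import no_such_feature", "<transform>", "exec")


channel = Worker("test_channel")
sources = [Worker("FactoryEntriesSource"), Worker("source1")]
error = start_channel(channel, sources, broken_loader)

print(channel.state.name, [s.state.name for s in sources], type(error).__name__)
# prints: ERROR ['OFFLINE', 'OFFLINE'] SyntaxError
```

In a real deployment a source may feed more than one channel, so taking it OFFLINE would presumably depend on whether any healthy channel still needs it. The sketch ignores that case.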