Our archiver instance (10k active/60k total PVs) started triggering broadcast storms after a server upgrade. The storm happens once, 20 minutes after startup. I traced it to the connection of all the meta channels on the first run of DisconnectChecker. Relevant log fragment:
2022-02-07 22:51:10,086 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EngineContext - L5:P4:BPM.BK3 is connected. Seeing if we need to start up the meta channels for the fields.
2022-02-07 22:51:10,086 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.model.ArchiveChannel - L5:P4:BPM.DESC connected is false
2022-02-07 22:51:10,086 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EngineContext - S35I:DG1:trigInputAmpSetAO is connected. Seeing if we need to start up the meta channels for the fields.
2022-02-07 22:51:10,086 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.model.ArchiveChannel - S35I:DG1:trigInputAmpSetAO.DRVH connected is false
2022-02-07 22:51:10,086 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EngineContext - L5:P4:BPM.BK0 is connected. Seeing if we need to start up the meta channels for the fields.
2022-02-07 22:51:10,086 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.model.ArchiveChannel - L5:P4:BPM.DESC connected is false
2022-02-07 22:51:10,090 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EngineContext - Starting meta channels for PTB:PV1:BPM.NSAM
2022-02-07 22:51:10,090 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.model.ArchiveChannel - Starting up monitors on the fields for pv PTB:PV1:BPM.NSAM
2022-02-07 22:51:10,092 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EPICS_V3_PV - pv ofPTB:PV1:BPM.DESC connectting
2022-02-07 22:51:10,092 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.model.ArchiveChannel - Done starting up monitors on the fields for pv PTB:PV1:BPM.NSAM
2022-02-07 22:51:10,093 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EngineContext - Starting meta channels for LI:VD1:y:fit:cal:sigmaM
2022-02-07 22:51:10,093 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.model.ArchiveChannel - Starting up monitors on the fields for pv LI:VD1:y:fit:cal:sigmaM
2022-02-07 22:51:10,093 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EPICS_V3_PV - pv ofLI:VD1:y:fit:cal:sigmaM.HIHI connectting
2022-02-07 22:51:10,094 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EPICS_V3_PV - pv ofLI:VD1:y:fit:cal:sigmaM.HIGH connectting
2022-02-07 22:51:10,094 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EPICS_V3_PV - pv ofLI:VD1:y:fit:cal:sigmaM.LOW connectting
2022-02-07 22:51:10,094 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EPICS_V3_PV - pv ofLI:VD1:y:fit:cal:sigmaM.LOLO connectting
2022-02-07 22:51:10,094 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EPICS_V3_PV - pv ofLI:VD1:y:fit:cal:sigmaM.LOPR connectting
2022-02-07 22:51:10,094 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EPICS_V3_PV - pv ofLI:VD1:y:fit:cal:sigmaM.HOPR connectting
2022-02-07 22:51:10,094 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.pv.EPICS_V3_PV - pv ofLI:VD1:y:fit:cal:sigmaM.DESC connectting
2022-02-07 22:51:10,094 [Engine scheduler for misc tasks.] DEBUG org.epics.archiverappliance.engine.model.ArchiveChannel - Done starting up monitors on the fields for pv LI:VD1:y:fit:cal:sigmaM
[....many thousands of connections]
I believe METACHANNELS_TO_START_AT_A_TIME can be used to throttle this process. Can it be made into a configurable setting, or could some other throttling mechanism be added? Also, a potential optimization would be to skip broadcast searches and connect directly to the IOC IP of the main channel.
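
To illustrate the kind of throttling I have in mind, here is a minimal standalone sketch. It is not the Archiver Appliance code: the class `MetaChannelThrottle`, the property name `example.metachannels.batchSize`, and the `startMetaChannels` callback are all hypothetical, and I am assuming the meta-channel startup can be driven by a scheduler in this way. The idea is to read the batch size from a setting and spread the batches out in time, so the searches do not all go out at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class MetaChannelThrottle {
    // Hypothetical property name; not an existing Archiver Appliance setting.
    static final String BATCH_PROPERTY = "example.metachannels.batchSize";

    // Read the batch size from a system property, falling back to a default.
    static int batchSize(int fallback) {
        int configured = Integer.getInteger(BATCH_PROPERTY, fallback);
        return Math.max(1, configured);
    }

    // Split the PV list into consecutive batches of at most `size` entries.
    static <T> List<List<T>> partition(List<T> items, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            int end = Math.min(i + size, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // Schedule one batch every `delayMillis`, so connection attempts are
    // spread over time. Returns the number of batches scheduled.
    static <T> int scheduleInBatches(List<T> pvs, int size, long delayMillis,
                                     ScheduledExecutorService scheduler,
                                     Consumer<T> startMetaChannels) {
        List<List<T>> batches = partition(pvs, size);
        for (int i = 0; i < batches.size(); i++) {
            List<T> batch = batches.get(i);
            scheduler.schedule(() -> batch.forEach(startMetaChannels),
                    i * delayMillis, TimeUnit.MILLISECONDS);
        }
        return batches.size();
    }
}
```

With something like this, a site could lower the batch size or raise the delay (for example via `-Dexample.metachannels.batchSize=50`) to keep the search traffic below whatever the network tolerates.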