ISISComputingGroup / IBEX

Top level repository for IBEX stories

DBSVR: cannot serve PVs if too many IOCs have been started #4814

Open Tom-Willemsen opened 5 years ago

Tom-Willemsen commented 5 years ago

The database server serves all PVs from the database in an EPICS waveform. This waveform is hard-coded to contain 64,000 characters.

If many IOCs have been started, for example by running the IOC system tests, the DB server can no longer serve all PVs as the length of the list of PVs now exceeds the size of the waveform. This breaks the "select block" mechanism in the GUI.

This was spotted on my machine, as I have recently run all of the IOC system tests, and was also confirmed via the logs on reno, which runs the IOC tests. An error comes up in the GUI whenever it tries to read this "bad" PV, which appears to be about once per second.

The error in the GUI is:

Exception occured, code: gov.aps.jca.CAStatus[TOLARGE=9,WARNING=0]=The requested transfer is greater than available memory or EPICS_CA_MAX_ARRAY_BYTES, message: 'unable to fit read subscription update response into server's buffer'.

And the corresponding error in the DBSVR log is:

[1570125548.35] MAJOR: Too much data to encode PV TE:NDW1799:CS:BLOCKSERVER:PVS:ALL. Current size is 64000 characters but 85786 are required
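To make the failure mode concrete, here is a minimal sketch of a size check against a fixed-length waveform. It is not the DBSVR code: the JSON, zlib and hex encoding in `encode_pv_list` is an assumption for illustration (the issue does not say how the PV list is encoded), and the names `WAVEFORM_SIZE`, `encode_pv_list` and `check_fits` are hypothetical. Only the 64,000 character limit and the wording of the message come from the report above.

```python
import json
import zlib
from binascii import hexlify

# Hard-coded waveform length quoted in the issue.
WAVEFORM_SIZE = 64000


def encode_pv_list(pv_names):
    """Encode a list of PV names as bytes.

    ASSUMPTION: JSON -> zlib -> hex is used purely for illustration;
    the real DBSVR encoding is not described in the issue.
    """
    raw = json.dumps(pv_names).encode("utf-8")
    return hexlify(zlib.compress(raw))


def check_fits(pv_name, payload, max_size=WAVEFORM_SIZE):
    """Return (fits, message) for a payload destined for a fixed-size waveform."""
    required = len(payload)
    if required > max_size:
        message = (
            "Too much data to encode PV {}. Current size is {} characters "
            "but {} are required".format(pv_name, max_size, required)
        )
        return False, message
    return True, ""
```

A check like this only reports the overflow. Any actual fix (a larger waveform, splitting the list across several PVs, or pruning IOCs that are no longer relevant) is a separate design decision.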

Acceptance criteria:

Notes:

ChrisM-S commented 5 years ago

Should we be looking within EPICS (7, I guess?) for a mechanism to cache our multiple block structures (say on the order of 100-1000 blocks) locally (in memory) on both client and server for quick access? The memory load should be trivially low, i.e. 64 kB now, 64 MB in a few years.

This would allow us to keep using client/server communication across the network efficiently. Direct SQL access to MySQL, whilst still a good home for loading blocks and writing them back on the server, will definitely die under any sort of individually looped query load. Likewise, anything that does not provide a fast (in-memory) local cache on the client will die under any sort of sustained iteration or search, because of the network latency of communicating with the server.

We could also look outside EPICS to something like MongoDB, or any other associative (name => block value) mapping DBMS designed to back web sites (in memory) and accessed through a REST model (or at least use this as a design pattern to build the same thing within EPICS).
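As a rough illustration of the client-side, in-memory name-to-value cache idea, here is a minimal sketch. Everything in it is hypothetical: `BlockCache`, the `fetch` callable (standing in for whatever network read would be used, e.g. a Channel Access get or a REST call) and the time-to-live policy are assumptions, not an existing IBEX or EPICS API.

```python
import time


class BlockCache:
    """Tiny in-memory cache mapping a block name to its last fetched value.

    ASSUMPTION: `fetch` is any callable that performs the slow network
    read for one name; entries are reused until `ttl_seconds` has passed.
    """

    def __init__(self, fetch, ttl_seconds=5.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # name -> (timestamp, value)

    def get(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now - entry[0] < self._ttl:
            # Fresh enough: answer locally without touching the network.
            return entry[1]
        # Missing or stale: fetch once and remember the result.
        value = self._fetch(name)
        self._entries[name] = (now, value)
        return value
```

Repeated lookups or searches over the same names within the time-to-live are then served from local memory, so network latency is paid at most once per name per interval.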

Just some thoughts whilst they come to mind!