cms-gem-daq-project / ctp7_modules


Singletons and guards for RPC methods #148


lmoureaux commented 5 years ago

Brief summary of issue

The RPC modules require some services to be present at runtime (memsvc, LMDB, maybe log4cplus). They are treated in different ways:

I propose to use the following standard patterns:

With this design, I think that we can get rid of the localArgs struct completely, which will simplify module interfaces.

[Note: I'm not 100% sure that the named db can be shared. A peek at the doc is required, though it seemed clear when I looked a while ago.]
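A minimal sketch of the kind of singleton intended here, assuming the raw LMDB C API (the class name and database path are hypothetical):

```cpp
// Hypothetical sketch: one lazily-initialized singleton per service, so RPC
// methods reach shared services directly instead of through localArgs.
#include <lmdb.h>
#include <stdexcept>

class LmdbService {
public:
    static LmdbService &instance() {
        static LmdbService svc;  // created on first use; thread-safe since C++11
        return svc;
    }
    MDB_env *env() { return m_env; }

private:
    LmdbService() {
        if (mdb_env_create(&m_env) != 0)
            throw std::runtime_error("mdb_env_create failed");
        // hypothetical path; MDB_NOTLS matters if readers span threads
        if (mdb_env_open(m_env, "/mnt/persistent/gemdaq/lmdb",
                         MDB_RDONLY | MDB_NOTLS, 0664) != 0)
            throw std::runtime_error("mdb_env_open failed");
    }
    ~LmdbService() { mdb_env_close(m_env); }
    LmdbService(const LmdbService &) = delete;
    LmdbService &operator=(const LmdbService &) = delete;

    MDB_env *m_env = nullptr;
};
```

RPC methods would then reach the environment through LmdbService::instance() instead of a localArgs pointer.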

Types of issue

Expected Behavior

Calls to shared services should be transparent.

Current Behavior

The localArgs struct is a pain (though it's better than passing around many arguments).

Context (for feature requests)

We are essentially rewriting the RPC modules anyway.

jsturdy commented 5 years ago

As mentioned in the meeting today, having a standalone lmdbsvc, rather than an LMDB instance per rpcsvc instance, could be another possibility (I'm not sure how the singleton design is supposed to work for this use case).

Only one of the functions requires write access to the LMDB (update_lmdb), and it could do the appropriate locking while it is being run (and maybe even restart when done).

The log4cplus logger could go into rpcsvc as the current LOGGER is (requires that we collaborate with UW on the maintenance of rpcsvc), and since it's log4cplus, any standalone program would simply have to create its own logger.

I don't have a good view of what changes are needed for memsvc...

lpetre-ulb commented 5 years ago

> As mentioned in the meeting today, having a standalone lmdbsvc, rather than an LMDB instance per rpcsvc instance, could be another possibility (I'm not sure how the singleton design is supposed to work for this use case).

What would be the real advantage of an lmdbsvc daemon? LMDB is already designed to manage access to the unique resource. Moreover, I fear that the DB access performance would drop because of the IPC communication.

> Only one of the functions requires write access to the LMDB (update_lmdb), and it could do the appropriate locking while it is being run (and maybe even restart when done).

There is no need to manage the locking ourselves; LMDB already locks the database. Indeed, every new transaction works on a snapshot of the database, and the underlying data can be deleted only once the last read transaction is closed. We only need to be careful to open the database with the right options if we play with threads.
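A minimal illustration of this point, assuming the raw LMDB C API; MDB_NOTLS is the kind of "right option" meant above when read transactions are used from multiple threads:

```cpp
// Minimal illustration, assuming the raw LMDB C API: every read transaction
// works on a consistent snapshot; no locking on our side is needed.
#include <lmdb.h>

void lookupExample(MDB_env *env)  // env opened with MDB_RDONLY | MDB_NOTLS
{
    MDB_txn *rtxn = nullptr;
    mdb_txn_begin(env, nullptr, MDB_RDONLY, &rtxn);  // snapshot starts here

    MDB_dbi dbi;
    mdb_dbi_open(rtxn, nullptr, 0, &dbi);
    // ... mdb_get() lookups see a stable view, even while update_lmdb writes ...

    mdb_txn_abort(rtxn);  // read-only transactions are simply aborted when done
}
```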

Also, I think we should consider a standalone program to update the database so that it could be run in a post-transaction RPM hook.
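A sketch of what such a standalone updater could look like; the path, key, and value encoding are made up, and a real tool would parse the address table instead:

```cpp
// Hypothetical standalone updater: open the environment read-write and replace
// an entry inside a single write transaction. LMDB serializes writers itself.
#include <lmdb.h>

int main()
{
    MDB_env *env = nullptr;
    mdb_env_create(&env);
    mdb_env_open(env, "/mnt/persistent/gemdaq/lmdb", 0, 0664);  // hypothetical path

    MDB_txn *txn = nullptr;
    mdb_txn_begin(env, nullptr, 0, &txn);  // write transaction
    MDB_dbi dbi;
    mdb_dbi_open(txn, nullptr, 0, &dbi);

    char k[] = "GEM_AMC.GEM_SYSTEM.BOARD_ID";        // made-up register name
    char v[] = "0x00000000|r|Board identification";  // made-up encoded entry
    MDB_val key{sizeof(k) - 1, k};
    MDB_val val{sizeof(v) - 1, v};
    mdb_put(txn, dbi, &key, &val, 0);

    mdb_txn_commit(txn);  // readers started earlier still see the old snapshot
    mdb_env_close(env);
    return 0;
}
```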

> The log4cplus logger could go into rpcsvc as the current LOGGER is (requires that we collaborate with UW on the maintenance of rpcsvc), and since it's log4cplus, any standalone program would simply have to create its own logger.

Indeed, the best bet is to initialize the log4cplus logger in rpcsvc, after the fork. However, it should be possible to create the logger in the modules, with the drawback that each module would create its own logger.

Also, log4cplus comes with its own singleton system, so I think it is outside the scope of this PR.
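For illustration, a sketch of the per-module fallback with log4cplus; the logger name is hypothetical, and the one-time configuration is what would ideally live in rpcsvc after the fork:

```cpp
// Sketch of the per-module approach with log4cplus. Note that
// Logger::getInstance() already acts as a named singleton.
#include <log4cplus/configurator.h>
#include <log4cplus/logger.h>
#include <log4cplus/loggingmacros.h>

void moduleInit()
{
    // once per process (ideally in rpcsvc, not repeated in every module)
    log4cplus::BasicConfigurator config;
    config.configure();

    // each module fetches its own named logger
    log4cplus::Logger logger =
        log4cplus::Logger::getInstance(LOG4CPLUS_TEXT("ctp7_modules.amc"));
    LOG4CPLUS_INFO(logger, LOG4CPLUS_TEXT("module loaded"));
}
```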

> I don't have a good view of what changes are needed for memsvc...

The main problem is that memsvc provides a C API with a global static handle. I would convert the memhub library to a C++ library (in xHAL?) with a single class that manages the memsvc handle. Then it is trivial to have one singleton of this class.
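A hypothetical sketch of such a wrapper class, assuming UW's memsvc C API (memsvc_open/memsvc_read/memsvc_close); the class name and header are assumptions, not a settled design:

```cpp
// Hypothetical wrapper: a RAII class owning the memsvc handle, exposed as a
// singleton so the handle is opened once per process and freed exactly once.
#include <libmemsvc.h>  // assumption: UW's memsvc C API header
#include <cstdint>
#include <stdexcept>

class Memhub {
public:
    static Memhub &instance() {
        static Memhub hub;  // one handle per process
        return hub;
    }
    void read(uint32_t addr, uint32_t words, uint32_t *out) {
        if (memsvc_read(m_handle, addr, words, out) != 0)
            throw std::runtime_error(memsvc_get_last_error(m_handle));
    }

private:
    Memhub() {
        if (memsvc_open(&m_handle) != 0)
            throw std::runtime_error("memsvc_open failed");
    }
    ~Memhub() { memsvc_close(&m_handle); }  // no leaked handle
    Memhub(const Memhub &) = delete;
    Memhub &operator=(const Memhub &) = delete;

    memsvc_handle_t m_handle;
};
```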

jsturdy commented 5 years ago

> As mentioned in the meeting today, having a standalone lmdbsvc, rather than an LMDB instance per rpcsvc instance, could be another possibility (I'm not sure how the singleton design is supposed to work for this use case).

> What would be the real advantage of an lmdbsvc daemon? LMDB is already designed to manage access to the unique resource. Moreover, I fear that the DB access performance would drop because of the IPC communication.

The only benefit (that I can see) of doing anything to the current treatment of the LMDB is the reduction in the number of copies hanging around as resources become tight, but otherwise I don't see where we gain. If there were no performance cost, I'd just have the LMDB instantiation happen inside the lowest-level function that uses it, and by that mechanism remove it from being passed around everywhere (which is the primary reason we should do something).
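A sketch of this alternative, again with the raw LMDB C API and a hypothetical path: the environment is created lazily inside the lowest-level accessor, so nothing needs to be passed around:

```cpp
// Equivalent in effect to the singleton sketched in the opening post, but
// hidden entirely behind the lowest-level accessor.
#include <lmdb.h>

MDB_env *getEnv()
{
    static MDB_env *env = [] {
        MDB_env *e = nullptr;
        mdb_env_create(&e);
        mdb_env_open(e, "/mnt/persistent/gemdaq/lmdb",  // hypothetical path
                     MDB_RDONLY | MDB_NOTLS, 0664);
        return e;
    }();  // runs once, on first call
    return env;
}
```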

> Only one of the functions requires write access to the LMDB (update_lmdb), and it could do the appropriate locking while it is being run (and maybe even restart when done).

> There is no need to manage the locking ourselves; LMDB already locks the database. Indeed, every new transaction works on a snapshot of the database, and the underlying data can be deleted only once the last read transaction is closed. We only need to be careful to open the database with the right options if we play with threads.

> Also, I think we should consider a standalone program to update the database so that it could be run in a post-transaction RPM hook.

We can (should?) have both, but until the FW is deployed as an RPM I don't feel a strong desire to change the standard operating procedure of using gem_reg to perform the update remotely.

> The log4cplus logger could go into rpcsvc as the current LOGGER is (requires that we collaborate with UW on the maintenance of rpcsvc), and since it's log4cplus, any standalone program would simply have to create its own logger.

> Indeed, the best bet is to initialize the log4cplus logger in rpcsvc, after the fork. However, it should be possible to create the logger in the modules, with the drawback that each module would create its own logger.

Yes, this was the drawback I saw of going this route. I don't know the performance hit of this, especially if logging is enabled in the readReg function, but a global logger does strike my fancy more than this...

> Also, log4cplus comes with its own singleton system, so I think it is outside the scope of this PR.

> I don't have a good view of what changes are needed for memsvc...

> The main problem is that memsvc provides a C API with a global static handle. I would convert the memhub library to a C++ library (in xHAL?) with a single class that manages the memsvc handle. Then it is trivial to have one singleton of this class.

Is this really a problem? Is there a specific benefit of converting the global to a singleton in this case?

OK, I'll wait for the minimal working demonstration of these three things (my starting point was converting utils+amc). At the end of the day, I'm :+1: in favour of simplification and robustification!

lpetre-ulb commented 5 years ago

> The only benefit (that I can see) of doing anything to the current treatment of the LMDB is the reduction in the number of copies hanging around as resources become tight, but otherwise I don't see where we gain. If there were no performance cost, I'd just have the LMDB instantiation happen inside the lowest-level function that uses it, and by that mechanism remove it from being passed around everywhere (which is the primary reason we should do something).

I don't understand what resource gains would be achieved by limiting the number of LMDB environments. Since LMDB is memory-mapped, it only consumes virtual memory in the process memory space; the physical memory consumption is constant however many database "connections" there are.

> I don't have a good view of what changes are needed for memsvc...

> The main problem is that memsvc provides a C API with a global static handle. I would convert the memhub library to a C++ library (in xHAL?) with a single class that manages the memsvc handle. Then it is trivial to have one singleton of this class.

> Is this really a problem? Is there a specific benefit of converting the global to a singleton in this case?

I was trying to find a good reason instead of just over-designing this part... In the end, the problem is that the memsvc handle is defined in utils.so, but initialized in every RPC module with memhub_open. Therefore, we get a memory leak since the handle is never freed. Also, /dev/mem is possibly memory-mapped more than once (to be verified).

Sure, the memory leak (confirmed with valgrind in the GLIB emulator) is fully contained, but since we can do better...