SuperNEMO-DBD / Falaise

Simulation, Reconstruction and Analysis Software for the SuperNEMO Experiment
http://supernemo.org/Falaise
GNU General Public License v3.0

Design and implement Conditions Data access in Falaise #90

Open drbenmorgan opened 6 years ago

drbenmorgan commented 6 years ago

Kicked off by @emchauve 's mail to the Software list:

Let me open this important thread related to database access in Falaise.

On the calorimeter side, we are getting ready to test and prototype the usage of the database in Falaise modules. Examples in hand are: using the energy resolution specific to each OM in simulation; measuring and storing calibration constants per OM; retrieving these constants and propagating them at the calibration stage. After discussion in the Analysis Board, there was a consensus that the calorimeter would be a good "guinea pig" for putting such a database service in place!

Has there already been any thought on the topic and on the approach we want? Or perhaps developments already done?

One important point to consider is if/when we need offline functionality in the service. By offline, I mean getting some information without internet access to the DB servers @ CCIN2P3. It will clearly depend on the usage and on the amount of data itself. Take the case of energy resolution per OM (520 parameters): it is needed for baseline simulation, and offline access can be justified as these parameters are going to be static. A simple properties.conf file of 520 lines read by the DB service is feasible (these parameters were measured once during production; in the worst case, we might have another measurement at LSM with 207Bi electrons, and therefore one upgrade to properties_v2.conf). This offline approach is going to be tough (but not impossible) for most time-dependent parameters, like PMT gain or tracker time, which will change every week or month for individual OMs or cells.

So.... what should we do?! :)
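The flat-file fallback described in the mail above could be prototyped along the following lines. This is only a sketch under stated assumptions: the two-column file layout, the `ResolutionTable` alias and the functions `loadResolutions` and `resolutionFor` are hypothetical names introduced for illustration, and the layout is not the Bayeux properties syntax.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical flat-file layout (NOT the Bayeux properties syntax):
//   # comment
//   <om_index> <energy_resolution>
using ResolutionTable = std::map<int, double>;

// Read one "<om_index> <energy_resolution>" pair per line from any stream.
ResolutionTable loadResolutions(std::istream& in) {
  ResolutionTable table;
  std::string line;
  while (std::getline(in, line)) {
    // Skip blank lines and lines whose first non-blank character is '#'
    const auto first = line.find_first_not_of(" \t\r");
    if (first == std::string::npos || line[first] == '#') continue;

    std::istringstream fields(line);
    int om = 0;
    double resolution = 0.0;
    if (!(fields >> om >> resolution)) {
      throw std::runtime_error("malformed line: " + line);
    }
    table[om] = resolution;
  }
  return table;
}

// Look up one OM; throws std::out_of_range if the OM is not in the table.
double resolutionFor(const ResolutionTable& table, int om) {
  return table.at(om);
}
```

Reading from a `std::istream` rather than a file name keeps the parser testable without touching the disk; a `std::ifstream` opened on the 520-line file can be passed in unchanged.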

drbenmorgan commented 6 years ago

Some initial considerations:

drbenmorgan commented 5 years ago

Adding additional people involved in the discussion.

pfranchini commented 5 years ago

Has any other discussion happened outside of this issue's thread?

drbenmorgan commented 5 years ago

I'll just mention PR #154 here to cross-reference it. A service likely forms the basis of accessing Conditions Data inside the pipeline.

@pfranchini, I think @robobre will be setting up a meeting next week for an update/discussion. I'll forward/cc you the details when they're out.

emchauve commented 4 years ago

I am finally into this issue! (branch add-database-service on my Falaise fork). I have written the base for the database manager and the database service, and am now working on the DB connection.

I am facing two issues due to my lack of expertise with brew and CMake; @drbenmorgan, you might be able to guide me:

-- I am using MySQL++, which provides a C++ wrapper for mysql-client, installed with brew, for which openssl@1.1 is required. Bayeux also relies on openssl, but on the default version (1.0), and I suspect I am getting conflicts at runtime because of that. How should this be handled? Should I switch all Bayeux dependencies to openssl@1.1? Or is there a way to work with both under brew?

-- How can I add the new dependencies for Falaise (MySQL++ and mysql-client) to CMakeLists.txt in a clean way, knowing that mysql-client is installed in $BREW/opt/mysql-client without a CMake find module? There is at least a pkgconfig script for mysql-client, but nothing for mysql++!

Thanks for your help

drbenmorgan commented 4 years ago

-- I am using MySQL++, which provides a C++ wrapper for mysql-client, installed with brew, for which openssl@1.1 is required. Bayeux also relies on openssl, but on the default version (1.0), and I suspect I am getting conflicts at runtime because of that. How should this be handled? Should I switch all Bayeux dependencies to openssl@1.1? Or is there a way to work with both under brew?

Whilst it's a C-only API (but a good one), could mariadb-c-connector be used instead? It's a much lighter library than mysql plus another lib on top of that.

-- How can I add the new dependencies for Falaise (MySQL++ and mysql-client) to CMakeLists.txt in a clean way, knowing that mysql-client is installed in $BREW/opt/mysql-client without a CMake find module? There is at least a pkgconfig script for mysql-client, but nothing for mysql++!

If they have pkgconfig (.pc) files then CMake's hook to pkgconfig can be used. There's an example of use here:

https://gitlab.cern.ch/lhcb/GitCondDB/blob/master/CMakeLists.txt#L46

and linking here:

https://gitlab.cern.ch/lhcb/GitCondDB/blob/master/CMakeLists.txt#L127

It should be a case of doing:

find_package(PkgConfig REQUIRED)
# NOTE: the module name must match the name of the installed .pc file
pkg_check_modules(MYNAME REQUIRED IMPORTED_TARGET mysql-client)
...
target_link_libraries(DBService PRIVATE PkgConfig::MYNAME)

emchauve commented 4 years ago

Thanks for the suggestion. I will investigate mariadb-c-connector; however, its formula still requires openssl@1.1!

fmauger commented 4 years ago

Why should we use MySQL or MariaDB? The AMI group @ LPSC is supposed to provide us with a C++ API to access the SN database system independently of the underlying technology. With the approach you are initiating, you freeze SN code with a given technology and immediately face implementation details rather than considering the problem with some perspective:

fmauger commented 4 years ago

The title of this issue proposed by drbenmorgan is : "Design and implement Conditions Data access in Falaise" not: "Implement a DB system in Falaise".

emchauve commented 4 years ago

The title was modified, but the initial topic was indeed implementing DB access in Falaise! @drbenmorgan Would it be possible to add a formula for building the C++ AMI API library to our Homebrew? (git repo is: https://github.com/ami-team/cami)

emchauve commented 4 years ago

  • What are the use cases you want to consider first ?

The simplest case: energy calibration of calorimeter OMs with one parameter (Energy = a × Charge).
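As an illustration of this one-parameter use case, a per-OM linear calibration might be sketched as follows. The class name `LinearEnergyCalibration`, its methods, and the use of a plain integer as OM identifier are assumptions made for the example, not existing Falaise code.

```cpp
#include <map>
#include <stdexcept>

// Hypothetical holder of one calibration constant per optical module (OM),
// applying energy = a * charge. Units are those used when the constant
// was measured.
class LinearEnergyCalibration {
 public:
  // Store (or overwrite) the constant "a" for one OM.
  void setConstant(int om, double a) { constants_[om] = a; }

  // Convert a charge to an energy; throws if the OM has no constant.
  double energy(int om, double charge) const {
    const auto it = constants_.find(om);
    if (it == constants_.end()) {
      throw std::out_of_range("no calibration constant for this OM");
    }
    return it->second * charge;
  }

 private:
  std::map<int, double> constants_;
};
```

In a real service the constants would come from the conditions backend rather than from `setConstant`; the lookup-then-multiply step is the part this sketch is meant to show.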

  • What datamodels should be implemented to address calibration and characterization of many detection units ?

We are working on it in parallel. The idea, and the change from the current model, is to be able to handle different versions of charge and energy (e.g. in a vector) computed/calibrated with different methods. Such dynamic data models would not require modification of members, just the addition of an enum value for indexing the new version. I hope that makes sense?
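One possible reading of this enum-indexed model is sketched below, assuming C++17 for `std::optional`. The names `EnergyMethod`, `CalibratedHit` and the two example methods are hypothetical; the actual data model under development may differ.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical list of calibration methods. Supporting a new method means
// adding an enumerator before kCount; no data member has to change.
enum class EnergyMethod : std::size_t {
  kRawCharge = 0,
  kLinearFit,
  kCount  // must stay last: number of methods
};

class CalibratedHit {
 public:
  CalibratedHit() : energies_(static_cast<std::size_t>(EnergyMethod::kCount)) {}

  // Record the energy obtained with a given method.
  void setEnergy(EnergyMethod m, double e) { energies_[index(m)] = e; }

  // Empty optional if that method has not been run on this hit.
  std::optional<double> energy(EnergyMethod m) const {
    return energies_[index(m)];
  }

 private:
  static std::size_t index(EnergyMethod m) {
    return static_cast<std::size_t>(m);
  }
  std::vector<std::optional<double>> energies_;  // one slot per method
};
```

Using `std::optional` per slot lets a reader distinguish "not computed with this method" from a genuine zero energy.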

  • What data are stable, what data are updated regularly and how this patterns the user interface ?

This question will indeed arise for all data to be stored, but I am not sure I understand the point, because we need the interface anyway to get both stable and updatable data?

drbenmorgan commented 4 years ago

The title was modified, but the initial topic was indeed implementing DB access in Falaise! @drbenmorgan Would it be possible to add a formula for building the C++ AMI API library to our Homebrew? (git repo is: https://github.com/ami-team/cami)

As far as I am aware, AMI is not the conditions database! It isn't in ATLAS, see this paper, and this one, especially 2.2.

With the approach you are initiating, you freeze SN code with a given technology and immediately face implementation details rather than considering the problem with some perspective:

Yes and no. I agree that the fundamental issue is the client API, so that can and should be mocked in with what we know to date, e.g.

class CondDBService {
 public:
  // ... what member functions do users of the service need ...

  // This is probably one of them
  OMParameter getOMParameter(OMID x, IOV i) const;
};

What goes on in the implementation will always be technology dependent, but it is effectively defined for us as SQL (by CC-Lyon), though the LHCb GitCondDB remains an option (and likely will be used for geometry etc). I'm therefore not averse to the use of SQL libraries at this stage, provided that they are used only as an implementation detail.
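To make the split between client API and backend concrete, the class sketched above can be turned into an abstract interface with a trivial in-memory implementation standing in for SQL. Everything here beyond the `getOMParameter(OMID, IOV)` signature is an assumption: the definitions of `OMID`, `IOV` and `OMParameter`, the inclusive interval convention, and the `InMemoryCondDB` class are placeholders for illustration only.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical placeholder types; the real ones are still to be designed.
using OMID = int;
struct IOV { long since; long until; };  // inclusive validity range, e.g. run numbers
struct OMParameter { double value; };

// Client-facing interface: users of the service see only this.
class CondDBService {
 public:
  virtual ~CondDBService() = default;
  virtual OMParameter getOMParameter(OMID x, IOV i) const = 0;
};

// Stand-in backend for prototyping and tests. An SQL- or file-based
// backend would derive from the same interface.
class InMemoryCondDB : public CondDBService {
 public:
  void add(OMID om, IOV valid, OMParameter p) {
    entries_.push_back({om, valid, p});
  }

  // Return the first stored parameter whose validity range covers
  // the requested interval for this OM.
  OMParameter getOMParameter(OMID x, IOV i) const override {
    for (const auto& e : entries_) {
      if (e.om == x && e.valid.since <= i.since && i.until <= e.valid.until) {
        return e.param;
      }
    }
    throw std::out_of_range("no condition for this OM and IOV");
  }

 private:
  struct Entry { OMID om; IOV valid; OMParameter param; };
  std::vector<Entry> entries_;
};
```

Client code that holds only a `const CondDBService&` does not need to change when the in-memory stand-in is swapped for a real backend.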

@emchauve one thought: could you use the SQLite library for prototyping? It's very simple, has a similar API to MySQL/MariaDB, and, as it's file based, can be used offline.

emchauve commented 4 years ago

In fact, the AMI client API is really ultra light, a few hundred lines of code (https://github.com/ami-team/cami/), and the admin web interface is very convenient for handling a common set of users and privileges over different DBs.

There are a few different output formats provided by the server: text, CSV, JSON or XML. You can give it a try with the GetSessionInfo command (the only command available to the guest user): https://ami-supernemo.in2p3.fr/app/?subapp=command

The most interesting output format provided by the server would be JSON, I guess (?), for which we will need a parser. Do you have any feedback or suggestions on that?

drbenmorgan commented 4 years ago

For a JSON parser, the easiest is probably nlohmann-json.

Nevertheless, why would we use AMI to access the CondDB from Falaise? Could we get confirmation from the AMI developers that this is how it's used in ATLAS to access (from Athena, their Falaise equivalent) actual conditions from the Oracle/COOL/SQLite DBs? It just feels awkward and inefficient to use a web API that will ultimately just query the DB at Lyon.

fmauger commented 2 years ago

TODO: specifications for: