JeffersonLab / halld_recon

Reconstruction for the GlueX Detector
6 stars 8 forks source link

Protection needed for reading values from the calibration database? #379

Open mashephe opened 4 years ago

mashephe commented 4 years ago

The bug report submitted by Igal suggests a vulnerability in our common method of reading constants from the database. Code typically looks like this:

map<string, double> fcal_parms; loop->GetCalib("FCAL/fcal_parms", fcal_parms);

One then uses the [] operator to lookup elements of the map, e.g.:

double my_const = fcal_parms[ "my_const" ];

However, if "my_const" is not in the table, then by the STL standard, it will be created in the map and initialized to a value of zero. This means that a typo in the string for any constant lookup will not generate an error. A value of zero will be substituted for the constant and the code will run fine.

The bug Igal noticed has probably been in the code for years -- putting in a value of zero is just not wrong enough to be obviously wrong in many cases. This seems like a potentially dangerous trap that we should consider eliminating.

There are probably a number of ways to do it, but I would be inclined to do something along these lines:

I think this is probably a JANA issue, right?

If so, then it may generate a backwards compatibility annoyance, but I think that can be handled. If the new JANA constant container class is called JConstantMap then one needs to change every constant lookup from map<string, double> -> JConstantMap and probably add a line like:

ifndef NEW_JANA_CONST

typedef JConstantMap map<string,double>

endif

To a header on the halld_recon side. And also long as the new JANA version has

define NEW_JANA_CONST

then the code will continue to compile and run with old JANA versions as it does now.

sdobbs commented 4 years ago

I'll let @faustus123 handle the JANA side of this discussion, but a straightforward short-term solution would be to switch from using map::[] to map::at(), which throws an exception if the requested key is not in the map.

mashephe commented 4 years ago

Yes, but that requires users are aware of the pitfall and willingly adopt this practice. It will be hard to encourage people to do this and impossible to enforce compliance. One can in principle search/replace through the code now, but it won’t prevent the danger from re-emerging in the future.

Matt

On May 22, 2020, at 8:19 AM, Sean Dobbs notifications@github.com wrote:

 I'll let @faustus123 handle the JANA side of this discussion, but a straightforward short-term solution would be to switch from using map::[] to map::at(), which throws an exception if the requested key is not in the map.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

rjones30 commented 4 years ago

What I do in HDGeant4 is I always do either a Find(key) followed by a test if the (result == npos) or else using the .at(key) lookup method that Sean recommends. Either pattern works. I have not searched every corner of halld_recon, but I had the general impression that others were following these patterns as well, since they have been standard STL usage patterns for a long time. The fact that they are standard patterns does not mean everyone uses them, of course.

-Richard Jones

On Fri, May 22, 2020 at 8:24 AM Matthew Shepherd notifications@github.com wrote:

Yes, but that requires users are aware of the pitfall and willingly adopt this practice. It will be hard to encourage people to do this and impossible to enforce compliance. One can in principle search/replace through the code now, but it won’t prevent the danger from re-emerging in the future.

Matt

On May 22, 2020, at 8:19 AM, Sean Dobbs notifications@github.com wrote:

 I'll let @faustus123 handle the JANA side of this discussion, but a straightforward short-term solution would be to switch from using map::[] to map::at(), which throws an exception if the requested key is not in the map.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/JeffersonLab/halld_recon/issues/379#issuecomment-632665244, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3YKWGJXTY5PSUWAE46NQLRSZVJNANCNFSM4NHVWO5A .

mashephe commented 4 years ago

Hi Richard,

Indeed what you describe is a safe method, but I would bet that people would rather not be bothered with typing all those extra lines. My experience is that people are happy when the code compiles and runs and get syntax help from copy and pasting compiler errors into google.

The most frequent reason that people resort to at() or find() to retrieve an iterator is because they are forced to. Those two methods are const methods of the STL map. The operator[] is not a const method because its behavior is to invoke the default constructor of the second item in the pair if the key does not exist in the map and add the key to the map. It is intended to be used to insert items as well as retrieve items. It is nice, but a bit of a double-edged sword.

I agree that people who have significant experience with STL may be inclined to code as you and Sean suggest, but there is nothing to encourage or require this. The common structure in our code deals with the map to constants as a local variable.

I see GetCalib is called about 100 times in halld_recon. And there are several data structures besides map so I don't know how big of a potential issues it is.

Here are a few examples that come to the top of a search for GetCalib:

std::map<string,double> result1; loop->GetCalib("/PHOTON_BEAM/endpoint_energy", result1); if (result1.find("PHOTON_BEAM_ENDPOINT_ENERGY") == result1.end()) { std::cerr << "Error in DTAGHGeometry constructor: " << "failed to read photon beam endpoint energy " << "from calibdb at /PHOTON_BEAM/endpoint_energy" << std::endl; m_endpoint_energy_GeV = 0; } else { m_endpoint_energy_GeV = result1["PHOTON_BEAM_ENDPOINT_ENERGY"]; }

I saw several constructions like this in other pieces of code also, but a copy/paste error or misplaced character in the string in the last line will get a zero return value with no error. This is a good attempt at protecting oneself, but it got fumbled at the last line.

Or this:

std::vector<std::map<string,double> > result2; loop->GetCalib("/PHOTON_BEAM/hodoscope/scaled_energy_range", result2); if (result2.size() != kCounterCount) { jerr << "Error in DTAGHGeometry constructor: " << "failed to read photon beam scaled_energy_range table " << "from calibdb at /PHOTON_BEAM/hodoscope/scaled_energy_range" << std::endl; for (unsigned int i=0; i <= TAGH_MAX_COUNTER; ++i) { m_counter_xlow[i] = 0; m_counter_xhigh[i] = 0; } } else { for (unsigned int i=0; i < result2.size(); ++i) { int ctr = (result2[i])["counter"]; m_counter_xlow[ctr] = (result2[i])["xlow"]; m_counter_xhigh[ctr] = (result2[i])["xhigh"]; } }

Here the map lookups in the else are unprotected.

I looked through a handful of the many other examples, and I didn't find one that declared name the relevant constant as a const string variable and then used that variable to both do the map checking and reference the element. I also did not find one that used the iterator pointing to the map element that is returned by the find method to actually get the element -- they only be sure it is not the end of the map. I would consider either of those to be best practice and safe.

My experience is that we can't assume all contributors to code are C++ or STL experts. Almost no students have formal training in this stuff these days. In that case, when there is a large code base with many contributors it is useful to try to put certain protections in place. It is a design philosophy... not right or wrong. Hence this is a feature suggestion. Maybe the risk is sufficiently small that it doesn't merit the time required to change it and the ~100 different places in the code. I can only point to one bug that was quietly lurking for quite some time because zero wasn't wrong enough to be obviously wrong.

(This type of stuff was drilled into my head as a grad student by Chris Jones, who was one of the architects of the CLEO-III analysis framework. Apparently burning months of effort chasing reproducibility issues and mysterious memory corruption problems in CLEO-II was common, and it left a few people with very strong opinions about writing a robust framework that non-experts could to work in. It is kind of like putting plastic plugs over electrical outlets for children... it requires a few extra steps for those who want to use appliances, but avoids accidentally stumbling into an undesirable situation.)

Matt

On May 22, 2020, at 9:37 AM, Richard Jones notifications@github.com<mailto:notifications@github.com> wrote:

What I do in HDGeant4 is I always do either a Find(key) followed by a test if the (result == npos) or else using the .at(key) lookup method that Sean recommends. Either pattern works. I have not searched every corner of halld_recon, but I had the general impression that others were following these patterns as well, since they have been standard STL usage patterns for a long time. The fact that they are standard patterns does not mean everyone uses them, of course.

-Richard Jones

On Fri, May 22, 2020 at 8:24 AM Matthew Shepherd notifications@github.com<mailto:notifications@github.com> wrote:

Yes, but that requires users are aware of the pitfall and willingly adopt this practice. It will be hard to encourage people to do this and impossible to enforce compliance. One can in principle search/replace through the code now, but it won’t prevent the danger from re-emerging in the future.

Matt

On May 22, 2020, at 8:19 AM, Sean Dobbs notifications@github.com<mailto:notifications@github.com> wrote:

 I'll let @faustus123 handle the JANA side of this discussion, but a straightforward short-term solution would be to switch from using map::[] to map::at(), which throws an exception if the requested key is not in the map.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/JeffersonLab/halld_recon/issues/379#issuecomment-632665244, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3YKWGJXTY5PSUWAE46NQLRSZVJNANCNFSM4NHVWO5A .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/JeffersonLab/halld_recon/issues/379#issuecomment-632694301, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ADG5L2GGQCD7F4VAQTP5HZ3RSZ5YDANCNFSM4NHVWO5A.