root-project / root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
https://root.cern

[RF] regex_error when selecting pdf components to plot #7115

Closed dlanci closed 3 years ago

dlanci commented 3 years ago

I’ve recently updated my ROOT version to v6.20.06 and my once working code snippet to plot several pdf components on the same canvas broke.

No matter if I select the pdf components by object reference or by name, i.e. by:

modelTot.plotOn(frame, RooFit::Components("modelBkgTotTrig*"),
FillColor(93), LineColor(93), DrawOption("F"));

or by:

modelTot.plotOn(frame, RooFit::Components(RooArgSet(expo)),
FillColor(93), LineColor(93), DrawOption("F"));

I get:

[#1] INFO:Plotting -- RooAbsPdf::plotOn(modelTot0) directly selected PDF components: (modelBkgTotTrig0)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(modelTot0) indirectly selected PDF components: ()
terminate called after throwing an instance of 'std::regex_error'
  what():  Unexpected character in brace expression.
Aborted (core dumped)

It looks like the components are found, but then the regex_error appears. Is this a known issue, or should I do something different to select the pdfs to plot?

Thanks, Davide

dlanci commented 3 years ago

I investigated the problem in more depth, and it seems that the regex_error is thrown as soon as I plot a composite pdf or call RooFit::Components. For instance, taking the previous example, with expo being a simple exponential,

RooExponential expo(("modelBkgTot"+trigCatS).c_str(), ("modelBkgTot"+trigCatS).c_str(), *B_plus_DTFM_M_zero,  lambda);
expo.plotOn(frame);

works, but as soon as I do for example:

RooAddPdf modelSigTot(("modelSigTot"+trigCatS).c_str(), ("modelSigTot"+trigCatS).c_str(), RooArgList(modelSig0, modelSig1, modelSig2), RooArgList(frac0gamma, frac1gamma));
modelSigTot.plotOn(frame);

I get:

terminate called after throwing an instance of 'std::regex_error'
  what():  Unexpected character in brace expression.
Aborted (core dumped)

eguiraud commented 3 years ago

Hi @dlanci , could you please check whether v6.22 is still affected? You can install it from conda/homebrew/snap/..., grab it from the LCG releases on lxplus or just run it in a docker container, see https://root.cern/install.

dlanci commented 3 years ago

Hi Enrico,

I installed the same ROOT version in a conda environment and I can plot without problems in PyROOT. Meanwhile, I also found that std::regex appears to be broken, among other problems, with the system compiler I use for my project on CentOS 7:

https://root-forum.cern.ch/t/std-regex-with-root-interpreter/35609

so it looks like the problem lies in how I compile my project rather than in ROOT itself. I'm going to have to investigate this further.

Thanks, Davide

eguiraud commented 3 years ago

Oh, yes std::regex is known to be broken in gcc 4.8 -- but that means we shouldn't be using it @lmoneta @hageboeck ?

hageboeck commented 3 years ago

Hi,

  1. Yes, we are not using std::regex when a broken gcc is used or when running on windows.
  2. There was a problem with regexes on Mac, which has been fixed in e0550db0059c17919dc91ad64f5d9d302999f6cc, and been backported to 6.22.04 in 6fecf68733cc675e55eb6ae815993fcac375827a.

@dlanci You didn't say if you are on Mac. Maybe you are seeing the same issue?

eguiraud commented 3 years ago

Ok the #ifdef is not catching this toolchain. Seems to be Centos7 from a previous message -- are you using the default system compiler or are you getting it from devtoolset?

dlanci commented 3 years ago

@hageboeck I'm on CentOS7, and using the default system compiler which is gcc 4.8.5.
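
The toolchain question above comes down to comparing a compiler version against the first release with a working std::regex. A minimal sketch of that comparison in plain Python, assuming GCC releases before 4.9 are the affected ones (the thread only states that 4.8 is broken); `parse_version` and `regex_usable` are hypothetical helpers, not part of ROOT or its build system:

```python
def parse_version(text):
    """Turn a dotted version string such as '4.8.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def regex_usable(gcc_version, first_good=(4, 9)):
    """Return True if gcc_version is at or above the assumed first working release.

    first_good is an assumption made for this sketch, not a value taken from ROOT.
    """
    return parse_version(gcc_version) >= first_good


for version in ["4.8.5", "4.9.0", "9.3.0"]:
    print(version, "->", "usable" if regex_usable(version) else "avoid std::regex")
```

Tuples compare element by element, so a three-part version is handled correctly against the two-part threshold.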

dlanci commented 3 years ago

I investigated the problem a bit more: I can plot components of a total PDF with my script only if the parameter names shown in the paramBox (from model->paramOn(frame)) contain only alphanumeric characters. I was setting their names from the LaTeX titles so that they would appear with LaTeX formatting on the plot.

So the regex_error is actually triggered by the names of the variables that I'm plotting. Did something change in how LaTeX typesetting in names or titles of RooFit variables is interpreted in v6.20.06?

Cheers, Davide
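
One quick way to check the observation above is to list which parameter names contain characters that regex engines treat specially. A minimal sketch in plain Python, assuming a hand-picked metacharacter set; `REGEX_METACHARS` and `risky_chars` are hypothetical names, not part of ROOT:

```python
# Characters with special meaning in common regex dialects.
# The exact set is an assumption for this sketch; engines differ in detail.
REGEX_METACHARS = set(".^$*+?()[]{}|\\")


def risky_chars(name):
    """Return the regex metacharacters found in name, in order of first appearance."""
    found = []
    for ch in name:
        if ch in REGEX_METACHARS and ch not in found:
            found.append(ch)
    return found


for name in ["N_{comb.}", "f_{0#gamma}", "#lambda", "Nsig"]:
    print(name, "->", risky_chars(name) or "no metacharacters")
```

A name being flagged does not by itself mean it triggers an exception; it only marks candidates worth renaming or escaping.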

hageboeck commented 3 years ago

RooFormula was overhauled, which seems to be the only place where regexes show up in diffs. I think there's two simple things we can do to trace it down in a second:

In the end, it can probably be solved by escaping the relevant characters in the name of an object.
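
The escaping idea can be illustrated with Python's standard `re` module. This is only a sketch of the technique: ROOT uses C++ std::regex, whose syntax and error behaviour differ from Python's, and `literal_pattern` and the name `s_{2}` are hypothetical, chosen for illustration:

```python
import re


def literal_pattern(name):
    """Compile a pattern that matches name character for character."""
    return re.compile(re.escape(name))


name = "s_{2}"  # hypothetical parameter name, not taken from the thread

# Unescaped, "_{2}" is read as "underscore repeated twice",
# so the name used as a pattern does not match itself.
print(re.fullmatch(name, name))

# Escaped, the braces are plain characters and the name matches itself.
print(literal_pattern(name).fullmatch(name) is not None)
```

Python silently reinterprets the braces here, whereas the std::regex implementation in the thread throws; in both cases escaping makes the name match literally.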

dlanci commented 3 years ago

Here is an example from my code. This is the snippet I use to retrieve the workspace (download here: example workspace) after the fit and plot the pdf components stacked onto each other.

import ROOT as r

fw=r.TFile("exampleWorkspace.root")
ws=fw.Get("workspaceDataFitForPlotTrig0")

#retrieve model and dataset 
B_plus_M=ws.var("B_plus_DTFM_M_zero")
data=ws.data("dataTrig0")
modelTot = ws.pdf("modelTot0")

B_plus_M.setBins(80);        
frame = B_plus_M.frame()
data.plotOn(frame)
modelTot.plotOn(frame, r.RooFit.LineColor(r.kRed))

#retrieve model variables
modelTot.getVariables()

# here is the list of variables I want to plot
for i, v in enumerate(modelTot.getVariables()):    
    if not v.isConstant() and v.GetName() != B_plus_M.GetName():
        print(v.GetTitle())
        v.SetName(v.GetTitle())

This will output the var titles I gave to the variables when I defined them, in LaTeX formatting and set them as var names for plotting:

N_{comb.}
f_{0#gamma}
f_{1#gamma}
N_{charm}/N_{strange}
N_{#pi}/N_{K}
#lambda
#Delta_{#mu}
N_{prc}
N_{sig}
s_{#sigma}

Now if I continue with:


trigCatPrc = "Trig0Phot-1";
trigCatS = "Trig0";

modelTot.plotOn(frame, r.RooFit.Name("Piee"), r.RooFit.Components("modelBkgTot"+trigCatS+",templateRarePrc"+trigCatPrc+",templateCharmPrc"+trigCatPrc+",modelPieeTot"+trigCatS), 
      r.RooFit.FillColor(93), r.RooFit.LineColor(93), r.RooFit.DrawOption("F"));
modelTot.plotOn(frame, r.RooFit.Name("RarePrc"), r.RooFit.Components("modelBkgTot"+trigCatS+",templateCharmPrc"+trigCatPrc+",templateRarePrc"+trigCatPrc), 
      r.RooFit.FillColor(95), r.RooFit.LineColor(95), r.RooFit.DrawOption("F"));
modelTot.plotOn(frame, r.RooFit.Name("CharmPrc"), r.RooFit.Components("modelBkgTot"+trigCatS+",templateCharmPrc"+trigCatPrc), 
      r.RooFit.FillColor(94), r.RooFit.LineColor(94), r.RooFit.DrawOption("F"));
modelTot.plotOn(frame, r.RooFit.Name("comb"), r.RooFit.Components("modelBkgTot"+trigCatS), 
      r.RooFit.FillColor(92), r.RooFit.LineColor(92), r.RooFit.DrawOption("F"));
modelTot.plotOn(frame, r.RooFit.Name("sig"), r.RooFit.LineColor(r.kBlack), r.RooFit.Components("modelSigTot"+trigCatS ), r.RooFit.LineStyle(2));

data.plotOn(frame);

modelTot.paramOn(frame, r.RooFit.Layout(0.65,0.85,0.95))
frame.getAttLine("modelTot"+str(trigCat)+"_paramBox").SetLineWidth(0);

I get the regex error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-543-f3142912b8db> in <module>
----> 1 modelTot.plotOn(frame, r.RooFit.Name("Piee"), r.RooFit.Components("modelBkgTot"+trigCatS+",templateRarePrc"+trigCatPrc+",templateCharmPrc"+trigCatPrc+",modelPieeTot"+trigCatS), 
      2       r.RooFit.FillColor(93), r.RooFit.LineColor(93), r.RooFit.DrawOption("F"));
      3 modelTot.plotOn(frame, r.RooFit.Name("RarePrc"), r.RooFit.Components("modelBkgTot"+trigCatS+",templateCharmPrc"+trigCatPrc+",templateRarePrc"+trigCatPrc), 
      4       r.RooFit.FillColor(95), r.RooFit.LineColor(95), r.RooFit.DrawOption("F"));
      5 modelTot.plotOn(frame, r.RooFit.Name("CharmPrc"), r.RooFit.Components("modelBkgTot"+trigCatS+",templateCharmPrc"+trigCatPrc), 

TypeError: none of the 2 overloaded methods succeeded. Full details:
  RooPlot* RooAbsPdf::plotOn(RooPlot* frame, const RooCmdArg& arg1 = RooCmdArg::none(), const RooCmdArg& arg2 = RooCmdArg::none(), const RooCmdArg& arg3 = RooCmdArg::none(), const RooCmdArg& arg4 = RooCmdArg::none(), const RooCmdArg& arg5 = RooCmdArg::none(), const RooCmdArg& arg6 = RooCmdArg::none(), const RooCmdArg& arg7 = RooCmdArg::none(), const RooCmdArg& arg8 = RooCmdArg::none(), const RooCmdArg& arg9 = RooCmdArg::none(), const RooCmdArg& arg10 = RooCmdArg::none()) =>
    Unexpected character in brace expression. (C++ exception of type regex_error)
  RooPlot* RooAbsPdf::plotOn(RooPlot* frame, RooLinkedList& cmdList) =>
    takes at most 2 arguments (6 given)

[#1] INFO:Plotting -- RooAbsPdf::plotOn(modelTot0) directly selected PDF components: (modelBkgTotTrig0,templateRarePrcTrig0Phot-1,templateCharmPrcTrig0Phot-1,modelPieeTotTrig0)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(modelTot0) indirectly selected PDF components: (cb1PieeTrig0,shiftedMeanPieeTrig0,scaledSigmaPieeTrig0,cb2PieeTrig0,scaledArPieeTrig0,modelPrcTotTrig0)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(modelTot0) directly selected PDF components: (modelBkgTotTrig0,templateRarePrcTrig0Phot-1,templateCharmPrcTrig0Phot-1,modelPieeTotTrig0)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(modelTot0) indirectly selected PDF components: (cb1PieeTrig0,shiftedMeanPieeTrig0,scaledSigmaPieeTrig0,cb2PieeTrig0,scaledArPieeTrig0,modelPrcTotTrig0)

If I remove the LaTeX formatting from some of the variables though (interestingly enough those that were formulas) and instead of doing v.SetName(v.GetTitle()) in the above for loop I do:

v=ws.var("sigTrig0")
v.SetName("Nsig")
v=ws.var("fracCharmPrcTrig0")
v.SetName("(N c)/(N s) ")
v=ws.var("frac0gammaTrig0")
v.SetName("f 1#gamma")
v=ws.var("frac1gammaTrig0")
v.SetName("f 0#gamma")
v=ws.var("fracPieeTrig0")
v.SetName("(N #pi)/(N K) ")
v=ws.var("meanShiftTrig0")
v.SetName("#Delta #mu")
v=ws.var("sigmaScaleFactorTrig0")
v.SetName("s #sigma")

My code plots successfully and I obtain this: plotKmumuMCLogy_fix.pdf

Weirdly enough, this plot has a strange white cut in the exponential background. With ROOT v6.08 I would obtain something like this: plotKeeDataLogyTrig3.pdf

hageboeck commented 3 years ago

Ok, problem found. It's the { } in the parameter name. They have special meaning in a regex. Note that if you use something without special characters for the name of a parameter, you can use anything you want in the title. The name identifies elements of the computation, the title is what ends up on axis labels.

Anyway, RooFit should be protected against that. I call @guitargeek to fix it. 😉

Happy bugfixing! 😄
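
The name/title split described above can be sketched as follows: derive a plain identifier from each LaTeX-style title and keep the title only for display. This is a minimal sketch in plain Python; `safe_name` is a hypothetical helper and its replacement rule is an assumption, not something RooFit provides:

```python
import re


def safe_name(title):
    """Derive an identifier made of letters, digits and underscores from a title."""
    # Collapse every run of other characters into one underscore, then trim the ends.
    return re.sub(r"[^0-9A-Za-z]+", "_", title).strip("_")


titles = ["N_{comb.}", "f_{0#gamma}", "N_{charm}/N_{strange}", "#Delta_{#mu}"]

# Map each plain name to the title that should appear on the plot.
name_to_title = {safe_name(t): t for t in titles}
for name, title in name_to_title.items():
    print(name, "<-", title)
```

In PyROOT the title would then be set with `v.SetTitle(title)` while the object's name stays plain. Two different titles could collapse to the same identifier under this rule, so a real script would need to check for collisions.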

guitargeek commented 3 years ago

Fixed by:

dlanci commented 3 years ago

Thanks a lot!

dlanci commented 3 years ago

I'm seeing that the "polygon not closing issue" is not only happening to me (see https://sft.its.cern.ch/jira/browse/ROOT-10931). Was this fixed in version 6.20.06?

hageboeck commented 3 years ago

Hello,

No, Jira says that the fixes are in 6.22.02 and 6.24.00. I remember that there were two related but different problems with the polygons. One was fixed a while back, so it should be somewhere in 6.20, but the second one is fixed only in the versions mentioned above. If you can, try out ROOT's nightly on lxplus. If it's fixed, all good. If it persists, there is a third thing going on, and you should open a ticket.

If it's all OK in the above versions, but you cannot use those, you can ask guitargeek very nicely to port the fix to the next 6.20.xx.