Closed tquante closed 8 years ago
Example file from Thomas:
Dear Thomas,
I just ran your code with the latest version of BAT, see #98, and root 6.04.10 and multiple threads in BAT. I did not get a segfault. Which exact version of BAT did you use? It could well be that we have fixed your problem in the mean time since I fixed a large number of things in BCHistogramFitter and related fast fitter classes.
Dear Frederik,
I have tested your version and it seems that the model manager issue is fixed, but the CUBA integration leads to a "free an invalid pointer" issue. The only things i changed in the example mentioned above are
TH1D* ->TH1 TF1\ ->TF1
new BCHstogrammFitter( * hist,f1) new BCHstogrammFitter( * hist,f2)
BAT: your current fitter branch root: 6.00.02
What is the reason for using const TH1& instead of const TH1* ? I need to instantiate TH1* hist=new TH1X() anyway because the typical constructors of TH1 are protected or private
Regards
Thomas
I have tested your version and it seems that the model manager issue is fixed, but the CUBA integration leads to a "free an invalid pointer" issue.
Ok, I don't get that error but I have seen it before. Might have something to do with the fact evaluating TF1
is not thread safe. I had to maneuver around that and create independent copies, one for each Markov chain. Could you check if the error persists when setting
CUBACORES=1 root -b -q -l histogramFitterExample.C
The only things i changed in the example mentioned above are
TH1D* ->TH1 TF1\ ->TF1
new BCHstogrammFitter( * hist,f1) new BCHstogrammFitter( * hist,f2)
BAT: your current fitter branch root: 6.00.02
What is the reason for using const TH1& instead of const TH1* ? I need to instantiate TH1* hist=new TH1X() anyway because the typical constructors of TH1 are protected or private
As part of making the fast fitters (includes BCHistogramFitter
) thread and memory safe, I started to keep an independent copy of the histogram, so there is no concern about who needs to free the memory. To make this obvious, I expect a reference which in addition cannot be nullptr
. For the user convenience, I take a TH1&
which could refer to a TH1D
but conceptually the counts need to be integer, so when I copy the member is TH1I fHistogram;
. In the constructor
BCHistogramFitter::BCHistogramFitter(const TH1& hist, const TF1& func, const std::string& name)
: BCFitter(func, name)
{
hist.Copy(fHistogram);
...
fHistogram
is default initialized, then I use TH1::Copy
that does all the nasty type checks. Strangely enough, even the TH1I
stores the count as a double
. I guess that's ROOT
style :smile:
Regarding your second comment: I would just do
TF1 f(...);
TH1I hist(...);
...
BCHistogramFitter hf(f, hist);
and there is no problem with TH1
being abstract.
The following command still lead to a free of invalid ptr when the second integration result is plotted. CUBACORES=1 root -b -q -l histogramFitterExample_with_modelmanager.C+
I think it has something to do with the termination of the single CUBA thread. Afterwards the programm runs till the end and then root throws another free of invalid ptr error.
The TH1I has only an 32bit integer array not a double, in case someone has more than 2147483647 entries in one bin it will fail, this can happen in case of huge detector arrays during calibrations. Also scaling of such histograms lead to accuracy problems. Therefore, all of our histograms are floats or even doubles.
The TH1I
max value of 2147483647 bin bin seems a limitation to me, too - it it can easily be exceeded in actual use cases. As ROOT doesn't seem to have an 64-bit integer histogram type, we shouldn't convert TH1D
histograms to TH1I
.
To be clear, since I think Fred had a typo in his message that is causing some confusion. We allow the user to pass any type of TH1, including TH1I, but we convert it to TH1D for use in BAT.
On Fri, Nov 27, 2015, 10:47 Oliver Schulz notifications@github.com wrote:
The TH1I max value of 2147483647 bin bin seems a limitation to me, too - it it can easily be exceeded in actual use cases. As ROOT doesn't seem to have an 64-bit integer histogram type, we shouldn't convert TH1D histograms to TH1I.
— Reply to this email directly or view it on GitHub https://github.com/bat/bat/issues/92#issuecomment-160098019.
Daniel Greenwald Physik-Department E18 Technische Universität München James-Franck-Str. 1 85748 Garching Germany
daniel.greenwald@tum.de tel : +49 89 / 289 12569 fax : +49 89 / 289 12570
The following command still lead to a free of invalid ptr when the second integration result is plotted. CUBACORES=1 root -b -q -l histogramFitterExample_with_modelmanager.C+
I think it has something to do with the termination of the single CUBA thread. Afterwards the programm runs till the end and then root throws another free of invalid ptr error.
@tquante what version of root did you use? I installed root 5.34.19 today and started to also see the same errors. I did not see them with 5.34.25 or 6.04.10. So I guess it's some bug related to a pretty old version of root
Sry for the late answer. I currently use root 6.00.02. I will check if the problem is still existing for me if I use root 6.04.10.
So just for completeness: I transferred the invalid pointer
thing to a separate issue #127.
The original problem has long been solved, so I closed the issue.
To be clear, since I think Fred had a typo in his message that is causing some confusion. We allow the user to pass any type of TH1, including TH1I, but we convert it to TH1D for use in BAT.
It was not typo, for a brief period we had TH1I
in the code but upon your request I reverted to a TH1D inside BCHistogramFitter
,
Thank you very much for fixing this bug! Now I can start again working on the Bat 1.0 compatible version of my analysis.
On 12/10/2015 03:49 PM, Frederik Beaujean wrote:
So just for completeness: I transferred the |invalid pointer| thing to a separate issue #127 https://github.com/bat/bat/issues/127.
The original problem has long been solved, so I closed the issue.
To be clear, since I think Fred had a typo in his message that is causing some confusion. We allow the user to pass any type of TH1, including TH1I, but we convert it to TH1D for use in BAT.
It was not typo, for a brief period we had |TH1I| in the code but upon your request I reverted to a TH1D inside |BCHistogramFitter|,
— Reply to this email directly or view it on GitHub https://github.com/bat/bat/issues/92#issuecomment-163646267.
Thomas Quante Technische Universität Dortmund Experimentelle Physik IV Otto-Hahn-Str. 4 D-44227 Dortmund Germany
email: thomas.quante@tu-dortmund.de phone: +49 231 755 3690 fax: +49 231 755 3688
Dear Devs,
I have found an issue with the ModelManager which lead to a segfault. I have updated the HistogramFitter example with a second TF1 with the BG only hypothesis and tried to calculate the Bayes factor. The segfault apears after the prerun of the MCMC:
Summary : Opening logfile log.txt Summary : Marginalize using Metropolis Summary : Pre-run Metropolis MCMC for model "fitter_model_f1" ... Summary : --> Perform MCMC pre-run with 2 chains, each with maximum 10000 iterations Detail : * Running until at least 1500 iterations performed in prerun. Current iteration is 500 Detail : * Running until at least 1500 iterations performed in prerun. Current iteration is 1000 Summary : --> Set of 2 Markov chains converged within 1500 iterations, and all scales are adjusted. Detail : --> Average scale factors and efficiencies (measured in last 500 iterations): Detail : - Parameter : Scale factor Efficiency Detail : SignalYield : 28.87 % 38.3 % Detail : SignalMean : 28.87 % 26.3 % Detail : SignalSigma : 28.87 % 39.9 % Detail : BackgroundYield : 28.87 % 37.5 % Summary : Run Metropolis MCMC for model "fitter_model_f1" ... Summary : --> Perform MCMC run with 2 chains, each with 10000 iterations. * Break * segmentation violation
At first I thought that it is related with the MCMCInterface bug I reported last week, but it is the ModelManager which causes the problem.
I wrote an email with the source code to Oli Schulz, because I can not attach anything. I have the same issue with my analysis code which runs without any errors with BAT 0.9.4.
Hopefully this will help you to hunt the bug.
Regards Thomas