cms-sw / cmsdist

CMS Offline Software build configuration

Add lwtnn as external dependency #2749

Closed mverzett closed 7 years ago

mverzett commented 7 years ago

Dear admins, the BTV team would like to add a new set of taggers based on deep neural networks. We developed a new set of plugins based on lwtnn, which we would like to add as an external dependency. This issue follows from issue 17085 of cms-sw/cmssw. Some info:

cmsbuild commented 7 years ago

A new Issue was created by @mverzett Mauro Verzetti.

@davidlange6, @Dr15Jones, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here cms-sw/cmssw#13029

iahmad-khan commented 7 years ago

@mverzett I will take care of it

slava77 commented 7 years ago

@mverzett I'm just curious about the expected lifetime of this external

Is the functionality going to be integrated with ROOT/TMVA or is it already there?

Do you have some performance tests (memory and CPU profiling) in using this tool? https://twiki.cern.ch/twiki/bin/viewauth/CMS/RecoIntegration#Run_profiler_igprof

davidlange6 commented 7 years ago

this doesn’t look like the sort of thing that should necessarily move to root…likely there is some similar functionality as newly added to TMVA but a small package based on eigen sounds like a nice solution as long as it remains supported. (and much easier to back port…)

slava77 commented 7 years ago

The main developer is a postdoc from ATLAS.

davidlange6 commented 7 years ago

yes, and? HEP people are not required to put their developments into root I hope.. but indeed, perhaps there is already a more scientific community standard solution that should be considered with the same functionality?

slava77 commented 7 years ago

Is this package at least in the mainstream ATLAS software stack? recent commits there suggest that it's been adjusted for ATLAS.

As I mentioned in the cmssw issue, I'm not very fond of new parsers loaded in cmsRun. Based on the code, apparently we will need one in each thread.

The code in this package is lightweight, but there is also apparently some similar code in TMVA already (maybe my understanding of DNN support in TMVA is too superficial).

mverzett commented 7 years ago

Hi All, Few answers:

I think that also @mstoye, @imarches, @carolinecollard, @JyothsnaKomaragiri are interested in the discussion

arizzi commented 7 years ago

As said in another thread, the tagger is supported by the POG (via ugly private recipe) for Moriond. So I propose a practical approach:

I.e. let's quickly get this into the releases and then we can try to improve (no one from btv side has any interest in pushing for lwtnn over other techs, the need is rather to use keras/theano/tensorflow stuff with training typically done from python macros and evaluation typically needed in cmssw/c++)

I think long term strategy for Deep Learning stuff in cmssw can be discussed in some offline meeting/HN rather than on this thread.

On 21 Dec 2016 at 16:53, "Mauro Verzetti" notifications@github.com wrote:

Hi All, Few answers:

  • I do not have profiling results, I can try to have some in the next days with my brute force approach https://github.com/mverzett/cmssw/tree/DeepFlavour-from-CMSSW_8_0_21 (LWTNN included as a RecoBTag/ package)
  • I don't know if it is a mainstream package in ATLAS, I know that there is a lot of development ongoing about deep neural nets, nothing more.
  • I am not a TMVA expert, but by looking at their manual http://tmva.sourceforge.net/#mva_ann it looks like they do not support multi-class classification. Another issue with TMVA is that it is quite TMVA-centric, our training has been performed with Keras+TensorFlow, and finding the right tuning of the input file might be a problem.
  • If the possible lack of future support is the main issue here I would personally then move to full TensorFlow, which should also have a C++ API
  • Yes, at the moment we plan to run with a parsed json file. I have no idea how much memory/CPU it uses, though.

slava77 commented 7 years ago

On 12/22/16 11:45 AM, arizzi wrote:

As said in another thread, the tagger is supported by the POG (via ugly private recipe) for Moriond. So I propose a practical approach:

  • we integrate the proposed code in the correct format (externals, no new cmssw packages, no private wget of json files) unless there are showstoppers
  • we decide about inclusion in (future) default reco or miniaod based on cpu/mem performance
  • we plan a longer-term strategy based on existing alternatives, if they exist or when they will exist (i.e. we do not wait for TMVA to support keras, also considering that keras is a suggested platform from the CERN ML forum)

I.e. let's quickly get this into the releases and then we can try to improve (no one from btv side has any interest in pushing for lwtnn over other techs, the need is rather to use keras/theano/tensorflow stuff with training typically done from python macros and evaluation typically needed in cmssw/c++)

Fine with me.

As I mentioned earlier, the issue for me was mostly with having this in standard workflows in CMSSW (at least with understanding of how it works there and having it in long-term). That shouldn't really block this light-weight external.

slava77 commented 7 years ago

@iahmad-khan just in case the discussion here has blocked the integration of the external, please proceed with its integration. Thank you

iahmad-khan commented 7 years ago

@slava77 yes , sure

mverzett commented 7 years ago

@iahmad-khan can you let me know once done? I will then port my PR to the new environment and run the needed tests.

Thanks!

iahmad-khan commented 7 years ago

@mverzett yes, I am currently testing/building it locally and facing some problems; it seems to have some issues related to eigen3 when building.

iahmad-khan commented 7 years ago

@mverzett Is there any documentation on exactly which versions of boost and eigen it needs?

mverzett commented 7 years ago

@iahmad-khan Not that I'm aware of. I managed to make it work in 8_0_21 without problems

iahmad-khan commented 7 years ago

@mverzett one other thing: there is no install target, so after make all it builds some directories such as lib, bin, include, scripts, converters, etc. Which of these directories need to be distributed in cmssw?

mverzett commented 7 years ago

@iahmad-khan, we need the classes and libraries defined in include and src, which, I guess, are built in lib.

Thanks for your patience, it's the first time I make such a request and I do not know what's really needed :).

iahmad-khan commented 7 years ago

@mverzett Merged in https://github.com/cms-sw/cmsdist/pull/2757

mverzett commented 7 years ago

@iahmad-khan thanks a lot! When is it supposed to appear in the IB?

arizzi commented 7 years ago

we need this in 80X too

iahmad-khan commented 7 years ago

@mverzett in the next DEVEL IB , @arizzi will do a backport today

davidlange6 commented 7 years ago

Hi @iahmad-khan - we'll test this in 90x mainstream (eg, pre3) before back porting especially given the eigen updates. Can you merge it into 90x once you see a successful devel build?

arizzi commented 7 years ago

@slava77 asked for an analysis release in 80X with POG recipes for Moriond ~now. This external is meant exactly for this. Given that mauro tested it successfully with the eigen version available in 80X I'm not understanding what the problem is.

slava77 commented 7 years ago

900pre3 time (Jan 16) was the target for the "~now".

Based on pre-winter-break notes in PC I understood that we will not have a complete enough set of POG Moriond recipes to make a release based on 900pre3.

mverzett commented 7 years ago

@iahmad-khan , is CMSSW_9_0_X_2017-01-11-1100 considered a DEVEL IB? Again, sorry for the naive question.

iahmad-khan commented 7 years ago

@mverzett No, you can either use scram or go to the IBs page to see DEVEL IBs. For example, CMSSW_9_0_DEVEL_X_2017-01-11-1100 is a recent DEVEL IB.

mverzett commented 7 years ago

@iahmad-khan , OK thanks!

mverzett commented 7 years ago

@iahmad-khan I have been trying to compile my code with the new external with CMSSW_9_0_DEVEL_X_2017-01-11-1100 and I get this error. It looks like something went wrong here.

Any help is appreciated. Thanks!

>> Compiling edm plugin /afs/cern.ch/work/m/mverzett/CMSSW_9_0_DEVEL_X_2017-01-11-1100/src/RecoBTag/Combined/plugins/DeepFlavourJetTagsProducer.cc 
In file included from /afs/cern.ch/work/m/mverzett/CMSSW_9_0_DEVEL_X_2017-01-11-1100/src/RecoBTag/Combined/plugins/DeepFlavourJetTagsProducer.cc:40:0:
/cvmfs/cms-ib.cern.ch/2017-02/slc6_amd64_gcc620/external/lwtnn/1.0/include/lwtnn/LightweightNeuralNetwork.hh:15:23: fatal error: Eigen/Dense: No such file or directory
 #include <Eigen/Dense>
                       ^
compilation terminated.

smuzaffar commented 7 years ago

@mverzett, can you please add

<use name="eigen"/>

in your config/toolbox/slc6_amd64_gcc620/tools/selected/lwtnn.xml and then do scram setup lwtnn and try again. I think we are missing this dependency in the lwtnn tool definition.
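The patch suggested above can be sketched as a small shell helper. This is only a minimal sketch, not the official cmsdist fix: the function name `add_tool_dependency` is made up for illustration, and it assumes the tool file ends with a `</tool>` tag on its own line. It does nothing if the dependency is already declared, so running it twice does not add a second `<use>` line.

```shell
# Hypothetical helper (not part of SCRAM): declare a dependency in a
# SCRAM tool XML file by adding <use name="DEP"/> before </tool>.
add_tool_dependency() {
  tool_xml=$1
  dep=$2
  # Nothing to do if the dependency is already declared.
  if grep -q "<use name=\"$dep\"/>" "$tool_xml"; then
    return 0
  fi
  tmp=$(mktemp) || return 1
  # Print the new <use> line just before each closing </tool> tag.
  awk -v dep="$dep" '
    /<\/tool>/ { print "  <use name=\"" dep "\"/>" }
    { print }
  ' "$tool_xml" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Overwrite in place so the original file permissions are kept.
  cat "$tmp" > "$tool_xml"
  rm -f "$tmp"
}
```

In a development area this would be called as `add_tool_dependency config/toolbox/slc6_amd64_gcc620/tools/selected/lwtnn.xml eigen`, followed by `scram setup lwtnn` as described above.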

mverzett commented 7 years ago

@smuzaffar Indeed it worked.

mverzett commented 7 years ago

@iahmad-khan, sorry to bother again, but trying to port my code to this external I get a mutex error that was not present in 80X:

----- Begin Fatal Exception 12-Jan-2017 18:09:45 CET-----------------------
An exception of category 'StdException' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=RecoJetDeltaRValueMapProducer label='oldJetMass'
Exception Message:
A std::exception was thrown.
boost: mutex lock failed in pthread_mutex_lock: Invalid argument
----- End Fatal Exception -------------------------------------------------

Is it due to something changed in the release or has to do with the external? I never really learned to have the stack trace dumped for C++ applications in case of exceptions, but I'm willing to help providing any info you might need.

Thanks again

arizzi commented 7 years ago

try igtrace

Dr15Jones commented 7 years ago

Try running the job in the debugger gdb, where you issue the debugger command catch throw before running the actual job. This will cause the debugger to pause when the exception is thrown. At that point you can do a where to see the exact function throwing the exception.
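The gdb steps described above can also be run non-interactively. The wrapper below is a minimal sketch under stated assumptions: the function name `trace_first_throw` and the `GDB` override variable are made up for illustration, while `-batch`, `-ex` and `--args` are standard gdb command-line options.

```shell
# Hypothetical wrapper: run a command under gdb in batch mode and print a
# backtrace at the first C++ exception that is thrown.
# Set GDB to use a specific gdb binary instead of the one found in PATH.
trace_first_throw() {
  "${GDB:-gdb}" -batch \
    -ex "catch throw" \
    -ex "run" \
    -ex "where" \
    --args "$@"
}
```

For the job in this thread the call would be `trace_first_throw cmsRun testDeepCSV_cfg.py`. Note that gdb stops at the first throw it sees, which is not necessarily the fatal one if some code throws and catches exceptions internally.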

mverzett commented 7 years ago

Running igtrace seems to point at some CMSSW library. Another thing I wanted to ask before and forgot: the exception points to

Constructing module: class=RecoJetDeltaRValueMapProducer label='oldJetMass'

which I did not touch at all in my code. Nonetheless, when I run the new b-taggers it crashes; without them (compiled and linked, but not executed) it does not.

Dr15Jones commented 7 years ago

What was the exact stack trace?

I looked at RecoJetDeltaRValueMapProducer in CMSSW_8_0 and it doesn't look to be using anything from boost so I can't explain what you are seeing. Are you sure you've recompiled everything that depends on your change? If you did not, then this would lead to binary incompatibility which shows up as random crashes.

mverzett commented 7 years ago

You can find the trace here. I think I compiled all the dependencies, but I am not entirely sure. Is checkdeps still the tool to check out the dependencies of a (sub)package?

Dr15Jones commented 7 years ago

I don't know if checkdeps will find dependencies when an external is changed.

Dr15Jones commented 7 years ago

It looks like you used a fully rebuilt release, CMSSW_9_0_DEVEL_X_2017-01-10-1100, which is good since all CMSSW code should have been recompiled in that release.

mverzett commented 7 years ago

OK, let me try to recompile (it's going to take a while) and report back

mverzett commented 7 years ago

I tried a cleaning and recompiling, adding all the dependencies I could find. The result is still the same. Here you can find the gdb trace (with catch throw and where) and here you can find the same thing from igtrace.

Any kind of suggestion is more than welcome :)

smuzaffar commented 7 years ago

Any recipe to reproduce it?

mverzett commented 7 years ago

OK, sorry it took me so long, but I wanted to try the recipe myself one more time to be sure everything was working

cmsrel CMSSW_9_0_DEVEL_X_2017-01-11-1100
cd CMSSW_9_0_DEVEL_X_2017-01-11-1100/src/
cmsenv
git cms-init
#patch the release external
sed -i 's|</tool>|  <use name="eigen"/>\n</tool>|g' ../config/toolbox/slc6_amd64_gcc620/tools/selected/lwtnn.xml
scram setup lwtnn
#get DeepFlavour code
git cms-merge-topic mverzett:DeepFlavourPR-from-CMSSW_9_0_X
# This takes ages. You can use more cores if you are not on lxplus; on lxplus with 4 cores it crashes (I guess it runs out of memory)
scram b -j 2
cd RecoBTag/Combined/test/
cmsRun testDeepCSV_cfg.py

Let me know if you can reproduce the problem

Thanks

smuzaffar commented 7 years ago

Where can I find RecoBTag/Combined/data/DeepFlavourNoSL.json?

edm::FileInPath unable to find file RecoBTag/Combined/data/DeepFlavourNoSL.json anywhere in the search path

mverzett commented 7 years ago

D'OH! I'm terribly sorry.

You can find the training files here

mkdir RecoBTag/Combined/data
cd RecoBTag/Combined/data
wget http://home.fnal.gov/~verzetti//DeepFlavour/training/DeepFlavourNoSL.json
wget http://mon.iihe.ac.be/~smoortga/DeepFlavour/CMSSW_implementation_DeepCMVA/Model_DeepCMVA.json
cd -

smuzaffar commented 7 years ago

@mverzett , I can reproduce the error but could not find out the reason behind it.

Dr15Jones commented 7 years ago

Maybe valgrind could help?

smuzaffar commented 7 years ago

valgrind shows the below which I will try to digest later :-)

==8309== Invalid write of size 8
==8309==    at 0x3CEE409289: __pthread_mutex_lock_full (in /lib64/libpthread-2.12.so)
==8309==    by 0x18A19437: pthread_mutex_lock (mutex.hpp:62)
==8309==    by 0x18A19437: lock (mutex.hpp:116)
==8309==    by 0x18A19437: boost::unique_lock<boost::mutex>::lock() (lock_types.hpp:346)
==8309==    by 0x18A1966D: unique_lock (lock_types.hpp:124)
==8309==    by 0x18A1966D: acquire (object_with_id.ipp:102)
==8309==    by 0x18A1966D: boost::spirit::classic::impl::object_with_id_base<boost::spirit::classic::impl::grammar_tag, unsigned long>::acquire_object_id() (object_with_id.ipp:152)
==8309==    by 0x18A130BB: object_with_id (object_with_id.ipp:78)
==8309==    by 0x18A130BB: grammar (grammar.hpp:51)
==8309==    by 0x18A130BB: Grammar (Grammar.h:68)
==8309==    by 0x18A130BB: reco::parser::expressionParser(edm::TypeWithDict const&, std::string const&, boost::shared_ptr<reco::parser::ExpressionBase>&, bool) (expressionParser.cc:9)
==8309==    by 0x2FDAFCCC: expressionParser<reco::Jet> (expressionParser.h:14)
==8309==    by 0x2FDAFCCC: StringObjectFunction<reco::Jet, false>::StringObjectFunction(std::string const&, bool) (StringObjectFunction.h:19)
==8309==    by 0x2FDB280F: JetDeltaRValueMapProducer<reco::Jet, reco::Jet>::JetDeltaRValueMapProducer(edm::ParameterSet const&) (JetDeltaRValueMapProducer.cc:50)
==8309==    by 0x2FDB2ED3: makeStreamModule<JetDeltaRValueMapProducer<reco::Jet> > (makeGlobal.h:48)
==8309==    by 0x2FDB2ED3: operator() (ProducingModuleAdaptor.h:75)
==8309==    by 0x2FDB2ED3: createStreamModules<edm::stream::ProducingModuleAdaptor<T, M, B>::setupStreamModules() [with T = JetDeltaRValueMapProducer<reco::Jet>; M = edm::stream::EDProducerBase; B = edm::stream::EDProducerAdaptorBase]::<lambda()> > (ProducingModuleAdaptorBase.h:103)
==8309==    by 0x2FDB2ED3: edm::stream::ProducingModuleAdaptor<JetDeltaRValueMapProducer<reco::Jet, reco::Jet>, edm::stream::EDProducerBase, edm::stream::EDProducerAdaptorBase>::setupStreamModules() (ProducingModuleAdaptor.h:74)
==8309==    by 0x4C30B11: edm::stream::ProducingModuleAdaptorBase<edm::stream::EDProducerBase>::doPreallocate(edm::PreallocationConfiguration const&) (ProducingModuleAdaptorBase.cc:59)
==8309==    by 0x4B9F1CD: edm::Maker::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (WorkerMaker.cc:96)
==8309==    by 0x4B4C7E6: edm::Factory::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (Factory.cc:73)
==8309==    by 0x4C1D8EE: edm::ModuleRegistry::getModule(edm::MakeModuleParams const&, std::string const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) (ModuleRegistry.cc:29)
==8309==    by 0x4B95708: edm::WorkerRegistry::getWorker(edm::WorkerParams const&, std::string const&) (WorkerRegistry.cc:49)
==8309==    by 0x4BBFBB8: edm::WorkerManager::getWorker(edm::ParameterSet&, edm::ProductRegistry&, edm::PreallocationConfiguration const*, std::shared_ptr<edm::ProcessConfiguration const>, std::string const&) (WorkerManager.cc:44)
==8309==    by 0x4BC041E: edm::WorkerManager::addToUnscheduledWorkers(edm::ParameterSet&, edm::ProductRegistry&, edm::PreallocationConfiguration const*, std::shared_ptr<edm::ProcessConfiguration>, std::string, std::set<std::string, std::less<std::string>, std::allocator<std::string> >&, std::vector<std::string, std::allocator<std::string> >&) (WorkerManager.cc:59)
==8309==    by 0x4B158DD: edm::StreamSchedule::StreamSchedule(std::shared_ptr<edm::TriggerResultInserter>, std::shared_ptr<edm::ModuleRegistry>, edm::ParameterSet&, edm::service::TriggerNamesService&, edm::PreallocationConfiguration const&, edm::ProductRegistry&, edm::BranchIDListHelper&, edm::ExceptionToActionTable const&, std::shared_ptr<edm::ActivityRegistry>, std::shared_ptr<edm::ProcessConfiguration>, bool, edm::StreamID, edm::ProcessContext const*) (StreamSchedule.cc:216)
==8309==    by 0x4B29E53: edm::Schedule::Schedule(edm::ParameterSet&, edm::service::TriggerNamesService&, edm::ProductRegistry&, edm::BranchIDListHelper&, edm::ThinnedAssociationsHelper&, edm::ExceptionToActionTable const&, std::shared_ptr<edm::ActivityRegistry>, std::shared_ptr<edm::ProcessConfiguration>, bool, edm::PreallocationConfiguration const&, edm::ProcessContext const*) (new_allocator.h:120)
==8309==    by 0x4B8C44F: edm::ScheduleItems::initSchedule(edm::ParameterSet&, bool, edm::PreallocationConfiguration const&, edm::ProcessContext const*) (unique_ptr.h:787)
==8309==    by 0x4BD8A86: edm::EventProcessor::init(std::shared_ptr<edm::ProcessDesc>&, edm::ServiceToken const&, edm::serviceregistry::ServiceLegacy) (EventProcessor.cc:542)
==8309==    by 0x4BDB2B7: edm::EventProcessor::EventProcessor(std::shared_ptr<edm::ProcessDesc>&, edm::ServiceToken const&, edm::serviceregistry::ServiceLegacy) (EventProcessor.cc:354)
==8309==    by 0x40C4A0: main::{lambda()#1}::operator()() const (unique_ptr.h:787)
==8309==  Address 0x2d8dbdc0 is 0 bytes after a block of size 32 alloc'd
==8309==    at 0x40291FC: operator new(unsigned long) (in /cvmfs/cms-ib.cern.ch/2017-02/slc6_amd64_gcc620/external/valgrind/3.12.0/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8309==    by 0x1D41B06D: acquire_object_id (object_with_id.ipp:148)
==8309==    by 0x1D41B06D: object_with_id (object_with_id.ipp:78)
==8309==    by 0x1D41B06D: grammar (grammar.hpp:51)
==8309==    by 0x1D41B06D: json_grammar (json_parser_read.hpp:159)
==8309==    by 0x1D41B06D: void boost::property_tree::json_parser::read_json_internal<boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> > >(std::basic_istream<boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::key_type::value_type, std::char_traits<boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::key_type::value_type> >&, boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >&, std::string const&) (json_parser_read.hpp:310)
==8309==    by 0x1D40E6FD: read_json<boost::property_tree::basic_ptree<std::basic_string<char>, std::basic_string<char> > > (json_parser.hpp:45)
==8309==    by 0x1D40E6FD: lwt::parse_json(std::istream&) (parse_json.cxx:28)
==8309==    by 0x1D1CE15B: DeepFlavourJetTagsProducer::DeepFlavourJetTagsProducer(edm::ParameterSet const&) (DeepFlavourJetTagsProducer.cc:107)
==8309==    by 0x1D1D73A3: makeStreamModule<DeepFlavourJetTagsProducer> (makeGlobal.h:48)
==8309==    by 0x1D1D73A3: operator() (ProducingModuleAdaptor.h:75)
==8309==    by 0x1D1D73A3: createStreamModules<edm::stream::ProducingModuleAdaptor<T, M, B>::setupStreamModules() [with T = DeepFlavourJetTagsProducer; M = edm::stream::EDProducerBase; B = edm::stream::EDProducerAdaptorBase]::<lambda()> > (ProducingModuleAdaptorBase.h:103)
==8309==    by 0x1D1D73A3: edm::stream::ProducingModuleAdaptor<DeepFlavourJetTagsProducer, edm::stream::EDProducerBase, edm::stream::EDProducerAdaptorBase>::setupStreamModules() (ProducingModuleAdaptor.h:74)
==8309==    by 0x4C30B11: edm::stream::ProducingModuleAdaptorBase<edm::stream::EDProducerBase>::doPreallocate(edm::PreallocationConfiguration const&) (ProducingModuleAdaptorBase.cc:59)
==8309==    by 0x4B9F1CD: edm::Maker::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (WorkerMaker.cc:96)
==8309==    by 0x4B4C7E6: edm::Factory::makeModule(edm::MakeModuleParams const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) const (Factory.cc:73)
==8309==    by 0x4C1D8EE: edm::ModuleRegistry::getModule(edm::MakeModuleParams const&, std::string const&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&, edm::signalslot::Signal<void (edm::ModuleDescription const&)>&) (ModuleRegistry.cc:29)
==8309==    by 0x4B95708: edm::WorkerRegistry::getWorker(edm::WorkerParams const&, std::string const&) (WorkerRegistry.cc:49)
==8309==    by 0x4BBFBB8: edm::WorkerManager::getWorker(edm::ParameterSet&, edm::ProductRegistry&, edm::PreallocationConfiguration const*, std::shared_ptr<edm::ProcessConfiguration const>, std::string const&) (WorkerManager.cc:44)
==8309==    by 0x4BC041E: edm::WorkerManager::addToUnscheduledWorkers(edm::ParameterSet&, edm::ProductRegistry&, edm::PreallocationConfiguration const*, std::shared_ptr<edm::ProcessConfiguration>, std::string, std::set<std::string, std::less<std::string>, std::allocator<std::string> >&, std::vector<std::string, std::allocator<std::string> >&) (WorkerManager.cc:59)
==8309==    by 0x4B158DD: edm::StreamSchedule::StreamSchedule(std::shared_ptr<edm::TriggerResultInserter>, std::shared_ptr<edm::ModuleRegistry>, edm::ParameterSet&, edm::service::TriggerNamesService&, edm::PreallocationConfiguration const&, edm::ProductRegistry&, edm::BranchIDListHelper&, edm::ExceptionToActionTable const&, std::shared_ptr<edm::ActivityRegistry>, std::shared_ptr<edm::ProcessConfiguration>, bool, edm::StreamID, edm::ProcessContext const*) (StreamSchedule.cc:216)
==8309==    by 0x4B29E53: edm::Schedule::Schedule(edm::ParameterSet&, edm::service::TriggerNamesService&, edm::ProductRegistry&, edm::BranchIDListHelper&, edm::ThinnedAssociationsHelper&, edm::ExceptionToActionTable const&, std::shared_ptr<edm::ActivityRegistry>, std::shared_ptr<edm::ProcessConfiguration>, bool, edm::PreallocationConfiguration const&, edm::ProcessContext const*) (new_allocator.h:120)
==8309==    by 0x4B8C44F: edm::ScheduleItems::initSchedule(edm::ParameterSet&, bool, edm::PreallocationConfiguration const&, edm::ProcessContext const*) (unique_ptr.h:787)
==8309==    by 0x4BD8A86: edm::EventProcessor::init(std::shared_ptr<edm::ProcessDesc>&, edm::ServiceToken const&, edm::serviceregistry::ServiceLegacy) (EventProcessor.cc:542)
==8309==    by 0x4BDB2B7: edm::EventProcessor::EventProcessor(std::shared_ptr<edm::ProcessDesc>&, edm::ServiceToken const&, edm::serviceregistry::ServiceLegacy) (EventProcessor.cc:354)
==8309==    by 0x40C4A0: main::{lambda()#1}::operator()() const (unique_ptr.h:787)
==8309==    by 0x40A98C: wrap<main(int, char**)::<lambda()> > (ConvertException.h:20)
==8309==    by 0x40A98C: main (cmsRun.cpp:140)

Dr15Jones commented 7 years ago

Was BOOST_SPIRIT_THREADSAFE defined when lwtnn was compiled? The function which allocates the memory is using that switch and we definitely compile all other uses of BOOST with that defined.

mverzett commented 7 years ago

@Dr15Jones How can I check it? If not, how can I add it in a dev area?
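One rough way to start checking is to search the lwtnn sources and build files for the macro. The helper below is a minimal sketch, not an answer given in this thread: the function name `find_macro_definition` is made up for illustration, and it assumes the macro would show up either as a `-D` compiler flag (for example in a Makefile or a packaging recipe) or as a `#define` in the code.

```shell
# Hypothetical helper: list places under a source tree where a preprocessor
# macro is defined, either as a -D compiler flag or with #define.
# Exit status is 0 if at least one match is found, non-zero otherwise.
find_macro_definition() {
  macro=$1
  src_dir=$2
  grep -rn -e "-D$macro" -e "#define $macro" "$src_dir"
}
```

For example, `find_macro_definition BOOST_SPIRIT_THREADSAFE path/to/lwtnn-source`, where the path is a placeholder. This only inspects files on disk; it does not prove how an already installed library was built, since flags can also come from the environment or from the packaging recipe.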