Open pcanal opened 3 days ago
Hi @pcanal, thanks for this report. Hopefully the solution will also help with fewer threads.
I am not sure, though, that the "unresolved while linking" is due to the high thread count. Can you confirm that you do not see these errors with 8-16 threads?
> I am not sure though that the "unresolved while linking" is due to the high thread count.
I think you might be right. The best way forward is to track down where those missing symbols are supposed to come from.
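One way to start tracking down a missing symbol is to ask candidate shared libraries whether they export it. The sketch below is only an illustration: the helper name `exports_symbol` is made up here, and which libraries to probe (for example a BLAS implementation) is an assumption that depends on the platform.

```python
import ctypes


def exports_symbol(library_path, symbol):
    """Return True if the shared library exports `symbol`.

    `library_path` is a path or soname; None means the current process
    (POSIX only). This helper is hypothetical, not part of ROOT.
    """
    try:
        lib = ctypes.CDLL(library_path)
    except OSError:
        # The library itself could not be loaded.
        return False
    # ctypes looks the name up lazily; a missing symbol raises
    # AttributeError, which hasattr() turns into False.
    return hasattr(lib, symbol)
```

For instance, `exports_symbol(ctypes.util.find_library("blas"), "sgemm_")` (after `import ctypes.util`) would report whether the BLAS library found on the system provides the symbol, assuming `find_library` locates one.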
Thanks for the comment. At this point this issue seems to conflate two things:
If 1. is confirmed to be solved, I would say that at least this issue ought to be closed and one about missing symbols opened. However, even if an issue dedicated to the missing symbols is opened, it's not clear, at least to me, how the problem can be reproduced. So far we have no indication of it in our CI: can it be due to a somewhat imprecise formulation of the Python dependencies in the requirements.txt file that affects your platform?
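To check whether a requirements file leaves room for platform-dependent versions, one could list the entries that are not pinned to a single exact version. This is a rough sketch under simplifying assumptions: the function name is hypothetical, only plain `name==version` lines count as pinned, and URL or editable requirements are not handled.

```python
import re

# "name==version" (optional extras and environment marker); wildcards such
# as "==2.*" are treated as loose because they admit several versions.
_PINNED = re.compile(
    r"^[A-Za-z0-9_.\-]+(\[[^\]]*\])?\s*==\s*[^\s;,*]+\s*(;.*)?$"
)


def loosely_specified(requirement_lines):
    """Return the requirement lines that do not pin one exact version."""
    loose = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not _PINNED.match(line):
            loose.append(line)
    return loose
```

Running it over `open("requirements.txt")` would print the candidates worth comparing between the CI environment and the affected machine.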
Check duplicate issues.
Description
On a large node (127 cores, 128 GB), I ran:
After 1., many tests failed due to lack of resources (running out of threads, see #16552):
However, in 2., several tests still failed (even though resources were no longer an issue):
The errors listed there included:
From this I conclude that those tests (in particular TMVA_SOFIE_RDataFrame.C and tutorials/tmva/TMVA_SOFIE_GNN_Application.C) are missing a dependency that failed in the first run.
Note:
tutorial-tmva-TMVA_SOFIE_Keras_HiggsModel and tutorial-tmva-TMVA_SOFIE_RDataFrame-py do indeed need TMVA_Higgs_Classification.C to run first (it says so in the output! :) ).
tutorial-tmva-TMVA_SOFIE_RSofieReader is asking for Higgs_trained_model.h5
gtest-tmva-pymva-test-TestRModelParserKeras is missing the symbol sgemm_ (see below)
However, when rerunning (where this time somehow there were no resource-related failures), I still got several failures:
all due to:
or both
This may be due either to a badly formed result of the failing run (1) or to an external package that does not have the correct version number?
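The ordering constraint noted in the description (some tutorials need TMVA_Higgs_Classification.C to run first) can be sketched as a small dependency map, which also shows which tests become suspect once a prerequisite has failed. This is an illustration only: the name `tutorial-tmva-TMVA_Higgs_Classification` is an assumed test name for the macro, and the map is not taken from ROOT's build files.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical map: test name -> tests that must have run successfully first.
# "tutorial-tmva-TMVA_Higgs_Classification" is an assumed name.
DEPENDS = {
    "tutorial-tmva-TMVA_SOFIE_Keras_HiggsModel": {
        "tutorial-tmva-TMVA_Higgs_Classification"
    },
    "tutorial-tmva-TMVA_SOFIE_RDataFrame-py": {
        "tutorial-tmva-TMVA_Higgs_Classification"
    },
}


def run_order(depends):
    """Return one valid execution order (prerequisites before dependents)."""
    return list(TopologicalSorter(depends).static_order())


def blocked_by(failed, depends):
    """Return tests that depend, directly or transitively, on a failed test."""
    failed = set(failed)
    blocked = set()
    changed = True
    while changed:  # repeat until no further test gets marked
        changed = False
        for test, prereqs in depends.items():
            if test not in blocked and prereqs & (failed | blocked):
                blocked.add(test)
                changed = True
    return blocked
```

If the prerequisite fails in a first run because of resource exhaustion, `blocked_by` lists the tests whose later failures may only be a consequence of that first failure rather than independent problems.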
Reproducer
ROOT version
master
Installation method
hand build
Operating system
Alma9
Additional context