Open aseshkdatta opened 3 years ago
Hi Asesh,
@tegonzalo has been on holiday but I think he'll be back online at the start of next week and might be able to help out. In the meantime, can you please try installing a different MPI library, and also post the log files so that we can see where the segmentation fault seems to have occurred?
Thanks - Pat
Hi Pat,
Thank you so much for the information and for your suggestion. Right now, I am working with our computer admin on the issue and will soon take up what you have suggested.
Cheers. Asesh
-- You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/GambitBSM/gambit_2.0/issues/1#issuecomment-888809191
Hi Pat,
Just a curiosity.... can it be a memory (swap) related issue as some pages on the internet suggest?
I am for sure working on a low memory (for the purpose) machine (8 GB RAM with 16 GB swap).
However, with the same availability/allocation of memory on another machine, the test 'mpirun' of Gambit went smoothly. With this in mind, your suggestion to use a different MPI library appears crucial.
Cheers. Asesh
============================= AseshKrishna Datta Professor 'H' Theoretical High Energy Physics Group Harish-Chandra Research Institute (HRI) (Department of Atomic Energy, Govt. of India) Allahabad (Prayagraj) UP INDIA 211019
Hi Pat,
Even a serial run on this machine is returning the following.
@.***:~/MyPlace/Packages/gambit_2.0$ ./gambit -f yaml_files/MDMSM_Tute.yaml
GAMBIT 2.0.0 http://gambit.hepforge.org
Anything else to worry about before we go for a different MPI library?
Cheers. Asesh
In that case, please re-cmake and re-compile with MPI support completely turned off (-DWITH_MPI=OFF), confirm that you still get the error, and send us the single log from that run. It seems like the issue is probably some other library (not MPI). I think it unlikely that it is memory related, as GAMBIT takes much less memory to run than to compile.
Thanks a lot, Pat. I'll soon try that out.
Cheers. Asesh
Hi,
Should I use "-DWITH_MPI=OFF" when building GUM, or should I build Gambit itself with this option first?
Thanks. Asesh
Hi Pat,
I rebuilt both Gambit and GUM with re-cmake (with -DWITH_MPI=OFF at all stages) and re-make.
@.***:~/Packages/gambit_2.0$ ./gambit -f yaml_files/MDMSM_Tute.yaml
GAMBIT 2.0.0 http://gambit.hepforge.org
I did not quite understand which log file you meant by "single log from that run". I am attaching herewith the following two log files.
./build/CMakeFiles/CMakeError.log
./build/CMakeFiles/CMakeOutput.log
(Similar log files under the same names are found in
./gum/build/CMakeFiles/CMakeError.log
./gum/build/CMakeFiles/CMakeOutput.log)
Please let me know if you were looking for a different log/output.
Thanks. Asesh
Also, when I face the following message due to a previous `make', how should I proceed; simply proceed pressing 'enter'?
This one indicates that something has gone wrong in the download or build of micromegas. In this case you need to nuke micromegas and try building it again: `nuke micromegas_MDMSM; make micromegas_MDMSM`.
As to the logs, what I mean is the default log in the output directory of the GAMBIT run that you are trying to launch, i.e. `runs/MDMSM/logs/default.log`. If that doesn't exist, then please send `scratch/default.log`.
To attach files, I think you need to log into GitHub (the ones you apparently attached by email don't seem to have come through).
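The lookup order described above (the run's own log directory first, then the scratch fallback) can be sketched as a small shell helper. This is only an illustration: the function name `find_default_log` is hypothetical, and it assumes it is called from the GAMBIT root directory with the run name (here MDMSM) as its argument.

```shell
# Print the first default.log that exists, checking the run's output
# directory before falling back to the scratch directory.
# Assumption: called from the GAMBIT root directory.
find_default_log() {
    run_name="$1"
    for candidate in "runs/${run_name}/logs/default.log" "scratch/default.log"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no default.log found under runs/${run_name}/logs or scratch" >&2
    return 1
}
```

Calling `find_default_log MDMSM` prints the path of the log to attach, or returns a non-zero status if neither file exists.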
Thanks a lot, Pat.
I'll get back to you soon.
Cheers. Asesh
Thanks a lot for the suggestions. I am writing below.
As to the logs, what I mean is the default log in the output directory of the GAMBIT run that you are trying to launch, i.e. `runs/MDMSM/logs/default.log`. If that doesn't exist, then please send `scratch/default.log`.
==> No 'runs' folder is created under the gambit home directory. A 'scratch' folder is found, which contains two sub-folders, 'build_time' and 'run_time', neither of which contains anything.
To attach files, I think you need to log into GitHub (the ones you apparently attached by email don't seem to have come through).
==> Attaching a screenshot of the run showing the subsequent folder details.
Please have a look and kindly suggest how to proceed.
Regards. Asesh
Hi,
I am so sorry. By mistake, I attached the screenshot with my previous mail. I am attaching it on the github gambit page.
Regards. Asesh
Attached the mentioned screen-shot. Thanks.
Hi Asesh - OK, maybe we should start with your full cmake output. In your build dir, please run
make nuke-all
rm -rf *
cmake [your cmake options] ...
and post the output here.
Hi Pat,
Thanks a lot.
After executing "make nuke-all" and "rm -rf *" from the "build" dir, I ran "cmake -DWITH_MPI=OFF .." as suggested by you earlier.
I am attaching herewith the output of cmake: cmake-log1.txt
Asesh
Hi Asesh,
Apologies for the silence, I was on holidays. Thanks Pat for taking over.
I am attaching herewith the output of cmake.
Your cmake output seems fine. MPI is correctly disabled and everything else seems to have configured correctly. Please now build gambit and the relevant backends like this:
make micromegas_MDMSM
make calchep
make -j<n> gambit
Once that is finished, if there are no errors, then do a simple test run, first using the `spartan` yaml file, i.e.
./gambit -f yaml_files/spartan.yaml
and if that works, then run the `MDMSM.yaml` file.
After every scan, a corresponding folder should have been created in the runs directory, e.g. `runs/spartan`, with logs, samples and scanner info. If any of the scans above fail but you still have the logs in that directory, please send them so we can have a look.
Cheers, Tomas
Hi Tomas,
Thanks for the mail.
asesh@albert:~/Packages/gambit_2.0$ ./gambit -f yaml_files/spartan.yaml
GAMBIT 2.0.0 http://gambit.hepforge.org
Please let me know how to proceed under the circumstances.
Cheers. Asesh
That is really unusual. And you said that there is nothing in `scratch/run_time` either, right? Could you run the following commands and tell me what you see?
First just do
./gambit
If everything is fine, that should print usage instructions, but it requires creating the scratch files, so you will probably see the segfault too. After that also do
./gambit modules
which should give you the list of built modules, but again, may fail if the problem is due to the scratch files.
Let me know what you see in both cases.
Thanks for your help!
Tomas
asesh@albert:~/Packages/gambit_2.0$ ./gambit
GAMBIT 2.0.0 http://gambit.hepforge.org
Segmentation fault (core dumped)
asesh@albert:~/Packages/gambit_2.0$ ./gambit modules
GAMBIT 2.0.0 http://gambit.hepforge.org
Cheers. Asesh
Right; "scratch/run_time" folder has nothing created inside it!
Asesh
Dear Tomas,
Could you have a look into the issue? Kindly share your observations.
Cheers. Asesh
Hi Asesh,
I've been trying to figure out what could be producing a segfault so early in the run; it doesn't make sense, as this has never happened before. Could you show me the backtrace of the segfault? Launch the debugger
gdb ./gambit
and then inside the debugger just run
r modules
once you get the segfault then write
backtrace
and show me what you get.
Also run
r models
just to make sure it has nothing to do with the yaml reader
Thanks again for your help in resolving this.
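The same backtrace can probably be collected without an interactive session, which makes the output easier to paste into the issue. A minimal sketch, assuming gdb is installed and the command is run from the GAMBIT root directory; the function name `gambit_backtrace` is only illustrative:

```shell
# Run ./gambit under gdb in batch mode with the given arguments and print a
# backtrace once the program stops (for example on a segmentation fault).
# Assumption: called from the directory that contains the gambit executable.
gambit_backtrace() {
    gdb -batch -ex run -ex backtrace --args ./gambit "$@"
}
```

For example, `gambit_backtrace modules > backtrace_modules.txt 2>&1` and `gambit_backtrace models > backtrace_models.txt 2>&1` would cover the two cases asked about above; the output file names are arbitrary.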
Thanks, Tomas. Here I am attaching the output.
Asesh
Hi Tomas,
Does it look like a Mathematica WSTP issue?
You may perhaps remember (I also shared this with Pat a few weeks ago) that there was an issue with the broken shared WSTP library in Mathematica 10, which is the version I have on this computer.
To circumvent that, I replaced the entry "libWSTP64i4.so" in the file "build/CMakeCache.txt" with "libWSTP64i4.a", as shown below.
//WSTP library to link against.
Mathematica_WSTP_LIBRARY:FILEPATH=/usr/local/Wolfram/Mathematica/10.0/SystemFiles/Links/WSTP/DeveloperKit/Linux-x86-64/CompilerAdditions/libWSTP64i4.a
The build went smoothly with this replacement and I was asked to post this solution on github. However, since I couldn't yet run gambit successfully, I postponed posting the solution.
Kindly take a close look into this aspect as well.
Cheers. Asesh
Ah, yes, it is definitely a Mathematica issue. Fortunately to run gambit you don't need Mathematica, you only need it to run gum. So when compiling gambit you could do
cmake -Ditch="Mathematica" ..
and then you can recompile with make
and try running it. With this you will (hopefully) get gambit working.
Okay, Tomas. Let me try it out by ditching Mathematica.
Will let you know. Asesh
Right, Tomas. By ditching Mathematica, gambit runs without a segfault. But, as you say, I cannot build gum.
Could you suggest a work-around with Mathematica 10 so that I could keep it in the build process and thereby work with gum successfully?
As we find, my earlier naive fix of replacing the shared WSTP library with its static counterpart didn't finally help!
Thanks. Asesh
I think it's because you are changing the location of the library in `CMakeCache.txt`. That file is automatically generated when you run `cmake`, so any changes to that file will be ignored. But your suggestion of using the static library should work anyway, you just need to pass it as a cmake flag. So when building gum run
cmake -DMathematica_WSTP_LIBRARY=<path_to_your_libWSTP64i4.a> ..
and then make gum with `make`.
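After re-running cmake with the flag, one way to check which library was actually recorded is to read the entry back from the generated cache rather than editing it. The helper below is a sketch: the name `cached_value` is hypothetical, and it relies only on the `NAME:TYPE=VALUE` layout of cache entries shown earlier in this thread.

```shell
# Print the value stored for a variable in a CMake cache file.
# Cache entries have the form NAME:TYPE=VALUE; comment lines start with // or #.
# The second argument defaults to CMakeCache.txt in the current directory.
cached_value() {
    var_name="$1"
    cache_file="${2:-CMakeCache.txt}"
    sed -n "s/^${var_name}:[A-Z]*=//p" "$cache_file"
}
```

Run from the gum build directory, `cached_value Mathematica_WSTP_LIBRARY` should then print the path to `libWSTP64i4.a` if the flag was picked up.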
Hi Tomas,
Thanks for your observations.
Using "cmake -Ditch="Mathematica" .. " during gambit build and employing
cmake -DMathematica_WSTP_LIBRARY=/usr/local/Wolfram/Mathematica/10.0/SystemFiles/Links/WSTP/DeveloperKit/Linux-x86-64/CompilerAdditions/libWSTP64i4.a ..
during gum build, I end up with the following error message from an mpirun of gambit. The segfault no longer appears, though.
I followed the directives given in the bottom of the right column on page 19 of arXiv:2107.00030 for the MDMSM model and hence only built "micromegas_MDMSM", "calchep", "gamlike" and "ddcalc". The message is talking about DarkSUSY.
Could you please shed some light.
Thanks. Asesh gambit-MDMSM-log.txt
That seems to imply that you also need to build `darksusy`. I don't think that is intended, as you already have `micromegas`. Let me investigate further. But if you want to continue with the testing, just run
make darksusy_generic_wimp
According to the version of `MDMSM_Tute.yaml` that made it to version 2.0.0 of gambit, it is intended, as the function `SimYieldTable_DarkSUSY` is chosen to fulfil the capability `SimYieldTable` for the gamma ray yields that are required by `lnL_FermiLATdwarfs`.
It might be a typo and it should be `SimYieldTable_MicrOmegas`. In that case there should be no need for darksusy as far as I see it.
Yes, I just figured that out myself too. @aseshkdatta please just replace `SimYieldTable_DarkSUSY` with `SimYieldTable_MicrOmegas` in `MDMSM_Tute.yaml` and try with that.
I will modify that on master so that it gets out in the next release.
Hi, it's giving an error saying
Error reading Inifile "yaml_files/MDMSM_Tute.yaml"! Please check that file exist! (yaml-cpp error: yaml-cpp: error at line 127, column 17: illegal map value )
Attaching the full message herewith.
Asesh gambit-run-yaml-file-with-SimYieldTable_MicrOmegas.txt
Can you send the yaml file so that I can see where the error is?
Here is the yaml file. I added ".txt" extension so that it can be uploaded to this page.
Asesh
Ok, in line 127 you need to make sure that the indentations match. So `function` should fall directly below `capability`, using only spaces, not tabs.
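Since tabs are hard to spot by eye, a quick check like the one below can list the offending lines before re-running gambit. It is a generic sketch, not part of GAMBIT, and the function name `find_tabs` is made up for illustration.

```shell
# List every line of a file that contains a tab character, prefixed with its
# line number. YAML does not allow tabs for indentation.
# Returns a non-zero status when no tab is found.
find_tabs() {
    tab=$(printf '\t')
    grep -n "$tab" "$1"
}
```

For instance, `find_tabs yaml_files/MDMSM_Tute.yaml` prints nothing once every tab has been replaced by spaces.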
Thanks for pointing that out. I corrected that and ran gambit again. The error message looks like the following. The detailed one is attached herewith.
Asesh
/home/asesh/Packages/gambit_2.0/Backends/installed/calchep/3.6.27/sbin/newProcess: 87: exit: Illegal number: -1
Likelihood contribution from DarkBit::lnL_oh2_upperlimit: -196.799
Likelihood contribution from DarkBit::LUX_2016_GetLogLikelihood: -1.46732
Likelihood contribution from DarkBit::XENON1T_2018_GetLogLikelihood: -3.64953
PROCESS: ~chi,~chi -> mu+,mu-
This particle is absent in the model
Can not compile ~chi,~chi -> mu+,mu-
[albert:09020] Process received signal
[albert:09020] Signal: Segmentation fault (11)
[albert:09020] Signal code: Address not mapped (1)
[albert:09020] Failing at address: 0x28
[albert:09020] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12980)[0x7f184caa7980]
[albert:09020] [ 1] /home/asesh/Packages/gambit_2.0/Backends/installed/calchep/3.6.27/lib/libcalchep.so(passParameters+0x73)[0x7f17bd6b1fdf]
[albert:09020] [ 2] gambit(+0x15ecb18)[0x55f206ebbb18]
gambit-run-corrected-yaml-file-SimYieldTable_MicrOmegas.txt
It looks like something is wrong with calchep. Please try nuking it and rebuilding it:
make nuke-calchep
make calchep
Thanks, Tomas. Do I need to make gambit after that?
Asesh
No, just calchep this time
Here is the error message:
theta12: 0.58376
theta13: 0.15495
theta23: 0.76958
nuclear_params_sigmas_sigmal: deltad: -0.427 deltas: -0.085 deltau: 0.842 sigmal: 58 sigmas: 43
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 1.
rank 2: FinalizeWithTimeout failed to sync for clean MPI shutdown, calling MPI_Abort...
rank 2: Issuing MPI_Abort command, attempting to terminate all processes...
[albert:11423] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[albert:11423] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Can you send me the default log? (runs/MDMSM/logs/default.log)
All 4 of them? Or one will suffice?
Asesh
From your message above, it looks like rank 1 was the one that found the error, so send me rank 1
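The rank can also be read off mechanically from saved mpirun output, which helps when the abort line is buried in a long run. This sketch only matches the "MPI_ABORT was invoked on rank N" line quoted above; the function name `aborting_rank` is hypothetical.

```shell
# Read mpirun output on standard input and print the rank number from each
# "MPI_ABORT was invoked on rank N" line.
aborting_rank() {
    sed -n 's/.*MPI_ABORT was invoked on rank \([0-9][0-9]*\).*/\1/p'
}
```

If the terminal output was saved to a file, `aborting_rank < saved_output.txt` prints the rank whose log is worth sending first; the file name is only an example.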
Here it is.... default.log_1.txt
Hi,
The example mpirun of gambit suggested on page 21 of the GUM paper on arXiv (2107.00030) leads to a segmentation fault on a particular (desktop) machine. The message is as follows.
Could you please help.
Thanks and regards. Asesh
asesh@albert:~/MyPlace/Packages/gambit_2.0$ time mpirun -n 4 gambit -f yaml_files/MDMSM_Tute.yaml
GAMBIT 2.0.0 http://gambit.hepforge.org
GAMBIT 2.0.0 http://gambit.hepforge.org
GAMBIT 2.0.0 http://gambit.hepforge.org
GAMBIT 2.0.0 http://gambit.hepforge.org
mpirun noticed that process rank 3 with PID 0 on node albert exited on signal 11 (Segmentation fault).
real 0m5.359s
user 0m6.412s
sys 0m0.089s
asesh@albert:~/MyPlace/Packages/gambit_2.0$