Closed: hrbeagle closed this issue 7 years ago
OK, I haven't been able to reproduce this error, so I currently can't debug it directly. I am looking at recompiling with different gdb flags to get a more informative stack trace.
Also, it might be a good idea to collect data: @adelave1, from other issues your OS is OS X El Capitan, Version 10.11.6. Please can you let me know the Docker Toolbox version you're using, as well as the version of VirtualBox?
As for my own system, I'm also using OS X El Capitan, Version 10.11.6, and cannot reproduce the problem using Docker platform Version 1.12.0-rc4-beta20.
I will try to reproduce the error using the exact version combination.
@eclake - I'm using the Docker platform, Version 1.12.0-beta22, and VirtualBox 5.1.2.
By following the backtrace, it looks like the error arises from these lines of code in MultiNest:
allocate(work(lwork),iwork(liwork))
call DSYEVR( 'V', 'A', 'U', n, a, n, vl, vu, il, iu, &
abstol, m, diag, Z, n, isuppz, work, lwork, iwork, liwork, ierr )
a=z
deallocate(work,iwork)
The subroutine DSYEVR is a LAPACK routine.
I have never seen such an error (Illegal instruction) before, but quoting from this page:
An illegal instruction error normally means you are using a version of gfortran on the wrong architecture (ie., i686 on amd64).
The reason may be the use of OpenBLAS instead of a standard LAPACK/BLAS library...OpenBLAS optimizes LAPACK on a particular architecture, so, perhaps, this creates issues on different architectures...
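One way to probe the architecture hypothesis is to list which instruction-set extensions the CPU visible inside the container advertises, and compare that with what an optimized BLAS build may assume. The following is a minimal sketch, assuming a Linux container where /proc/cpuinfo is readable. The list of extensions is illustrative only (which ones a given OpenBLAS build actually needs is not stated in this thread), and the function names are hypothetical.

```python
from pathlib import Path

# Illustrative list of SIMD extensions. Which ones a particular OpenBLAS
# build really requires is an assumption, not something from this thread.
EXTENSIONS_OF_INTEREST = ("sse4_2", "avx", "avx2", "fma", "avx512f")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_extensions(cpuinfo_text, wanted=EXTENSIONS_OF_INTEREST):
    """List the wanted extensions that the CPU does not advertise."""
    flags = parse_cpu_flags(cpuinfo_text)
    return [ext for ext in wanted if ext not in flags]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("Not advertised:", missing_extensions(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found (not a Linux system?)")
```

Running this inside the container on a machine that crashes and on one that does not would show whether the two virtual CPUs expose different extensions. A difference would be consistent with the OpenBLAS explanation but would not prove it.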
@eclake Since we are planning (tonight) a new Docker-Beagle release, it may be worth linking against LAPACK instead of OpenBLAS, leaving OpenBLAS only when we compile Beagle.
The best way to proceed for this would be:
remove libopenblas and the corresponding header files (cblas.h, f77blas.h, openblas_config.h);
install LAPACK with Homebrew. If you type
brew info homebrew/dupes/lapack
you should get something like
homebrew/dupes/lapack: stable 3.6.0 (bottled) [keg-only]
Linear Algebra PACKage
http://www.netlib.org/lapack/
Not installed
From: https://github.com/Homebrew/homebrew-dupes/blob/master/lapack.rb
then you can simply run brew install homebrew/dupes/lapack or brew install lapack;
copy the library (liblapack.a) to the <install>/lib folder, renaming liblapack.a ---> libopenblas.a
<install>/include
Let me know if you need any help!!
Hi @adelave1 - I have pushed a new Docker image to Docker Hub following @jacopo-chevallard's suggestions. You can pull it with
docker login
docker pull eclake/beagle:0.5.7_test
When you run it, please also use the full name plus version for the Docker image.
Please could you try running this again with the new image to see if it solves the problem? Thanks!
@rmastand can you try the above steps as well?
@eclake, sorry for the delay! Using beagle:0.5.7_test, I get the same type of error for both fit_spectrum_example.param:
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list eclake/beagle:0.5.7_test 1 /opt/BEAGLE/params/fit_spectrum_example.param
All the templates read.
---> fixed sfh_type 0
---> fitted mass 0
---> fixed current_sfr_timescale 0
---> fixed attenuation_type 0
---> fitted tauV_eff 0
---> fixed mu 0
---> fitted tau 1
---> fitted metallicity 1
---> fitted specific_sfr 1
---> fitted formation_redshift 1
n_fitted: 6
fileName: /opt/BEAGLE/data/spectra/example_spec_0.fits
*****************************************************
MultiNest v3.9
Copyright Farhan Feroz & Mike Hobson
Release Oct 2014
no. of live points = 90
dimensionality = 6
*****************************************************
Starting MultiNest
generating live points
Reading Filter File: /opt/BEAGLE/build/FILTERBIN.RES
271 filters defined, out of 500 maximum ...done
live points generated, starting sampling
Acceptance Rate: 0.939597
Replacements: 140
Total Samples: 149
Nested Sampling ln(Z): **************
Program received signal SIGILL: Illegal instruction.
Backtrace for this error:
#0 0x7F1C1E344E08
#1 0x7F1C1E343F90
#2 0x7F1C1DA7649F
#3 0x7F1C8410A562
#4 0x7F1C8410AB44
#5 0x7F1C83FBDF0B
#6 0x7F1C8443C249
#7 0x7F1C8445CB0E
#8 0x7F1C844608DE
#9 0x7F1C84463627
#10 0x7F1C8448F475
#11 0x7775D8 in __utils1_MOD_diagonalize at utils1.f90:58 (discriminator 44)
#12 0x778669 in __utils1_MOD_calcellprop at utils1.f90:359
#13 0x74590D in __xmeans_clstr_MOD_dinosaur at xmeans_clstr.f90:1988
#14 0x721EDE in __nested_MOD_clusterednest at nested.F90:1451
#15 0x73040E in __nested_MOD_nestsample at nested.F90:365
#16 0x7312C4 in __nested_MOD_nestrun at nested.F90:239
#17 0x6FB8EC in __nested_sampling_MOD_run_nested_sampling at nested_sampling.f90:134
#18 0x6F8522 in __prosit_MOD_sample_pdf at PROSIT.f90:314
#19 0x406F8F in MAIN__ at BEAGLE.f90:465
and fit_photometry_example.param:
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list eclake/beagle:0.5.7_test 1 /opt/BEAGLE/params/fit_photometry_example.param
All the templates read.
Reading Filter File: /opt/BEAGLE/build/FILTERBIN.RES
271 filters defined, out of 500 maximum ...done
---> fixed sfh_type 0
---> fitted mass 0
---> fitted redshift 0
---> fixed attenuation_type 0
---> fitted tauV_eff 0
---> fixed mu 0
---> fitted tau 1
---> fitted metallicity 1
n_fitted: 5
*****************************************************
MultiNest v3.9
Copyright Farhan Feroz & Mike Hobson
Release Oct 2014
no. of live points = 150
dimensionality = 5
*****************************************************
Starting MultiNest
generating live points
--- LINEAR: X0 = 2.293E+02 is outside X range --- 9.000E+02 6.010E+04 1666
--- Error reported only once. It may occur more than once. ---
live points generated, starting sampling
Acceptance Rate: 0.970874
Replacements: 200
Total Samples: 206
Nested Sampling ln(Z): -232.109725
Acceptance Rate: 0.871080
Replacements: 250
Total Samples: 287
Nested Sampling ln(Z): -230.201445
Acceptance Rate: 0.735294
Replacements: 300
Total Samples: 408
Nested Sampling ln(Z): -229.678528
Program received signal SIGILL: Illegal instruction.
Backtrace for this error:
#0 0x7F252FEA1E08
#1 0x7F252FEA0F90
#2 0x7F252F5D349F
#3 0x7F2595C67562
#4 0x7F2595C67B44
#5 0x7F2595B1AF0B
#6 0x7F2595F99249
#7 0x7F2595FB9B0E
#8 0x7F2595FBD8DE
#9 0x7F2595FC0627
#10 0x7F2595FEC475
#11 0x7775D8 in __utils1_MOD_diagonalize at utils1.f90:58 (discriminator 44)
#12 0x778669 in __utils1_MOD_calcellprop at utils1.f90:359
#13 0x74590D in __xmeans_clstr_MOD_dinosaur at xmeans_clstr.f90:1988
#14 0x721EDE in __nested_MOD_clusterednest at nested.F90:1451
#15 0x73040E in __nested_MOD_nestsample at nested.F90:365
#16 0x7312C4 in __nested_MOD_nestrun at nested.F90:239
#17 0x6FB8EC in __nested_sampling_MOD_run_nested_sampling at nested_sampling.f90:134
#18 0x6F8522 in __prosit_MOD_sample_pdf at PROSIT.f90:314
#19 0x406F8F in MAIN__ at BEAGLE.f90:465
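Since the same backtrace is pasted for several runs, it can help to reduce each one to the first frame that carries symbol information. The following is a minimal sketch, assuming the gfortran backtrace layout shown in this thread (`#N 0xADDR in symbol at file:line`). The helper names are hypothetical and are not part of BEAGLE or MultiNest.

```python
import re

# Matches frames such as:
#   #10 0x7F1C8448F475
#   #11 0x7775D8 in __utils1_MOD_diagonalize at utils1.f90:58 (discriminator 44)
# The "in <symbol> at <file>:<line>" part is optional.
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+0x(?P<addr>[0-9A-Fa-f]+)"
    r"(?:\s+in\s+(?P<symbol>\S+)\s+at\s+(?P<file>[^:\s]+):(?P<line>\d+))?"
)


def parse_frames(backtrace_text):
    """Return one dict per recognised frame line, in the order they appear."""
    frames = []
    for raw in backtrace_text.splitlines():
        match = FRAME_RE.match(raw.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def first_symbolized_frame(backtrace_text):
    """Return the first frame that has a symbol name, or None if there is none."""
    for frame in parse_frames(backtrace_text):
        if frame["symbol"] is not None:
            return frame
    return None


sample = """#10 0x7F1C8448F475
#11 0x7775D8 in __utils1_MOD_diagonalize at utils1.f90:58 (discriminator 44)
#12 0x778669 in __utils1_MOD_calcellprop at utils1.f90:359"""

frame = first_symbolized_frame(sample)
print(frame["symbol"], frame["file"], frame["line"])
```

Applied to the logs above, both runs point to the same first symbolized frame, the diagonalize routine in utils1.f90 at line 58. The frames above it carry no symbol information in these logs.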
@eclake @adelave1 what are the differences in your system configurations? You run the same OS X, and Emma couldn't reproduce the error using the same VirtualBox and Docker versions? At this point I would rather look into Docker / VirtualBox...
For instance, Alex, did you double-check the virtual memory? Emma, can you compare your VirtualBox settings with Alex's?
Docker platform doesn't use VirtualBox directly: one has to increase the memory through the Docker icon and not through VirtualBox. My settings are currently:
The Docker platform is in beta, so it might be worth trying with Docker Toolbox instead, although I'm pretty sure the original issue posted in this thread was also posted from Docker Toolbox.
If that doesn't work, I have used this before to let someone install software for me remotely before going observing. @adelave1, if Docker Toolbox doesn't work, is that something you might be able to install so that I can diagnose the problem, given that I can't reproduce it? (I just used the free trial and uninstalled it as soon as the software installation was successful.) If your laptop is an institute laptop then that probably wouldn't be an option, but maybe @hrbeagle would let me take a look at her laptop when I'm back in Paris?
@eclake, here are my results using Docker Toolbox for fit_spectrum_example.param:
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list eclake/beagle:0.5.7_test 1 /opt/BEAGLE/params/fit_spectrum_example.param
All the templates read.
---> fixed sfh_type 0
---> fitted mass 0
---> fixed current_sfr_timescale 0
---> fixed attenuation_type 0
---> fitted tauV_eff 0
---> fixed mu 0
---> fitted tau 1
---> fitted metallicity 1
---> fitted specific_sfr 1
---> fitted formation_redshift 1
n_fitted: 6
fileName: /opt/BEAGLE/data/spectra/example_spec_0.fits
*****************************************************
MultiNest v3.9
Copyright Farhan Feroz & Mike Hobson
Release Oct 2014
no. of live points = 90
dimensionality = 6
*****************************************************
Starting MultiNest
generating live points
Reading Filter File: /opt/BEAGLE/build/FILTERBIN.RES
271 filters defined, out of 500 maximum ...done
live points generated, starting sampling
Acceptance Rate: 0.939597
Replacements: 140
Total Samples: 149
Nested Sampling ln(Z): **************
Program received signal SIGILL: Illegal instruction.
Backtrace for this error:
#0 0x7f72f410ccc2
#1 0x7f72f410bf90
#2 0x7f72f383e49f
#3 0x7f7359ed2562
#4 0x7f7359ed2b44
#5 0x7f7359d85f0b
#6 0x7f735a204249
#7 0x7f735a224b0e
#8 0x7f735a2288de
#9 0x7f735a22b627
#10 0x7f735a257475
#11 0x7775d8
#12 0x778669
#13 0x74590d
#14 0x721ede
#15 0x73040e
#16 0x7312c4
#17 0x6fb8ec
#18 0x6f8522
#19 0x406f8f
#20 0x4055ec
#21 0x7f72f382982f
#22 0x405628
#23 0xffffffffffffffff
and for fit_photometry_example.param:
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list eclake/beagle:0.5.7_test 1 /opt/BEAGLE/params/fit_photometry_example.param
All the templates read.
Reading Filter File: /opt/BEAGLE/build/FILTERBIN.RES
271 filters defined, out of 500 maximum ...done
---> fixed sfh_type 0
---> fitted mass 0
---> fitted redshift 0
---> fixed attenuation_type 0
---> fitted tauV_eff 0
---> fixed mu 0
---> fitted tau 1
---> fitted metallicity 1
n_fitted: 5
*****************************************************
MultiNest v3.9
Copyright Farhan Feroz & Mike Hobson
Release Oct 2014
no. of live points = 150
dimensionality = 5
*****************************************************
Starting MultiNest
generating live points
--- LINEAR: X0 = 2.293E+02 is outside X range --- 9.000E+02 6.010E+04 1666
--- Error reported only once. It may occur more than once. ---
live points generated, starting sampling
Acceptance Rate: 0.970874
Replacements: 200
Total Samples: 206
Nested Sampling ln(Z): -232.109725
Acceptance Rate: 0.871080
Replacements: 250
Total Samples: 287
Nested Sampling ln(Z): -230.201445
Acceptance Rate: 0.735294
Replacements: 300
Total Samples: 408
Nested Sampling ln(Z): -229.678528
Program received signal SIGILL: Illegal instruction.
Backtrace for this error:
#0 0x7F55D1922E08
#1 0x7F55D1921F90
#2 0x7F55D105449F
#3 0x7F56376E8562
#4 0x7F56376E8B44
#5 0x7F563759BF0B
#6 0x7F5637A1A249
#7 0x7F5637A3AB0E
#8 0x7F5637A3E8DE
#9 0x7F5637A41627
#10 0x7F5637A6D475
#11 0x7775D8 in __utils1_MOD_diagonalize at utils1.f90:58 (discriminator 44)
#12 0x778669 in __utils1_MOD_calcellprop at utils1.f90:359
#13 0x74590D in __xmeans_clstr_MOD_dinosaur at xmeans_clstr.f90:1988
#14 0x721EDE in __nested_MOD_clusterednest at nested.F90:1451
#15 0x73040E in __nested_MOD_nestsample at nested.F90:365
#16 0x7312C4 in __nested_MOD_nestrun at nested.F90:239
#17 0x6FB8EC in __nested_sampling_MOD_run_nested_sampling at nested_sampling.f90:134
#18 0x6F8522 in __prosit_MOD_sample_pdf at PROSIT.f90:314
#19 0x406F8F in MAIN__ at BEAGLE.f90:465
I'm also running Docker beta using 1 CPU and 8GB of memory. I've downloaded GoToMyPC if you'd like to diagnose the issue on my laptop. Would you need my GoToMyPC password? You can email me at adelave1@jhu.edu if necessary.
@jacopo-chevallard Still having problems; I reinstalled the docker platform, virtualbox, etc, but I still think the problem is in actually pulling the code.
Thank you @adelave1 - I'll be in touch to organise!
Current progress on solving this issue (thanks so much for letting me use your laptop, @adelave1):
My installation of LAPACK in the docker image did not include the BLAS libraries - I think that wasn't causing a compilation error because I had only removed libopenblas.a from /BEAGLE_install/lib but not the shared libraries.
For information, @jacopo-chevallard: it's not simple to use Homebrew on a Docker image. It requires installing Homebrew, which adds to the bloat of the image, so apt-get install or git clone and compile from source are the best ways to go for any suggested installations. Maybe your Homebrew LAPACK installation contained the BLAS libraries? I ended up researching how to install LAPACK using apt-get install, but the packages I found didn't come with the BLAS libraries.
I've found an install that includes the BLAS libraries that I hope will be architecture-independent (still to be tested). Encountering some linking issues at the moment, but can work on solving them on my own machine with @jacopo-chevallard's help and make a new test image for @adelave1 to try.
@adelave1 - please could you try pulling and running eclake/beagle:0.5.7_test2 - if that one works then I'll make a v0.5.8 image that will work for you too!
@eclake - sorry for the delay! I pulled eclake/beagle:0.5.7_test2 and got the same error as before with fit_spectrum_example.param:
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list eclake/beagle:0.5.7_test2 1 /opt/BEAGLE/params/fit_spectrum_example.param
All the templates read.
---> fixed sfh_type 0
---> fitted mass 0
---> fixed current_sfr_timescale 0
---> fixed attenuation_type 0
---> fitted tauV_eff 0
---> fixed mu 0
---> fitted tau 1
---> fitted metallicity 1
---> fitted specific_sfr 1
---> fitted formation_redshift 1
n_fitted: 6
fileName: /opt/BEAGLE/data/spectra/example_spec_0.fits
*****************************************************
MultiNest v3.9
Copyright Farhan Feroz & Mike Hobson
Release Oct 2014
no. of live points = 90
dimensionality = 6
*****************************************************
Starting MultiNest
generating live points
Reading Filter File: /opt/BEAGLE/build/FILTERBIN.RES
271 filters defined, out of 500 maximum ...done
live points generated, starting sampling
Acceptance Rate: 0.939597
Replacements: 140
Total Samples: 149
Nested Sampling ln(Z): **************
Program received signal SIGILL: Illegal instruction.
Backtrace for this error:
#0 0x7FCB12707E08
#1 0x7FCB12706F90
#2 0x7FCB11E3949F
#3 0x7FCB784CD562
#4 0x7FCB784CDB44
#5 0x7FCB78380F0B
#6 0x7FCB787FF249
#7 0x7FCB7881FB0E
#8 0x7FCB788238DE
#9 0x7FCB78826627
#10 0x7FCB78852475
#11 0x7775D8 in __utils1_MOD_diagonalize at utils1.f90:58 (discriminator 44)
#12 0x778669 in __utils1_MOD_calcellprop at utils1.f90:359
#13 0x74590D in __xmeans_clstr_MOD_dinosaur at xmeans_clstr.f90:1988
#14 0x721EDE in __nested_MOD_clusterednest at nested.F90:1451
#15 0x73040E in __nested_MOD_nestsample at nested.F90:365
#16 0x7312C4 in __nested_MOD_nestrun at nested.F90:239
#17 0x6FB8EC in __nested_sampling_MOD_run_nested_sampling at nested_sampling.f90:134
#18 0x6F8522 in __prosit_MOD_sample_pdf at PROSIT.f90:314
#19 0x406F8F in MAIN__ at BEAGLE.f90:465
and for fit_photometry_example.param:
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list eclake/beagle:0.5.7_test2 1 /opt/BEAGLE/params/fit_photometry_example.param
All the templates read.
Reading Filter File: /opt/BEAGLE/build/FILTERBIN.RES
271 filters defined, out of 500 maximum ...done
---> fixed sfh_type 0
---> fitted mass 0
---> fitted redshift 0
---> fixed attenuation_type 0
---> fitted tauV_eff 0
---> fixed mu 0
---> fitted tau 1
---> fitted metallicity 1
n_fitted: 5
*****************************************************
MultiNest v3.9
Copyright Farhan Feroz & Mike Hobson
Release Oct 2014
no. of live points = 150
dimensionality = 5
*****************************************************
Starting MultiNest
generating live points
--- LINEAR: X0 = 2.293E+02 is outside X range --- 9.000E+02 6.010E+04 1666
--- Error reported only once. It may occur more than once. ---
live points generated, starting sampling
Acceptance Rate: 0.970874
Replacements: 200
Total Samples: 206
Nested Sampling ln(Z): -232.109725
Acceptance Rate: 0.871080
Replacements: 250
Total Samples: 287
Nested Sampling ln(Z): -230.201445
Acceptance Rate: 0.735294
Replacements: 300
Total Samples: 408
Nested Sampling ln(Z): -229.678528
Program received signal SIGILL: Illegal instruction.
Backtrace for this error:
#0 0x7FF526172E08
#1 0x7FF526171F90
#2 0x7FF5258A449F
#3 0x7FF58BF38562
#4 0x7FF58BF38B44
#5 0x7FF58BDEBF0B
#6 0x7FF58C26A249
#7 0x7FF58C28AB0E
#8 0x7FF58C28E8DE
#9 0x7FF58C291627
#10 0x7FF58C2BD475
#11 0x7775D8 in __utils1_MOD_diagonalize at utils1.f90:58 (discriminator 44)
#12 0x778669 in __utils1_MOD_calcellprop at utils1.f90:359
#13 0x74590D in __xmeans_clstr_MOD_dinosaur at xmeans_clstr.f90:1988
#14 0x721EDE in __nested_MOD_clusterednest at nested.F90:1451
#15 0x73040E in __nested_MOD_nestsample at nested.F90:365
#16 0x7312C4 in __nested_MOD_nestrun at nested.F90:239
#17 0x6FB8EC in __nested_sampling_MOD_run_nested_sampling at nested_sampling.f90:134
#18 0x6F8522 in __prosit_MOD_sample_pdf at PROSIT.f90:314
#19 0x406F8F in MAIN__ at BEAGLE.f90:465
Too bad... I'll prepare a test of MultiNest alone, outside Beagle, so we can try to understand the origin of the problem in a simpler way...
@alex-delavega can you (re)try by replacing eclake/beagle:<tag> with beagletool/beagle:<tag>, where <tag> is 0.7.1, i.e. you have to pull
docker pull beagletool/beagle:0.7.1
@jacopo-chevallard, I tried your suggestion and received the following error:
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list beagletool/beagle:0.7.1 1 /opt/BEAGLE/params/fit_spectrum_example.param
BEAGLE: error: switch "1" is unknown!
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[ERROR: Error during parsing of command line arguments! ]
[FUNCTION: "main" ]
[MODULE: "BEAGLE" ]
[BACKTRACE:
#0 0x7F3B23433E08
#1 0x6319FC in print_backtrace at lib_messages.f90:330
#2 0x407474 in MAIN__ at BEAGLE.f90:213
]
[ ***************************************** ]
[ ----> EXITING FUNCTION / SUBROUTINE <---- ]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[ ***** STOPPING PROGRAM ***** ]
Oops, I forgot to mention that we slightly changed the command line interface... you should try with
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list beagletool/beagle:0.7.1 --parameter-file /opt/BEAGLE/params/fit_spectrum_example.param --fit
@jacopo-chevallard, I still seem to be having issues:
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list beagletool/beagle:0.7.1 --parameter-file /opt/BEAGLE/params/fit_spectrum_example.param --fit
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[ERROR: Argument # 1 (i.e. program type) missing or wrong. ]
[FUNCTION: "BEAGLE" ]
[BACKTRACE:
#0 0x7F84412DCE08
#1 0x6319FC in print_backtrace at lib_messages.f90:330
#2 0x407705 in MAIN__ at BEAGLE.f90:260
]
[ ***************************************** ]
[ ----> EXITING FUNCTION / SUBROUTINE <---- ]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[ ***** STOPPING PROGRAM ***** ]
Again, my fault: I had forgotten to remove the old command line interface from the program...
You should try with version 0.7.2, i.e. pulling
docker pull beagletool/beagle:0.7.2
@jacopo-chevallard, success! Worked for both fit_spectrum_example.param and fit_photometry_example.param!
Fantastic!! Ideally, @hrbeagle will also test the new Beagle-Docker by pulling
docker pull beagletool/beagle:0.7.2
and running
docker run --rm -it -v /Users/vega/Desktop/BEAGLE/BEAGLE-general-master:/opt/BEAGLE --env-file env.list beagletool/beagle:0.7.2 --parameter-file /opt/BEAGLE/params/fit_spectrum_example.param --fit
(changing the path where appropriate)
Closing this, as no new errors of this type have appeared.
Here is the full screen output: