Closed vivek-bala closed 7 years ago
There is no error reported during the above compilation sequence.
Hi, @wjlei1990 Can you post the latest configuration script you are using for titan in a gist and link it in this issue? Thanks
@vivek-bala
> There is no error reported during the above compilation sequence.
What do you mean? Is it working now?
What you encounter is a result of specfem3d_globe having some parameters "hard-compiled". The idea behind that is to "speed things up" by allowing the compiler to trim the execution path.
When you compile, it runs an executable called ./bin/xcreate_header_file that generates a file called OUTPUT_FILES/values_from_mesher.h with, for instance, "logical, parameter :: TRANSVERSE_ISOTROPY_VAL = .false." in it. This file is then included in the list of source files required to compile the mesher and the solver.
TRANSVERSE_ISOTROPY_VAL is set up during the header creation and depends on the value of MODEL = [name your model] in DATA/Par_file.
If you decide to change the model after the compilation, you might end up with conflicts between the value read from the Par_file at run time and the value saved in the header at compile time.
In short, the solution is: Although some parameters in the Par_file can be changed (e.g. RECORD_LENGTH_IN_MINUTES), avoid changing anything in the Par_file after the compilation.
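To see what was actually baked in at compile time, something like the following could be used. It is only a minimal sketch: it assumes the Par_file uses `NAME = value` lines with `#` comments and that the header contains a line of the form quoted above. The helper names `read_par_value` and `read_header_logical` and the model name `my_model` are made up for illustration and are not part of specfem3d_globe.

```python
import re


def read_par_value(par_text, key):
    """Return the value assigned to `key` in Par_file-style text, or None."""
    for line in par_text.splitlines():
        line = line.split("#", 1)[0]  # drop comments (assumed to start with '#')
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip()
    return None


def read_header_logical(header_text, name):
    """Return True/False for a `logical, parameter :: NAME = .x.` line, or None."""
    pattern = re.compile(
        r"logical\s*,\s*parameter\s*::\s*"
        + re.escape(name)
        + r"\s*=\s*\.(true|false)\.",
        re.IGNORECASE,
    )
    match = pattern.search(header_text)
    if match is None:
        return None
    return match.group(1).lower() == "true"


if __name__ == "__main__":
    # Hypothetical file contents; in practice read DATA/Par_file and
    # OUTPUT_FILES/values_from_mesher.h from the build directory.
    par_text = "MODEL = my_model  # hypothetical model name\n"
    header_text = "logical, parameter :: TRANSVERSE_ISOTROPY_VAL = .false.\n"
    print("MODEL in Par_file:", read_par_value(par_text, "MODEL"))
    print(
        "TRANSVERSE_ISOTROPY_VAL in header:",
        read_header_logical(header_text, "TRANSVERSE_ISOTROPY_VAL"),
    )
```

Printing both values side by side before a run makes it easier to notice that the Par_file was edited after the header was generated.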
If you need to download / copy specfem to various locations, you want to ensure that DATA/Par_file and setup/constants.h(.in) are the same.
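One way to check this between two copies is to compare file hashes. The sketch below assumes both copies keep the standard layout mentioned above; the function name `mismatched_files` and the directory names in the example are hypothetical. By default it checks setup/constants.h, and setup/constants.h.in can be passed in `relative_paths` instead.

```python
import hashlib
from pathlib import Path

# Files that should be identical across copies (per the advice above).
FILES_TO_MATCH = ("DATA/Par_file", "setup/constants.h")


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def mismatched_files(root_a, root_b, relative_paths=FILES_TO_MATCH):
    """Return relative paths that differ between the two trees or are missing in either."""
    mismatches = []
    for rel in relative_paths:
        a = Path(root_a) / rel
        b = Path(root_b) / rel
        if not (a.is_file() and b.is_file()) or file_digest(a) != file_digest(b):
            mismatches.append(rel)
    return mismatches


if __name__ == "__main__":
    # Hypothetical locations of two specfem3d_globe copies.
    diffs = mismatched_files("specfem_copy_a", "specfem_copy_b")
    if diffs:
        print("Files differ or are missing:", ", ".join(diffs))
    else:
        print("Par_file and constants.h match in both copies.")
```

An empty result only says the two files are byte-identical; it does not confirm that the binaries in either copy were built from them.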
Hey @mpbl, apologies for the delay.
> What do you mean? Is it working now?
I meant that although there is no error during the compilation stage, there seems to be an error during execution which suggests it was not compiled correctly.
> In short, the solution is: Although some parameters in the Par_file can be changed (e.g. RECORD_LENGTH_IN_MINUTES), avoid changing anything in the Par_file after the compilation.
I see what you mean. I didn't make any changes to the Par_file when I tried to execute.
Do you see any errors with the compilation process?
Hey @mpbl, @wjlei1990, please share the compilation instructions for specfem with openmpi when you get a chance.
One moment, I will provide it tonight.
I have a permission issue on this directory:
[lei@rhea-login3g ~]$ ls -alh /lustre/atlas/scratch/vivekb/bip149/radical.pilot.sandbox/rp.session.titan-ext5.vivekb.017261.0014/pilot.0000/unit.000001
ls: cannot access /lustre/atlas/scratch/vivekb/bip149/radical.pilot.sandbox/rp.session.titan-ext5.vivekb.017261.0014/pilot.0000/unit.000001: Permission denied
Can you try again please? I think it should be fixed now.
It seems I still don't have access...
Maybe you can put it on the proj-shared directory? I think I just need to see if there are some errors in your files...
Oops. Ok, I put the folder at /lustre/atlas/world-shared/csc230/rp.session.titan-ext5.vivekb.017261.0014 since it is world shared. Hopefully that works. Permissions look ok.
Thanks. I think now I have access to the directory.
I took a look at some sub-dirs, but most of them are empty. Could you point me to the path to your "Par_file"?
Yes, since one of them failed, the entire process was shut down. I have put the scripts at "/lustre/atlas/world-shared/csc230/fwd_sims/" as well. You can find the Par_file at "/lustre/atlas/world-shared/csc230/fwd_sims/input_data/DATA".
I have prepared an example (compiling specfem3d_globe using only the CPU) at:
/lustre/atlas/world-shared/geo111/wenjie/specfem3d_globe
Please look at the files configure.titan.sh and compile.titan.sh to see how I configure and compile the code. I have run both the mesher and solver successfully on Titan using CPU. However, this version is based on MPICH.
I have another example using our local cluster, which uses openmpi instead. It uses the configure command you listed above:
./configure FC=mpif90 CC=mpicc MPIFC=mpif90
It uses openmpi/gcc/1.8.8/64 and runs both the mesher and solver successfully.
Hi @wjlei1990, I don't have access to all the files (e.g. DATA/Par_file, etc.). Could you give me permissions to all the files in /lustre/atlas/world-shared/geo111/wenjie/specfem3d_globe please?
config.status: error: cannot find input file: `DATA/Par_file'
Hi Vivek, I changed the permission. Let me know if it works for you.
Thanks everyone for the help.
We have a working example that uses RP to submit meshfem and specfem tasks that use CPUs on Titan. Instructions are available here if anyone wants to give it a try.
This example uses the meshfem and specfem binaries built against the Titan MPI modules and not the RP openmpi. The plan is to use RP in the aprun mode, as that suffices for the simulation stages to be run on Titan. This avoids the restriction of having to compile all tools against RP openmpi.
The plan is to use the same mode for the experiments.
When trying to use the specfem3D binary, I encounter the following error:
It can be recreated by running radical_pilot_cu_launch_script.sh at /lustre/atlas/scratch/vivekb/bip149/radical.pilot.sandbox/rp.session.titan-ext5.vivekb.017261.0014/pilot.0000/unit.000001. Please be sure to be on the compute node (you can use the interactive jobs to do so). I think I have set global permission for all the files; let me know if you face permission issues.
Currently, the following is the sequence of commands used to compile specfem on titan.
also tried the following