HomerReid / scuff-em

A comprehensive and full-featured computational physics suite for boundary-element analysis of electromagnetic scattering, fluctuation-induced phenomena (Casimir forces and radiative heat transfer), nanophotonics, RF device engineering, electrostatics, and more. Includes a core library with C++ and python APIs as well as many command-line applications.
http://www.homerreid.com/scuff-em
GNU General Public License v2.0

[scuff-neq] error: could not open file SiO2.dat (aborting) #169

Open omerchiers opened 6 years ago

omerchiers commented 6 years ago

Hello,

I hope this is the right place to post this; it could also be a julia-related issue. We are trying to run scuff-neq from inside julia, because it lets us easily generate parallel jobs for multiple frequencies. Maybe this feature is already included in the scuff-em suite, but we haven't found where it is described.

This strategy works fine when the material properties are defined through a dispersion function. A problem arises, however, when we try to read properties from a file containing frequencies and the values of the complex permittivity.

Here is the julia function:

function scuff_job(frequency::Float64, geometry_file::String, output_file::String)
    freq = string(frequency)
    output = output_file * "freq=" * freq * "_radHz"
    run(`scuff-neq --Geometry $geometry_file --Omega $frequency --EMTPFT --FileBase $output`)
end

It simply wraps the scuff-neq program. These are the contents of the geometry_file:

OBJECT  Plan
    MESHFILE /home/path_to_files/Cylinder_mesh2faces_diffv1_R10.msh
    MATERIAL FILE_SiO2.dat
ENDOBJECT

OBJECT  Sphere
    MESHFILE /home/path_to_files/Sphere.msh
    MATERIAL FILE_SiO2.dat
    DISPLACED 0 0 0.9
ENDOBJECT

When running from bash this works fine and scuff-neq finds SiO2.dat correctly. When running from the julia prompt, however, the following error is thrown:

julia> scuff_job(freqfile, geofile ,  "test_par" )
error: could not open file SiO2.dat (aborting)
ERROR: failed process: Process(`scuff-neq --Geometry /home/omerchiers/Documents/Travail/02-Recherche/Travaux/Projets_en_cours/Near_Field_Radiative_Heat_Transfer/05-DEMO-NFR-TPV/Calculs/Fichiers_scuff-em/data/Sphere_Plan.scuffgeo --OmegaFile /home/omerchiers/Documents/Travail/02-Recherche/Travaux/Projets_en_cours/Near_Field_Radiative_Heat_Transfer/05-DEMO-NFR-TPV/Calculs/Fichiers_scuff-em/data/OmegaFile1.txt --EMTPFT --FileBase 'test_parfreq=1.2557677e-002-  3.5155867e-001'`, ProcessExited(1)) [1]
Stacktrace:
 [1] pipeline_error(::Base.Process) at ./process.jl:682
 [2] run(::Cmd) at ./process.jl:651
 [3] scuff_job(::String, ::String, ::String) at /home/omerchiers/Documents/Travail/02-Recherche/Travaux/Projets_en_cours/Near_Field_Radiative_Heat_Transfer/05-DEMO-NFR-TPV/Calculs/Fichiers_scuff-em/src/ScuffTools/run/run_on_cluster.jl:39

The first line of the error message, error: could not open file SiO2.dat (aborting), seems to come from scuff-neq itself. This is quite surprising: scuff-neq clearly knows which file it is looking for, yet it is apparently unable to open it.
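To check whether this is a working-directory issue, we can print what the wrapper sees just before the run call. A quick sketch (not part of our actual script, and assuming scuff-neq resolves the relative FILE_SiO2.dat path against its working directory):

# The spawned scuff-neq process inherits julia's working directory,
# so SiO2.dat must be visible from here for the relative path to resolve.
println("working directory: ", pwd())
println("SiO2.dat visible here: ", isfile("SiO2.dat"))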

Many thanks in advance, Olivier

omerchiers commented 6 years ago

The problem is solved when we run the function from inside the folder where SiO2.dat is kept. But the same problem resurfaces when we try to run the parallel version of the script:

function scuff_parallel(frequencies::AbstractString, geometry_file::AbstractString, output_file::AbstractString)
    freqv = readdlm(frequencies)
    scuff_par(frequency) = scuff_job(frequency, geometry_file, output_file)
    pmap(scuff_par, freqv)
end

This gives the following error:

julia> scuff_parallel(freqfile, geofile ,  "test_par" )
error: could not open file SiO2.dat (aborting)
error: could not open file SiO2.dat (aborting)
error: could not open file SiO2.dat (aborting)
ERROR: On worker 2:
failed process: Process(`scuff-neq --Geometry /home/omerchiers/Documents/Travail/02-Recherche/Travaux/Projets_en_cours/Near_Field_Radiative_Heat_Transfer/05-DEMO-NFR-TPV/Calculs/Fichiers_scuff-em/data/Sphere_Plan.scuffgeo --Omega 0.012557677 --EMTPFT --FileBase test_parfreq=0.012557677_radHz`, ProcessExited(1)) [1]
pipeline_error at ./process.jl:682

The scuffgeo file is the same as the one given in the first post. The first error thrown again seems to come from scuff-neq; somehow the other workers cannot access the file SiO2.dat.
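One workaround we are considering is to pin the working directory around each scuff-neq call, so that every worker resolves SiO2.dat the same way. A rough sketch (the datadir default below is only a placeholder for the folder that actually holds SiO2.dat):

function scuff_job_in_dir(frequency::Float64, geometry_file::String, output_file::String;
                          datadir::String = "/path/to/datafiles")
    output = output_file * "freq=" * string(frequency) * "_radHz"
    # cd(...) do ... end runs the command with datadir as the working directory,
    # so the relative MATERIAL FILE_SiO2.dat path resolves on every worker.
    cd(datadir) do
        run(`scuff-neq --Geometry $geometry_file --Omega $frequency --EMTPFT --FileBase $output`)
    end
end

But it is not clear to us why the workers should end up in a different directory in the first place.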

Many thanks in advance, Olivier

HomerReid commented 6 years ago

Could it be that your julia parallelization wrapper has the effect of running the code in a temporary working directory, in which it can't find the data file? Try specifying an absolute path to the data file, i.e.

 MATERIAL FILE_/home/username/matfiles/SiO2.dat

Also, are you sure that doing the parallelization this way is optimal? For multiple-frequency calculations on a single workstation I have always found it best to do one frequency at a time, using all available CPU cores for that frequency. It's hard to believe that running multiple frequencies simultaneously with one core per frequency could possibly be faster. (Among other things, you won't get the benefit of cache reuse on the second and subsequent frequencies.)
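Something like the following sketch would do that, reusing your scuff_job wrapper (it assumes one real frequency per line in your frequency file; note also that scuff-neq accepts an --OmegaFile argument, as in your first error message, which loops over the frequencies in a single invocation):

using DelimitedFiles   # readdlm lives here on julia >= 0.7

function scuff_serial(frequencies::AbstractString, geometry_file::AbstractString, output_file::AbstractString)
    freqv = vec(readdlm(frequencies))        # one frequency per line
    for frequency in freqv
        # each scuff-neq run gets the whole machine to itself
        scuff_job(frequency, geometry_file, output_file)
    end
end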

Could it be that your compilation or run-time environment is not optimized to use all CPU cores? You can test this by following the procedure outlined here.
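For example, assuming scuff-em was built with OpenMP (so that it honors OMP_NUM_THREADS), you can pin the thread count from inside julia, since the child process inherits julia's environment. The geometry and frequency below are taken from your run above; the FileBase is arbitrary:

withenv("OMP_NUM_THREADS" => "32") do
    run(`scuff-neq --Geometry Sphere_Plan.scuffgeo --Omega 0.012557677 --EMTPFT --FileBase test_omp`)
end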

omerchiers commented 6 years ago

Hello, sorry for my late reply; I have been away for some time. Thanks a lot for looking into this.

Could it be that your julia parallelization wrapper has the effect of running the code in a temporary working directory, in which it can't find the data file? Try specifying an absolute path to the data file, i.e.

MATERIAL FILE_/home/username/matfiles/SiO2.dat

That's what I thought too, but specifying the absolute path as you suggest does not solve the problem.
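One more thing I can check from the julia side is whether that absolute path is even visible to the process, e.g.

# the path below is the placeholder from your comment; the real path points to our SiO2.dat
isfile("/home/username/matfiles/SiO2.dat")   # should be true if the path is correct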

As for the parallelization, thanks a lot for pointing this out. On our machine we have two CPUs with 16 cores each, and I would like to at least be able to use both. But maybe pmap is not the appropriate tool for that. I will ask for help on the julia forum, since I have to admit I have very little experience with parallel computations.

Could it be that your compilation or run-time environment is not optimized to use all CPU cores? You can test this by following the procedure outlined here.

Ok, I will run these tests. Many thanks again. Best, Olivier