Closed wilsonmr closed 6 years ago
another example of the message (both from odeintns.f) is
Allowed evolution range [ 1.6500 : 100000.0000 ] GeV
Initialization of the evolution completed in 923.826 s
In odeintns.f:
too many steps!
How is this installed? conda?
Yes, same error both with conda packages and also a dev environment as a cross check
I am going to bet this has something to do with the stack getting corrupted as in https://github.com/NNPDF/apfelcomb/issues/9.
Same kind of errors for me from a different account on the cluster
@wilsonmr @tgiani which theoryID are you running?
we are using theory 53
Thanks, I have just tested and on my machines I don't see this issue at all...
Hi Stefano, Yes @nhartland also couldn't seem to reproduce this error, it seems to be specific to the cluster at Edinburgh.
We started getting this issue only recently, after struggling to get conda to install properly on the cluster, which I eventually worked around by installing a slightly older version of conda (see #132).
Note that conda doesn't do anything but compile the thing and package it. This is very unlikely to depend on the version of the conda installer. Additionally, stack-related errors can innocently write on padded areas and not trigger any signal in some code paths (and we in fact have evidence that this seems to work most of the time). This is where things like ASAN or valgrind come in handy.
I suggested starting with the apfelcomb issue because we know how to reproduce it reliably.
I think I have identified the bug you are getting at the end of the fit. It is related to the usual string global-overflow-leak and a possible solution is in https://github.com/scarrazza/apfel/pull/10.
@wilsonmr @tgiani could you please install this APFEL branch https://github.com/scarrazza/apfel/pull/9, and test nnfit (master) again? If you prefer, you don't need to run the full fit on your cluster: just take a runcard, reduce ngen to something like 10-100 and start nnfit for replica 1. This should take ~1h to finish.
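The "reduce ngen" step could be scripted roughly as follows. This is only a sketch: it assumes the runcard is a YAML text file containing exactly one line of the form `ngen: <integer>`, and the helper name `set_ngen` is made up for illustration.

```python
import re


def set_ngen(runcard_text, ngen):
    """Return the runcard text with the value of the single `ngen:` key replaced.

    Assumption: the key appears once, on its own line, followed by an integer.
    """
    new_text, count = re.subn(
        r"^([ \t]*ngen[ \t]*:[ \t]*)\d+",
        rf"\g<1>{ngen}",
        runcard_text,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError(f"expected exactly one ngen entry, found {count}")
    return new_text
```

Reading the runcard, calling `set_ngen(text, 100)` and writing the result to a new file leaves the rest of the card untouched, so the short test run differs from the real fit only in the number of generations.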
Set fit running, will get back to you within an hour or so with results!
Actually I got some core dumps in my home directory and, despite removing them, the quota isn't updating, so I'm being told I've exceeded my disk space and I can't do anything in my home directory. It might take a little while longer to get this running unless @tgiani has more luck.
Yep, I'm going to have a try later today.
Thanks, let me know.
I have some problems as well, sorry. Working on that.
Actually I don't know if I'm doing it right. I'm doing the conda development installation:
1) conda create -n test gxx_linux-64
2) source activate test
3) conda install validphys nnpdf
4) conda remove validphys --force
5) conda remove libnnpdf --force
6) conda remove nnpdf --force
7) conda install pkg-config swig=3.0.10 cmake (other dependencies which do not $
8) cd nnpdf
9) mkdir conda-bld, cd conda-bld
10) cmake .. -DCMAKE_INSTALL_PREFIX=path/to/anaconda/envs/vp2dev/
with the only difference that after point 6) I'm also doing
conda remove apfel --force
and then I'm installing apfel from source, using the branch fixstringleak. Is this correct? Trying to compile the nnpdf code I'm getting
(test) [s1792848@login04(eddie) conda-bld]$ make
[ 43%] Built target nnpdf
[ 47%] Built target FKconvolute
[ 52%] Built target FKmerge2
[ 54%] Built target gen_nnpdf_nnpdfPYTHON_wrap
[ 58%] Built target _nnpdf
[ 69%] Built target common
[ 76%] Built target filter
[ 78%] Linking CXX executable ../../binaries/nnfit
/exports/csce/eddie/ph/groups/rbm_ml/tommaso/myconda/envs/apfelbug/lib/libAPFEL.so: undefined reference to `memcpy@GLIBC_2.14'
collect2: error: ld returned 1 exit status
make[2]: *** [binaries/nnfit] Error 1
make[1]: *** [nnpdfcpp/src/CMakeFiles/nnfit.dir/all] Error 2
make: *** [all] Error 2
If you're compiling apfel with a conda development environment you also need to get the package gfortran_linux-64, otherwise it will use the default compiler installed with Linux, which I think is why it can't find the library.
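One way to check which Fortran compiler a build would pick up is to see whether the compiler named in `FC` lives inside the active environment. The sketch below assumes that the conda activation scripts export `FC` and that conda sets `CONDA_PREFIX`; the helper name `compiler_in_env` is made up for illustration.

```python
import os
import shutil
from pathlib import Path


def compiler_in_env(compiler, env_prefix):
    """Return True if the path `compiler` is located under `env_prefix`."""
    if not compiler or not env_prefix:
        return False
    compiler_path = Path(compiler).resolve()
    prefix_path = Path(env_prefix).resolve()
    return prefix_path in compiler_path.parents


fc = os.environ.get("FC")                # assumed to be set by gfortran_linux-64
prefix = os.environ.get("CONDA_PREFIX")  # assumed to be set on activation
resolved = shutil.which(fc) if fc else None

print("FC resolves to:", resolved)
print("inside active environment:", compiler_in_env(resolved, prefix))
```

If this prints `False`, the build is probably falling back to the system gfortran, which would be consistent with the `memcpy@GLIBC_2.14` link error shown earlier.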
Great, thanks, I'll try again with that.
I'm getting the same quota problems Michael described above, already when trying to install the code, so basically I cannot do anything in my home directory on the cluster, and the conda installation also looks broken. I'm trying to solve the problem but it could take more time. If useful, I can first test the code locally rather than on the cluster, but the problems with apfel showed up only on the cluster.
Hi Stefano, I have run a fit with the new branch. I appear to be getting one of the errors I was getting before. Is this string related? It looks like an ODE/integration routine that isn't finishing:
In odeintns.f:
too many steps!
Ok, so this is not related to the leak. Could you please send me by mail your runcard?
Yeah sure, actually it's just the fit I asked you to run at CERN, but with ngen turned down. I'll send it over.
Just to cross check again, are you using the master of nnpdf?
yes this was using the fixstringleak branch of apfel and master branch of nnpdf
Could you please run the Tabulation example inside apfel/examples and check if you get the same too many steps message?
I have tested the runcard and I am not able to reproduce this problem.
This sounds like some numerical issue (due to compiler, architecture or something else) on your cluster, so a possible way to sort this out is to open APFEL, print the stored variables produced on your cluster and compare them to the output on another machine where this does not happen.
When you run this code on the cluster, is it crashing on the cluster slaves, or does this happen on the master node too (where you install/set up the code)?
As the point above can be tricky/painful to perform, it would be nice if I could access your cluster. Do you think this is possible?
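The comparison of printed values between two machines could be done with a small script rather than by eye. The sketch below assumes each machine writes whitespace-separated rows of numbers, like the Tabulation tables in this thread; the function names are made up for illustration. As stand-in data it uses the x = 0.9 rows of the "Standard evolution" and "Cached evolution" tables from the Tabulation output in this thread (leading columns only), which come from one machine but happen to differ slightly.

```python
import math


def parse_row(line):
    """Split one whitespace-separated table row into floats."""
    return [float(token) for token in line.split()]


def mismatched_columns(row_a, row_b, rel_tol=1e-6):
    """Return the column indices where two rows differ beyond rel_tol."""
    a, b = parse_row(row_a), parse_row(row_b)
    if len(a) != len(b):
        raise ValueError("rows have different numbers of columns")
    return [
        i
        for i, (x, y) in enumerate(zip(a, b))
        if not math.isclose(x, y, rel_tol=rel_tol)
    ]


# Columns: x, u-ubar, d-dbar, 2(ubr+dbr), c+cbar, gluon
standard = "9.0E-01 1.1975E-03 6.5373E-05 1.0148E-07 1.3555E-08 5.4731E-06"
cached = "9.0E-01 1.1975E-03 6.5373E-05 1.0137E-07 1.3522E-08 5.4731E-06"

print(mismatched_columns(standard, cached, rel_tol=1e-4))  # -> [3, 4]
```

With a looser tolerance such as `rel_tol=1e-2` the same two rows agree, so the tolerance has to be chosen with the expected accuracy of the evolution in mind.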
I'll try the Tabulation example now. It's crashing in the cluster slaves.. it won't let me run it on the master node, saying it can't allocate the memory.
Is running Tabulation example as simple as this?
cd apfel/examples
./Tabulation
in which case I get the output
INFO: activate-gcc_linux-64.sh made the following environmental changes:
+CC=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-cc
+CFLAGS=-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe
+_CONDA_PYTHON_SYSCONFIGDATA_NAME=_sysconfigdata_x86_64_conda_cos6_linux_gnu
+CPP=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-cpp
+CPPFLAGS=-DNDEBUG -D_FORTIFY_SOURCE=2 -O2
+DEBUG_CFLAGS=-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -pipe
+DEBUG_CPPFLAGS=-D_DEBUG -D_FORTIFY_SOURCE=2 -Og
+GCC_AR=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-gcc-ar
+GCC=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-gcc
+GCC_NM=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-gcc-nm
+GCC_RANLIB=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-gcc-ranlib
+HOST=x86_64-conda_cos6-linux-gnu
+LDFLAGS=-Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now
INFO: activate-binutils_linux-64.sh made the following environmental changes:
+ADDR2LINE=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-addr2line
+AR=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-ar
+AS=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-as
+CXXFILT=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-c++filt
+ELFEDIT=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-elfedit
+GPROF=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-gprof
+LD=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-ld
+LD_GOLD=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-ld.gold
+NM=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-nm
+OBJCOPY=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-objcopy
+OBJDUMP=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-objdump
+RANLIB=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-ranlib
+READELF=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-readelf
+SIZE=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-size
+STRINGS=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-strings
+STRIP=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-strip
INFO: activate-gxx_linux-64.sh made the following environmental changes:
+CXX=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-c++
+CXXFLAGS=-fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe
+DEBUG_CXXFLAGS=-fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -pipe
+GXX=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-g++
INFO: activate-gfortran_linux-64.sh made the following environmental changes:
+DEBUG_FFLAGS=-fopenmp -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -fopenmp -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fcheck=all -fbacktrace -fimplicit-none -fvar-tracking-assignments -pipe
+DEBUG_FORTRANFLAGS=-fopenmp -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -fopenmp -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fcheck=all -fbacktrace -fimplicit-none -fvar-tracking-assignments -pipe
+F77=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-gfortran
+F95=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-f95
+FC=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-gfortran
+FFLAGS=-fopenmp -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe
+FORTRANFLAGS=-fopenmp -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe
+GFORTRAN=/exports/csce/eddie/ph/groups/rbm_ml/michael/myconda/envs/apfeltest/bin/x86_64-conda_cos6-linux-gnu-gfortran
At line 60 of file Tabulation.f (unit = 5, file = 'stdin')
Fortran runtime error: End of file
Error termination. Backtrace:
#0 0x2b5c4ad45649 in list_formatted_read_scalar
at /opt/conda/conda-bld/compilers_linux-64_1520532893746/work/.build/src/gcc-7.2.0/libgfortran/io/list_read.c:2306
#1 0x2b5b9121cfc9 in ???
#2 0x2b5b9121cdc0 in ???
#3 0x2b5c4b48dc04 in ???
#4 0x2b5b9121cdf0 in ???
Welcome to
_/_/_/ _/_/_/_/ _/_/_/_/ _/_/_/_/ _/
_/ _/ _/ _/ _/ _/ _/
_/_/_/_/ _/_/_/_/ _/_/_/ _/_/_/ _/
_/ _/ _/ _/ _/ _/
_/ _/ _/ _/ _/_/_/_/ _/_/_/_/
_____v3.0.2 A PDF Evolution Library, arXiv:1310.1394
Authors: V. Bertone, S. Carrazza, J. Rojo
Report of the evolution parameters:
QCD evolution
Space-like evolution (PDFs)
Unpolarized evolution
Evolution scheme: VFNS at N2LO
Solution of the DGLAP equation: 'exactalpha' with maximum 6 active flavours
Solution of the coupling equations: 'exact' with maximum 6 active flavours
Coupling reference value:
- AlphaQCD( 1.4142 GeV) = 0.350000
Pole heavy quark masses:
- Mc = 1.4142 GeV
- Mb = 4.5000 GeV
- Mt = 175.0000 GeV
The matching thresholds coincide with the physical masses
muR / muF = 1.0000
Allowed evolution range [ 1.0000 : 10000.0000 ] GeV
The internal subgrids will be locked
Fast evolution enabled
Initialization of the evolution completed in 4.298 s
Enter initial and final scale in GeV^2
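The `End of file` error in this output is reported for unit 5 (stdin) at line 60 of Tabulation.f, where the example reads the two scales, so presumably nothing was supplied on stdin when it ran. A minimal Python sketch of feeding the scales on stdin instead; the helper name `run_with_scales` and the timeout value are illustrative assumptions, not part of APFEL.

```python
import subprocess


def run_with_scales(cmd, q02, q2, timeout=600):
    """Run `cmd`, passing the two scales on stdin as one whitespace-separated line.

    Returns (return code, stdout, stderr).
    """
    stdin_text = f"{q02} {q2}\n"
    result = subprocess.run(
        cmd,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr
```

From apfel/examples this would be called as `run_with_scales(["./Tabulation"], 1.0, 10.0)`; a list-directed Fortran read accepts the two numbers separated by a space.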
Yes, thanks. What is the available memory per slave? Can you set to 4Gb?
I actually requested 8Gb for that particular job
Perhaps I'm wrong, because the language they use is slightly different, but I don't think there is a flat answer to that, can you see this link? https://www.wiki.ed.ac.uk/display/ResearchServices/Memory+Specification If not it says that the cluster is comprised of a variety of nodes each with different numbers of cores/memory available, just from scanning the list the two most common setups are 16 cores 64Gb RAM and 16 cores 128Gb RAM... Does that help at all?
This looks like a genuine error. Could you try compiling apfel with export FFLAGS="$DEBUG_FFLAGS" so we can see the apfel part of the backtrace?
I think the error with the integration earlier may be because you have too few iterations, leading to non-smooth nucleons. But there seem to be enough issues that this needs some more debugging.
urm ok I think I did what you said? :
(apfeltest) [s1758208@login04(eddie) apfel]$ make
Making all in include
make[1]: Entering directory `/exports/eddie/scratch/s1758208/apfel/include'
Making all in APFEL
make[2]: Entering directory `/exports/eddie/scratch/s1758208/apfel/include/APFEL'
make all-am
make[3]: Entering directory `/exports/eddie/scratch/s1758208/apfel/include/APFEL'
make[3]: Leaving directory `/exports/eddie/scratch/s1758208/apfel/include/APFEL'
make[2]: Leaving directory `/exports/eddie/scratch/s1758208/apfel/include/APFEL'
make[2]: Entering directory `/exports/eddie/scratch/s1758208/apfel/include'
make[2]: Nothing to be done for `all-am'.
make[2]: Leaving directory `/exports/eddie/scratch/s1758208/apfel/include'
make[1]: Leaving directory `/exports/eddie/scratch/s1758208/apfel/include'
Making all in ccwrap
make[1]: Entering directory `/exports/eddie/scratch/s1758208/apfel/ccwrap'
F77 APFELfwevol.lo
APFELfwevol.f:30:14:
fxpdf = xPDF(i,x)
1
Error: Function 'xpdf' at (1) has no IMPLICIT type
APFELfwevol.f:38:16:
fxpdfxq = xPDFxQ(i,x,Q)
1
Error: Function 'xpdfxq' at (1) has no IMPLICIT type
APFELfwevol.f:46:15:
fxpdfj = xPDFj(i,x)
1
Error: Function 'xpdfj' at (1) has no IMPLICIT type
APFELfwevol.f:54:15:
fdxpdf = dxPDF(i,x)
1
Error: Function 'dxpdf' at (1) has no IMPLICIT type
APFELfwevol.f:61:16:
fxgamma = xgamma(x)
1
Error: Function 'xgamma' at (1) has no IMPLICIT type
APFELfwevol.f:68:17:
fxgammaj = xgammaj(x)
1
Error: Function 'xgammaj' at (1) has no IMPLICIT type
APFELfwevol.f:75:17:
fdxgamma = dxgamma(x)
1
Error: Function 'dxgamma' at (1) has no IMPLICIT type
APFELfwevol.f:95:17:
fxlepton = xLepton(i,x)
1
Error: Function 'xlepton' at (1) has no IMPLICIT type
APFELfwevol.f:103:18:
fxleptonj = xLeptonj(i,x)
1
Error: Function 'xleptonj' at (1) has no IMPLICIT type
APFELfwevol.f:154:18:
falphaqcd = AlphaQCD(Q)
1
Error: Function 'alphaqcd' at (1) has no IMPLICIT type
APFELfwevol.f:161:18:
falphaqed = AlphaQED(Q)
1
Error: Function 'alphaqed' at (1) has no IMPLICIT type
APFELfwevol.f:169:14:
fnpdf = NPDF(i,N)
1
Error: Function 'npdf' at (1) has no IMPLICIT type
APFELfwevol.f:177:16:
fngamma = Ngamma(N)
1
Error: Function 'ngamma' at (1) has no IMPLICIT type
APFELfwevol.f:185:14:
flumi = LUMI(i,j,S)
1
Error: Function 'lumi' at (1) has no IMPLICIT type
APFELfwevol.f:193:15:
fxgrid = xGrid(alpha)
1
Error: Function 'xgrid' at (1) has no IMPLICIT type
APFELfwevol.f:198:6:
function fnintervals()
1
Error: Function 'fnintervals' at (1) has no IMPLICIT type
APFELfwevol.f:268:24:
fheavyquarkmass = HeavyQuarkMass(i,Q)
1
Error: Function 'heavyquarkmass' at (1) has no IMPLICIT type
APFELfwevol.f:276:22:
fgetthreshold = GetThreshold(i)
1
Error: Function 'getthreshold' at (1) has no IMPLICIT type
APFELfwevol.f:284:29:
fheavyquarkthreshold = HeavyQuarkThreshold(i)
1
Error: Function 'heavyquarkthreshold' at (1) has no IMPLICIT type
make[1]: *** [APFELfwevol.lo] Error 1
make[1]: Leaving directory `/exports/eddie/scratch/s1758208/apfel/ccwrap'
make: *** [all-recursive] Error 1
If tabulation doesn't work then this has nothing to do with smooth initial pdfs.
I still think there is something funny with the compiler/memory of these machines. I will prepare a custom version of tabulation where we print the most relevant variables, and then we compare the output.
@wilsonmr Maybe remove -fno-implicit.
removed the closest thing to that -fimplicit-none
do you want the full output of compiling? It's very long..
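Removing a single flag from the debug flags can be done without editing the string by hand. The sketch below uses a shortened copy of the `DEBUG_FFLAGS` value printed by the activation script earlier in the thread; the helper name `drop_flags` is made up for illustration.

```python
import shlex


def drop_flags(flags, unwanted):
    """Return the flag string with every flag listed in `unwanted` removed."""
    kept = [flag for flag in shlex.split(flags) if flag not in unwanted]
    return " ".join(kept)


# Shortened stand-in; in practice this would be os.environ["DEBUG_FFLAGS"].
debug_fflags = (
    "-fopenmp -Og -g -Wall -Wextra -fcheck=all -fbacktrace "
    "-fimplicit-none -pipe"
)

print(drop_flags(debug_fflags, {"-fimplicit-none"}))
```

The printed string can then be exported as `FFLAGS` before re-running configure and make, which keeps `-g`, `-fcheck=all` and `-fbacktrace` while avoiding the `has no IMPLICIT type` errors.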
Ideally, we want the output of running.
Could you please modify Tabulation.f with the diff below, recompile and run:
diff --git a/examples/Tabulation.f b/examples/Tabulation.f
index 4d3416b..6af7006 100644
--- a/examples/Tabulation.f
+++ b/examples/Tabulation.f
@@ -57,7 +57,9 @@ c call SetMaxFlavourAlpha(5)
* Evolve PDFs on the grids
*
write(6,*) "Enter initial and final scale in GeV^2"
- read(5,*) Q02,Q2
+* read(5,*) Q02,Q2
+ Q02 = 1d0
+ Q2 = 10d0
*
Q0 = dsqrt(Q02) - eps
Q = dsqrt(Q2)
Hi, I did what you asked. I forgot to compile with debug flags, however I'm guessing there were no errors:
Welcome to
_/_/_/ _/_/_/_/ _/_/_/_/ _/_/_/_/ _/
_/ _/ _/ _/ _/ _/ _/
_/_/_/_/ _/_/_/_/ _/_/_/ _/_/_/ _/
_/ _/ _/ _/ _/ _/
_/ _/ _/ _/ _/_/_/_/ _/_/_/_/
_____v3.0.2 A PDF Evolution Library, arXiv:1310.1394
Authors: V. Bertone, S. Carrazza, J. Rojo
Report of the evolution parameters:
QCD evolution
Space-like evolution (PDFs)
Unpolarized evolution
Evolution scheme: VFNS at N2LO
Solution of the DGLAP equation: 'exactalpha' with maximum 6 active flavours
Solution of the coupling equations: 'exact' with maximum 6 active flavours
Coupling reference value:
- AlphaQCD( 1.4142 GeV) = 0.350000
Pole heavy quark masses:
- Mc = 1.4142 GeV
- Mb = 4.5000 GeV
- Mt = 175.0000 GeV
The matching thresholds coincide with the physical masses
muR / muF = 1.0000
Allowed evolution range [ 1.0000 : 10000.0000 ] GeV
The internal subgrids will be locked
Fast evolution enabled
Initialization of the evolution completed in 4.348 s
Enter initial and final scale in GeV^2
alpha_QCD(mu2F) = 0.24423490003380316
alpha_QED(mu2F) = 7.5400351612049015E-003
Standard evolution:
x u-ubar d-dbar 2(ubr+dbr) c+cbar gluon photon e^-+e^+ mu^-+mu^+ tau^-+tau^+
1.0E-05 1.8921E-03 1.2054E-03 1.2870E+01 3.1967E+00 5.2513E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
1.0E-04 8.6754E-03 5.2500E-03 7.1656E+00 1.6049E+00 3.0403E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
1.0E-03 4.0463E-02 2.3677E-02 3.7563E+00 7.1481E-01 1.5276E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
1.0E-02 1.8350E-01 1.0492E-01 1.7845E+00 2.5205E-01 6.0821E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
1.0E-01 5.8322E-01 2.9874E-01 4.4635E-01 3.5459E-02 1.2055E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
3.0E-01 4.8226E-01 1.8924E-01 5.4302E-02 3.0448E-03 1.6810E-01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
5.0E-01 2.0668E-01 5.7324E-02 4.5800E-03 2.3472E-04 2.0605E-02 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
7.0E-01 4.3973E-02 7.2552E-03 1.3288E-04 7.7072E-06 1.2059E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
9.0E-01 1.1975E-03 6.5373E-05 1.0148E-07 1.3555E-08 5.4731E-06 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
Standard evolution using the xPDFall function:
x u-ubar d-dbar 2(ubr+dbr) c+cbar gluon
1.0E-05 1.8921E-03 1.2054E-03 1.2870E+01 3.1967E+00 5.2513E+01
1.0E-04 8.6754E-03 5.2500E-03 7.1656E+00 1.6049E+00 3.0403E+01
1.0E-03 4.0463E-02 2.3677E-02 3.7563E+00 7.1481E-01 1.5276E+01
1.0E-02 1.8350E-01 1.0492E-01 1.7845E+00 2.5205E-01 6.0821E+00
1.0E-01 5.8322E-01 2.9874E-01 4.4635E-01 3.5459E-02 1.2055E+00
3.0E-01 4.8226E-01 1.8924E-01 5.4302E-02 3.0448E-03 1.6810E-01
5.0E-01 2.0668E-01 5.7324E-02 4.5800E-03 2.3472E-04 2.0605E-02
7.0E-01 4.3973E-02 7.2552E-03 1.3288E-04 7.7072E-06 1.2059E-03
9.0E-01 1.1975E-03 6.5373E-05 1.0148E-07 1.3555E-08 5.4731E-06
PDFs have been cached
Caching completed in 0.104 s
Cached evolution:
x u-ubar d-dbar 2(ubr+dbr) c+cbar gluon photon e^-+e^+ mu^-+mu^+ tau^-+tau^+
1.0E-05 1.8921E-03 1.2054E-03 1.2870E+01 3.1967E+00 5.2513E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
1.0E-04 8.6754E-03 5.2500E-03 7.1656E+00 1.6049E+00 3.0403E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
1.0E-03 4.0463E-02 2.3677E-02 3.7563E+00 7.1481E-01 1.5276E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
1.0E-02 1.8350E-01 1.0492E-01 1.7845E+00 2.5205E-01 6.0821E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
1.0E-01 5.8322E-01 2.9874E-01 4.4635E-01 3.5459E-02 1.2055E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
3.0E-01 4.8226E-01 1.8924E-01 5.4302E-02 3.0448E-03 1.6810E-01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
5.0E-01 2.0668E-01 5.7324E-02 4.5800E-03 2.3472E-04 2.0605E-02 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
7.0E-01 4.3973E-02 7.2552E-03 1.3289E-04 7.7072E-06 1.2059E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
9.0E-01 1.1975E-03 6.5373E-05 1.0137E-07 1.3522E-08 5.4731E-06 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00
Cached evolution using the xPDFxQall function:
x u-ubar d-dbar 2(ubr+dbr) c+cbar gluon
1.0E-05 1.8921E-03 1.2054E-03 1.2870E+01 3.1967E+00 5.2513E+01
1.0E-04 8.6754E-03 5.2500E-03 7.1656E+00 1.6049E+00 3.0403E+01
1.0E-03 4.0463E-02 2.3677E-02 3.7563E+00 7.1481E-01 1.5276E+01
1.0E-02 1.8350E-01 1.0492E-01 1.7845E+00 2.5205E-01 6.0821E+00
1.0E-01 5.8322E-01 2.9874E-01 4.4635E-01 3.5459E-02 1.2055E+00
3.0E-01 4.8226E-01 1.8924E-01 5.4302E-02 3.0448E-03 1.6810E-01
5.0E-01 2.0668E-01 5.7324E-02 4.5800E-03 2.3472E-04 2.0605E-02
7.0E-01 4.3973E-02 7.2552E-03 1.3289E-04 7.7072E-06 1.2059E-03
9.0E-01 1.1975E-03 6.5373E-05 1.0137E-07 1.3522E-08 5.4731E-06
Ok, let's do more tests. Could you please modify, recompile and rerun nnfit (with my previous card) after applying the changes below? (In principle this should work because it is almost identical to tabulation.)
diff --git a/nnpdfcpp/src/nnfit/src/apfelevol.cc b/nnpdfcpp/src/nnfit/src/apfelevol.cc
index e1a2cf5..bd617ee 100644
--- a/nnpdfcpp/src/nnfit/src/apfelevol.cc
+++ b/nnpdfcpp/src/nnfit/src/apfelevol.cc
@@ -38,6 +38,7 @@ APFELSingleton::APFELSingleton():
void APFELSingleton::Initialize(NNPDFSettings const& set, PDFSet *const& pdf)
{
+ /*
// Check APFEL
bool check = APFEL::CheckAPFEL();
if (check == false)
@@ -45,6 +46,7 @@ void APFELSingleton::Initialize(NNPDFSettings const& set, PDFSet *const& pdf)
std::cout << Colour::FG_RED << "[CheckAPFEL] ERROR, test not succeeded!" << std::endl;
std::exit(-1);
}
+ */
// initialize attributes
getInstance()->fPDF = pdf;
@@ -273,10 +275,11 @@ void APFELSingleton::Initialize(NNPDFSettings const& set, PDFSet *const& pdf)
APFEL::SetQLimits(getInstance()->fQ0, getInstance()->fQmax + 1E-5); // Epsilon for limits
APFEL::SetNumberOfGrids(1);
- APFEL::SetExternalGrid(1, 195, 5, X1);
+ //APFEL::SetExternalGrid(1, 195, 5, X1);
+ APFEL::SetGridParameters(1, 50, 5, 1e-10);
APFEL::LockGrids(true);
APFEL::SetPDFSet("external");
- APFEL::SetFastEvolution(false);
+ //APFEL::SetFastEvolution(false);
APFEL::InitializeAPFEL();
That seems to have outputted all of the expected files, would you like me to send you anything?
Good, please send the folder by mail. Could you please rerun with APFEL::SetFastEvolution(false); uncommented and, if it works, uncomment the CheckAPFEL block too? I would like to isolate the problem; I am pretty sure it is correlated to the external grid.
ok sure
Uncommenting APFEL::SetFastEvolution(false); appears to have introduced the error.
Good, could you please revert to the original version and just comment the fast evolution?
This is really weird, I'm getting sporadic seg faults.
OK, I think it was just the cluster not behaving correctly/me requesting too much memory per core. They appear to be running now.
Ok, when they are done please let me know if the fast evolution is the origin of the problem.
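To keep track of which combinations of the three changes in apfelevol.cc have been tried, the outcomes reported so far in this thread can be tabulated. This is only a bookkeeping sketch; the tuple ordering and the names are chosen here for illustration.

```python
from itertools import product

# Tuple order:
#   (CheckAPFEL block active,
#    SetExternalGrid used instead of SetGridParameters,
#    SetFastEvolution(false) called)
observed = {
    (True, True, True): "error",    # unmodified apfelevol.cc
    (False, False, False): "ok",    # all three changes from the diff applied
    (False, False, True): "error",  # SetFastEvolution(false) restored
}


def untested(observed):
    """Return the on/off combinations that have not been run yet."""
    return [
        combo
        for combo in product((True, False), repeat=3)
        if combo not in observed
    ]


for combo in untested(observed):
    print(combo)
```

The run requested last (original version with only the fast evolution line commented out) corresponds to `(True, True, False)`, which is still in the untested list.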
Hello,
I'm having an issue running fits on the cluster at Edinburgh: the fits appear to finish, however they are not outputting the LHAPDF grids, so we don't get the results.
@nhartland suggests it is a problem concerning APFEL.
I have pasted below the final bit of output from the fit, I have the full outputs available for a few different configurations of fits: