Open comane opened 5 months ago
Hi @comane, thanks for starting the PR. I am currently testing it with these datasets:
dataset_inputs:
- {dataset: NMCPD_dw_ite, frac: 0.75} # Old FK table
- {dataset: EIC_NC_EPD_88_PES, frac: 0.75} # New FK table
I am working with theory 270. I have manually added this FK table:
simunet-dev/share/NNPDF/data/theory_270/fastkernel/EIC_NC_EPD_88_PES.pineappl.lz4
and this compound file (using the old way):
simunet-dev/share/NNPDF/data/theory_270/compound/FK_EIC_NC_EPD_88_PES-COMPOUND.dat
Here is the content of the compound file:
# COMPOUND FK
FK: EIC_NC_EPD_88_PES
OP: NULL
When I run vp-setupfit, it seems to look for the wrong name:
(simunet-dev) ~/Projects/Low_E_PDF/low-energy/Fits/ - (main) > vp-setupfit test_simunet_EIC.yaml
[WARNING]: Output folder exists: /Users/eliehammou/Projects/Low_E_PDF/low-energy/Fits/test_simunet_EIC Overwriting contents
[WARNING]: Using q2min from runcard
[WARNING]: Using w2min from runcard
[ERROR]: Bad configuration encountered:
Incorrect COMPOUND file '/Users/eliehammou/miniconda3/envs/simunet-dev/share/NNPDF/data/theory_270/compound/FK_EIC_NC_EPD_88_PES-COMPOUND.dat'. Searching for non-existing FKTable:
Could not find FKTable for set '_NC_EPD_88'. File '/Users/eliehammou/miniconda3/envs/simunet-dev/share/NNPDF/data/theory_270/fastkernel/FK__NC_EPD_88.dat' not found
It looks like it is mangling both the prefix and the suffix. This is because the old format used the following naming convention for FK tables:
FK_EIC_NC_EPD_88_PES.dat
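For reference, the mangled name in the error is exactly what you get if a fixed 3-character prefix ("FK_") and a 4-character suffix (".dat") are sliced off the compound entry. The sketch below only illustrates that arithmetic; `strip_legacy_affixes` is a hypothetical helper, and I am assuming (not confirming) that the loader does something equivalent.

```python
def strip_legacy_affixes(entry: str) -> str:
    """Drop a 3-character prefix ("FK_") and a 4-character suffix (".dat").

    Hypothetical helper: it mimics what the error message suggests the
    loader does, not the loader's actual code.
    """
    return entry[3:-4]


# Entry as written in the compound file above (no "FK_", no ".dat"):
print(strip_legacy_affixes("EIC_NC_EPD_88_PES"))         # _NC_EPD_88
# Entry following the old naming convention:
print(strip_legacy_affixes("FK_EIC_NC_EPD_88_PES.dat"))  # EIC_NC_EPD_88_PES
```

If that assumption holds, writing the entry as `FK: FK_EIC_NC_EPD_88_PES.dat` in the compound file would yield the intended name, although the old loader would then still look for a `.dat` table rather than the pineappl one.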
For the record, is this PR relying on the old compound files to link commondata and FK tables or is it expecting the info to be stored in the yamldb folder of the theory, like nnpdf does currently?
dataset_inputs:
- {dataset: NMCPD_dw_ite, frac: 0.75} # Old FK table
- {dataset: EIC_NC_EPD_88_PES, frac: 0.75} # New FK table
Can you try adding the new_commondata: true flag to the dataset that uses the FK table in the pineappl format?
For the record, is this PR relying on the old compound files to link commondata and FK tables or is it expecting the info to be stored in the yamldb folder of the theory, like nnpdf does currently?
I don't think this PR supports compound files yet.
Sure thing.
I have just tried vp-setupfit with:
dataset_inputs:
- {dataset: NMCPD_dw_ite, frac: 0.75} # Old FK table
- {dataset: EIC_NC_EPD_88_PES, frac: 0.75, new_commondata: true} # New FK table
I have also removed the compound file I had initially added. It gives me the following error:
(simunet-dev) ~/Projects/Low_E_PDF/low-energy/Fits/ - (main) > vp-setupfit test_simunet_EIC.yaml
[WARNING]: Output folder exists: /Users/eliehammou/Projects/Low_E_PDF/low-energy/Fits/test_simunet_EIC Overwriting contents
[WARNING]: Using q2min from runcard
[WARNING]: Using w2min from runcard
[CRITICAL]: Bug in setup-fit ocurred. Please report it.
Traceback (most recent call last):
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/loader.py", line 405, in check_compound
with compound_spec_path.open() as f:
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/pathlib.py", line 1119, in open
return self._accessor.open(self, mode, buffering, encoding, errors,
FileNotFoundError: [Errno 2] No such file or directory: '/Users/eliehammou/miniconda3/envs/simunet-dev/share/NNPDF/data/theory_270/compound/FK_EIC_NC_EPD_88_PES-COMPOUND.dat'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/loader.py", line 590, in check_dataset
fkspec, op = self.check_compound(theoryno, name, cfac)
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/loader.py", line 412, in check_compound
raise CompoundNotFound(msg)
validphys.loader.CompoundNotFound: Could not find COMPOUND set 'EIC_NC_EPD_88_PES' for theory 270: [Errno 2] No such file or directory: '/Users/eliehammou/miniconda3/envs/simunet-dev/share/NNPDF/data/theory_270/compound/FK_EIC_NC_EPD_88_PES-COMPOUND.dat'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/eliehammou/Software/simunet_git/SIMUnet/n3fit/src/n3fit/scripts/vp_setupfit.py", line 197, in run
super().run()
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/app.py", line 158, in run
super().run()
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/app.py", line 358, in run
rb.resolve_fuzzytargets()
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 370, in resolve_fuzzytargets
self.resolve_fuzzytarget(target)
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 379, in resolve_fuzzytarget
self.process_targetspec(fuzzytarget.name, spec, fuzzytarget.extraargs)
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 388, in process_targetspec
gen.send(None)
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 450, in _process_requirement
yield from self._make_node(name, nsspec, extraargs, parents)
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 466, in _make_node
yield from self._make_callspec(f, name, nsspec, extraargs, parents)
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 499, in _make_callspec
index, _ = gen.send(None)
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 417, in _process_requirement
put_index, val = self.input_parser.resolve_key(name, ns, parents=parents, currspec=nsspec)
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/configparser.py", line 429, in resolve_key
return self._resolve_key(key=key, ns=ns, input_params=input_params,
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/configparser.py", line 491, in _resolve_key
val = produce_func(**kwargs)
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/config.py", line 1492, in produce_data
datasets.append(self.parse_from_(None, "dataset", write=False)[1])
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/configparser.py", line 133, in f_
return f(self, val, *args, **kwargs)
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/configparser.py", line 735, in parse_from_
return self.resolve_key(element, ns, input_params=input_params,
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/configparser.py", line 429, in resolve_key
return self._resolve_key(key=key, ns=ns, input_params=input_params,
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/configparser.py", line 491, in _resolve_key
val = produce_func(**kwargs)
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/config.py", line 754, in produce_dataset
ds = self.loader.check_dataset(
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/loader.py", line 592, in check_dataset
fkspec = self.check_fktable(theoryno, name, cfac, use_fixed_predictions=use_fixed_predictions, new_commondata=new_commondata)
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/loader.py", line 386, in check_fktable
with open(path_metadata, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/eliehammou/miniconda3/envs/simunet-dev/share/NNPDF/data/theory_270/fastkernel/EIC_NC_EPD_88_PES_metadata.yaml'
It complains about two missing files: the compound file and the metadata file. The missing metadata file makes sense, since I am using an old commondata implementation with a new FK table.
I will implement the metadata, or try a dataset that already has it, and come back to you. I am confused about the compound error though.
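To make the requirement explicit, here is a small pre-flight check for the two files the traceback above points at. It is only a sketch: `missing_new_format_files` is a hypothetical helper, and the file layout (`<name>.pineappl.lz4` and `<name>_metadata.yaml` side by side in the theory's `fastkernel` folder) is inferred from the paths in the error messages, not from the loader documentation.

```python
from pathlib import Path


def missing_new_format_files(data_root: Path, theory_id: int, dataset: str) -> list[Path]:
    """Return the new-format files that are absent for `dataset`.

    Assumed layout (taken from the error messages in this thread):
      <data_root>/theory_<id>/fastkernel/<dataset>.pineappl.lz4
      <data_root>/theory_<id>/fastkernel/<dataset>_metadata.yaml
    """
    fk_dir = Path(data_root) / f"theory_{theory_id}" / "fastkernel"
    expected = [
        fk_dir / f"{dataset}.pineappl.lz4",
        fk_dir / f"{dataset}_metadata.yaml",
    ]
    return [path for path in expected if not path.is_file()]
```

Calling it with the data directory of the environment, theory 270 and `EIC_NC_EPD_88_PES` should list the metadata file as missing in the situation described above.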
Hi @comane, I think I have found a bug: it appears that the new FK tables cannot be read if another dataset is being contaminated. For example, the following runcard works well:
dataset_inputs:
- {dataset: NMCPD_dw_ite, frac: 0.75} # Old FK table
- {dataset: EIC_CC_EMP_140_OPT, frac: 0.75, new_commondata: true} # New FK table
But if I add another dataset to be contaminated, the vp-setupfit step fails:
dataset_inputs:
- {dataset: NMCPD_dw_ite, frac: 0.75} # Old FK table
- {dataset: HLLHC_HMDY_NC_EL_FINAL, frac: 0.75, cfac: ['QCD', 'EWK'], contamination: 'EFT_LO'}
- {dataset: EIC_CC_EMP_140_OPT, frac: 0.75, new_commondata: true} # New FK table
I then get the following error:
(simunet-dev) ~/Projects/Low_E_PDF/low-energy/Fits/ - (main) > vp-setupfit test_simunet_EIC.yaml
[WARNING]: Output folder exists: /Users/eliehammou/Projects/Low_E_PDF/low-energy/Fits/test_simunet_EIC Overwriting contents
[WARNING]: Using q2min from runcard
[WARNING]: Using w2min from runcard
Using Keras backend
[INFO]: All requirements processed and checked successfully. Executing actions.
[WARNING]: Importing libNNPDF
[INFO]: Initialising RNG
- Random Generator allocated: ranlux
[INFO]: NNPDF40_nnlo_as_01180 T0 checked.
[INFO]: Verifying positivity tables:
[INFO]: POSF2U checked.
[INFO]: POSF2DW checked.
[INFO]: POSF2S checked.
[INFO]: POSFLL checked.
[INFO]: POSDYU checked.
[INFO]: POSDYD checked.
[INFO]: POSDYS checked.
[INFO]: POSF2C checked.
[INFO]: POSXUQ checked.
[INFO]: POSXUB checked.
[INFO]: POSXDQ checked.
[INFO]: POSXDB checked.
[INFO]: POSXSQ checked.
[INFO]: POSXSB checked.
[INFO]: POSXGL checked.
-- Generating closure data for DEUTERON
-- Generating replica data for DEUTERON
[WARNING]: Dataset output folder exists: /Users/eliehammou/Projects/Low_E_PDF/low-energy/Fits/test_simunet_EIC/filter/NMCPD_dw_ite Overwriting contents
[INFO]: 121/260 datapoints in NMCPD_dw_ite passed kinematic cuts.
-- Generating closure data for HLLHC
-- Generating replica data for HLLHC
[INFO]: 12/12 datapoints in HLLHC_HMDY_NC_EL_FINAL passed kinematic cuts.
[CRITICAL]: Bug in setup-fit ocurred. Please report it.
Traceback (most recent call last):
File "/Users/eliehammou/Software/simunet_git/SIMUnet/n3fit/src/n3fit/scripts/vp_setupfit.py", line 197, in run
super().run()
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/app.py", line 158, in run
super().run()
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/app.py", line 380, in run
rb.execute_sequential()
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 166, in execute_sequential
result = self.get_result(callspec.function,
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/reportengine/resourcebuilder.py", line 175, in get_result
fres = function(**kwdict)
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/filters.py", line 122, in filter_closure_data_by_experiment
return [
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/filters.py", line 123, in <listcomp>
_filter_closure_data(filter_path, exp, t0pdfset, fakenoise, errorsize)
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/filters.py", line 177, in _filter_closure_data
loaded_data = data.load.__wrapped__(data)
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/core.py", line 774, in load
loaded_data = dataset.load()
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/core.py", line 584, in load
fktable = p.load()
File "/Users/eliehammou/Software/simunet_git/SIMUnet/validphys2/src/validphys/core.py", line 702, in load
return FKTable(str(self.fkpath), [str(factor) for factor in self.cfactors])
File "/Users/eliehammou/miniconda3/envs/simunet-dev/lib/python3.10/site-packages/NNPDF/nnpdf.py", line 3042, in __init__
_nnpdf.FKTable_swiginit(self, _nnpdf.new_FKTable(*args))
RuntimeError: [utils] error: Could not open (PosixPath('/Users/eliehammou/miniconda3/envs/simunet-dev/share/NNPDF/data/theory_270/fastkernel/EIC_CC_EMP_140_OPT.pineappl.lz4'),)
I have similar problems with validphys runcards.
I have no idea what the problem could be, to be honest.
I think I understand the issue. The contamination itself is not the issue, the new FK tables do not work in a closure test.
For instance, this runcard triggers the error:
dataset_inputs:
- {dataset: NMCPD_dw_ite, frac: 0.75} # Old FK table
- {dataset: EIC_CC_EMP_140_OPT, frac: 0.75, new_commondata: true} # New FK table
###########################################################
# The closure test namespace tells us the settings for the
# (possibly contaminated) closure test.
############################################################
closuretest:
filterseed: 0 # Random seed to be used in filtering data partitions
fakedata: true # true = to use FAKEPDF to generate pseudo-data
fakepdf: NNPDF40_nnlo_as_01180 # Theory input for pseudo-data
errorsize: 1.0 # uncertainties rescaling
fakenoise: true # true = to add random fluctuations to pseudo-data
rancutprob: 1.0 # Fraction of data to be included in the fit
rancutmethod: 0 # Method to select rancutprob data fraction
rancuttrnval: false # 0(1) to output training(validation) chi2 in report
printpdf4gen: false # To print info on PDFs during minimization
# contamination_parameters:
# - name: 'W'
# value: 0.00008
# linear_combination:
# 'Olq3': -15.94
seed: 0
rngalgo: 0
The bug disappears if I comment out the closure test key.
I think I understand the issue. The contamination itself is not the issue, the new FK tables do not work in a closure test.
Yes, exactly. As I already noted in the description above, this PR does not yet support filtering closure test data when using the new pine parser. It is on the TODO list above. Thanks for pointing this out again.
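Until that TODO item is done, a runcard can be screened for the unsupported combination before calling vp-setupfit. This is a rough sketch operating on an already-parsed runcard (a plain dict); `closure_conflicts` is a hypothetical helper, and treating `closuretest` with `fakedata: true` as the trigger is my assumption, since the thread only shows that commenting out the whole `closuretest` key avoids the error.

```python
def closure_conflicts(runcard: dict) -> list[str]:
    """Names of new-format datasets that would go through closure filtering.

    Assumption: closure data are generated only when `closuretest.fakedata`
    is true; the thread only confirms that removing `closuretest` helps.
    """
    closure = runcard.get("closuretest") or {}
    if not closure.get("fakedata", False):
        return []
    return [
        entry["dataset"]
        for entry in runcard.get("dataset_inputs", [])
        if entry.get("new_commondata", False)
    ]


# Parsed form of the failing runcard quoted above (closure settings trimmed).
runcard = {
    "dataset_inputs": [
        {"dataset": "NMCPD_dw_ite", "frac": 0.75},
        {"dataset": "EIC_CC_EMP_140_OPT", "frac": 0.75, "new_commondata": True},
    ],
    "closuretest": {"fakedata": True, "fakepdf": "NNPDF40_nnlo_as_01180"},
}
print(closure_conflicts(runcard))  # ['EIC_CC_EMP_140_OPT']
```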
The scope of this PR is to allow SIMUnet to use new theories generated with pineappl.
Example of how to use this for the moment:
from theory_700/fastkernel copy NMC_NC_NOTFIXED_P_EM-SIGMARED.pineappl.lz4 into theory_270/fastkernel
from nnpdf_data/new_commondata/NMC_NC_NOTFIXED_P copy the metadata.yaml as NMC_NC_NOTFIXED_P_EM-SIGMARED_metadata.yaml into theory_700/fastkernel
copy DATA_NMC.dat into DATA_NMC_NC_NOTFIXED_P_EM-SIGMARED.dat within data/commondata
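The three copy steps above could be scripted roughly as follows. This is only a sketch: `stage_example_files` and its arguments are hypothetical, the FK table folder is spelled `fastkernel` as in the traceback paths in this thread, and the locations of the data directory and of `nnpdf_data/new_commondata` are left to the caller because they depend on the local installation.

```python
import shutil
from pathlib import Path


def stage_example_files(data_root, new_commondata_root, fk_dirname="fastkernel"):
    """Copy the example files as described in the steps above.

    data_root           -- folder holding theory_<id>/ and commondata/ (assumed layout)
    new_commondata_root -- local path to nnpdf_data/new_commondata
    """
    data_root = Path(data_root)
    new_commondata_root = Path(new_commondata_root)
    name = "NMC_NC_NOTFIXED_P_EM-SIGMARED"
    copies = [
        # FK table: theory_700 -> theory_270
        (data_root / "theory_700" / fk_dirname / f"{name}.pineappl.lz4",
         data_root / "theory_270" / fk_dirname / f"{name}.pineappl.lz4"),
        # Metadata: destination theory as written in the steps; the earlier
        # traceback shows the loader looking in the theory being used,
        # so this may need to be theory_270 instead.
        (new_commondata_root / "NMC_NC_NOTFIXED_P" / "metadata.yaml",
         data_root / "theory_700" / fk_dirname / f"{name}_metadata.yaml"),
        # Old commondata file under the new dataset name
        (data_root / "commondata" / "DATA_NMC.dat",
         data_root / "commondata" / f"DATA_{name}.dat"),
    ]
    for src, dst in copies:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
    return [dst for _, dst in copies]
```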
Now, it should be possible to run a fit with the following dataset_inputs:
TODO