NNPDF / nnpdf

An open-source machine learning framework for global analyses of parton distributions.
GNU General Public License v3.0

Have a fixed environment for paper #1126

Closed wilsonmr closed 3 years ago

wilsonmr commented 3 years ago

Minimally something like this:

conda activate <environment-name>
conda env export > <environment-name>.yml

# For another person to use the environment
conda env create -f <environment-name>.yml

But perhaps we also want to be able to quickly verify that whoever produced some results did indeed use the right settings. We could have an action that performs this check and outputs either a tag, which would be some kind of version identifier for the environment, or False if the check fails. A report could then display the tag, or we could save the tag in the folder when we upload to the server, so that it applies to fits, PDFs or reports.
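A minimal sketch of what such a check could look like, assuming the environment is described by the `name=version=build` lines that `conda list --export` prints. The function names, the 12-character tag length and the `environment_tag.txt` file name are all hypothetical; none of this exists in validphys.

```python
import hashlib
from pathlib import Path


def environment_tag(spec_lines):
    """Return a short, order-independent hash of an environment listing."""
    # Drop comments and blank lines, then sort so that ordering is irrelevant.
    specs = sorted(
        line.strip()
        for line in spec_lines
        if line.strip() and not line.lstrip().startswith("#")
    )
    digest = hashlib.sha256("\n".join(specs).encode("utf-8")).hexdigest()
    return digest[:12]


def check_environment(spec_lines, reference_lines):
    """Return the tag if both listings match exactly, otherwise False."""
    tag = environment_tag(spec_lines)
    return tag if tag == environment_tag(reference_lines) else False


def save_tag(output_dir, tag):
    """Write the tag next to the results (the file name is an assumption)."""
    path = Path(output_dir) / "environment_tag.txt"
    path.write_text(f"{tag}\n")
    return path
```

Hashing the sorted listing means any change in a version or build string gives a different tag, at the cost of not saying which package changed.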

Edit by @Zaharid : Please see this comment for instructions on installing the environment https://github.com/NNPDF/nnpdf/issues/1126#issuecomment-797117499

wilsonmr commented 3 years ago

cc: @scarrazza

wilsonmr commented 3 years ago

It's very easy to do the minimum thing and upload the yml to the wiki. I think some kind of verification on uploaded resources would also be nice.

wilsonmr commented 3 years ago

...Not that I don't trust that people would use the canonical environment but I'd like to make sure that I did and we could look back at old* results and see what the environment was at that point (if they used a conda installation)

*future old results

Zaharid commented 3 years ago

Right. I'd suggest waiting for 4.0 to build and see what we get on linux for conda create -n nndeploy nnpdf=4 python=3.7 and mandate that fits are done with that.

Zaharid commented 3 years ago

cc @enocera

Zaharid commented 3 years ago

Right, could anybody else please do a minimal test of the attached environment on linux (see @wilsonmr's instructions above)?

(renamed the thing as txt so it will let me upload it here)

nn4deploy.txt

AFAICT it installs cleanly and all tests pass.

cc @enocera @scarlehoff @scarrazza @tgiani

Edit: removed old environment.

Zaharid commented 3 years ago

As soon as I get some confirmation, I plan to send an email containing some RFC 2119 lingo:

So this would be a good time to voice any comments/concerns.

Zaharid commented 3 years ago

On a related topic, I'll try to dump the environment somewhere on the server (possibly on top of a container). I'd expect that to make it reproducible for as long as easy access to x86 linux exists, which I'd expect to be at least a century, or the end of civilization, whichever comes first.

wilsonmr commented 3 years ago

no new packages can be installed and no development versions can be used.

I had to break this rule in order to run the tests. But otherwise everything passes for me

Would be good to get some other confirmations though

short version:

$ conda env create -f nn4deploy.yml
(/scratch/conda_envs/nn4deploy) -bash-4.2$ conda install hypothesis pytest coverage pytest-mpl
Collecting package metadata: done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.6.14
  latest version: 4.9.2

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: /scratch/conda_envs/nn4deploy

  added / updated specs:
    - coverage
    - hypothesis
    - pytest
    - pytest-mpl

The following packages will be downloaded:

    package                    |            build
    hypothesis-6.7.0           |     pyhd3eb1b0_0         236 KB
    more-itertools-8.7.0       |     pyhd3eb1b0_0          39 KB
    pytest-6.2.2               |   py37h06a4308_2         453 KB
                                           Total:         728 KB

The following NEW packages will be INSTALLED:

  hypothesis         pkgs/main/noarch::hypothesis-6.7.0-pyhd3eb1b0_0
  importlib_metadata pkgs/main/noarch::importlib_metadata-2.0.0-1
  iniconfig          pkgs/main/noarch::iniconfig-1.1.1-pyhd3eb1b0_0
  more-itertools     pkgs/main/noarch::more-itertools-8.7.0-pyhd3eb1b0_0
  pluggy             pkgs/main/linux-64::pluggy-0.13.1-py37_0
  py                 pkgs/main/noarch::py-1.10.0-pyhd3eb1b0_0
  pytest             pkgs/main/linux-64::pytest-6.2.2-py37h06a4308_2
  pytest-mpl         conda-forge/noarch::pytest-mpl-0.12-pyhd3deb0d_0
  sortedcontainers   pkgs/main/noarch::sortedcontainers-2.3.0-pyhd3eb1b0_0
  toml               pkgs/main/noarch::toml-0.10.1-py_0

Proceed ([y]/n)? y
(/scratch/conda_envs/nn4deploy) -bash-4.2$ pytest --pyargs validphys
-- Docs: https://docs.pytest.org/en/stable/warnings.html
================= 95 passed, 18 warnings in 207.85s (0:03:27) ==================
(/scratch/conda_envs/nn4deploy) -bash-4.2$ pytest --pyargs n3fit
============ 38 passed, 1 skipped, 2 warnings in 168.82s (0:02:48) =============
Zaharid commented 3 years ago

no new packages can be installed and no development versions can be used.

I had to break this rule in order to run the tests. But otherwise everything passes for me

Same, was wondering if someone would notice that. But then again this is applicable to production environments.
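The rule quoted above could in principle be checked mechanically by comparing the pinned listing with what is actually installed. The following is only a sketch under the assumption that both sides are available as `name=version=build` lines (the format used by `conda list --export`); `parse_specs` and `environment_drift` are hypothetical helper names.

```python
def parse_specs(spec_lines):
    """Map package name -> (version, build) from 'name=version=build' lines."""
    packages = {}
    for line in spec_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, rest = line.partition("=")
        version, _, build = rest.partition("=")
        packages[name] = (version, build)
    return packages


def environment_drift(pinned_lines, actual_lines):
    """Report packages that were added, removed or changed versus the pin."""
    pinned = parse_specs(pinned_lines)
    actual = parse_specs(actual_lines)
    return {
        "added": sorted(set(actual) - set(pinned)),
        "removed": sorted(set(pinned) - set(actual)),
        "changed": sorted(
            name
            for name in set(pinned) & set(actual)
            if pinned[name] != actual[name]
        ),
    }
```

For a production environment all three lists would be expected to be empty; installing pytest and hypothesis to run the tests, as done above, would show up under "added".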

tgiani commented 3 years ago

OK, for me it failed, but I guess it's my bad. I'll do what the error message says and try again.

(nn4deploy) tommy@tommy-XPS-13-9380:~$ pytest --pyargs validphys
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.7.10, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
Matplotlib: 3.3.4
Freetype: 2.10.4
rootdir: /home/tommy
plugins: mpl-0.12, hypothesis-6.7.0
collected 95 items                                                                                                                                                                                         

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_arclengths.py ..                                                                                                          [  2%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_calcutils.py .                                                                                                            [  3%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_closuretest.py ..                                                                                                         [  5%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_commondataparser.py ..                                                                                                    [  7%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_covmatreg.py .....                                                                                                        [ 12%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_covmats.py ............                                                                                                   [ 25%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_cuts.py .                                                                                                                 [ 26%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_effexponents.py .                                                                                                         [ 27%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_filter_rules.py ....                                                                                                      [ 31%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_fitdata.py .                                                                                                              [ 32%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_fitveto.py ..                                                                                                             [ 34%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_loader.py .                                                                                                               [ 35%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_metaexps.py .                                                                                                             [ 36%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_plots.py ...                                                                                                              [ 40%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_postfit.py .                                                                                                              [ 41%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_pseudodata.py .....                                                                                                       [ 46%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_pyfkdata.py ....                                                                                                          [ 50%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_pythonmakereplica.py ..................                                                                                   [ 69%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_rebuilddata.py .                                                                                                          [ 70%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_regressions.py ...........                                                                                                [ 82%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_tableloader.py .F                                                                                                         [ 84%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_theorydbutils.py ...                                                                                                      [ 87%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_totalchi2.py ..                                                                                                           [ 89%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_utils.py .                                                                                                                [ 90%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_vplistscript.py ......                                                                                                    [ 96%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_weights.py ...                                                                                                            [100%]

================================================================================================= FAILURES =================================================================================================
___________________________________________________________________________________________ test_extrasum_slice ____________________________________________________________________________________________

args = ('ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv',), kwargs = {}, saved_exception = LoadFailedError("Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'")

    def f(*args, **kwargs):
        try:
            return orig(*args, **kwargs)
        except LoadFailedError as e:
            saved_exception = e
            log.info("Could not find a resource "
                f"({resource}): {saved_exception}. "
                f"Attempting to download it.")
            try:
>               download(*args, **kwargs)

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <validphys.loader.FallbackLoader object at 0x7fdcc4428290>, filename = PosixPath('ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'), kwargs = {}, root_url = 'https://vp.nnpdf.science/'
url = 'https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'

    def download_vp_output_file(self, filename, **kwargs):
        try:
            root_url = self.nnprofile['reports_root_url']
        except KeyError as e:
            raise LoadFailedError('Key report_root_url not found in nnprofile')
        try:
            url = root_url  + filename
        except Exception as e:
            raise LoadFailedError(e) from e
        try:
            filename = pathlib.Path(filename)

>           download_file(url, self._vp_cache()/filename, make_parents=True)

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

url = 'https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'
stream_or_path = PosixPath('/home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'), make_parents = True

    def download_file(url, stream_or_path, make_parents=False):
        """Download a file and show a progress bar if the INFO log level is
        enabled. If ``make_parents`` is ``True`` ``stream_or_path``
        is path-like, all the parent folders will
        be created."""
        #There is a bug in CERN's
        #Apache that incorrectly sets the Content-Encodig header to gzip, even
        #though it doesn't compress two times.
        # See: http://mail-archives.apache.org/mod_mbox/httpd-dev/200207.mbox/%3C3D2D4E76.4010502@talex.com.pl%3E
        # and e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=610679#c30
        #If it looks like the url is already encoded, we do not request
        #it to be compressed
        headers = {}
        if mimetypes.guess_type(url)[1] is not None:
            headers['Accept-Encoding'] = None

        response = requests.get(url, stream=True, headers=headers)

>       response.raise_for_status()

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <Response [401]>

    def raise_for_status(self):
        """Raises :class:`HTTPError`, if one occurred."""

        http_error_msg = ''
        if isinstance(self.reason, bytes):
            # We attempt to decode utf-8 first because some servers
            # choose to localize their reason strings. If the string
            # isn't utf-8, we fall back to iso-8859-1 for all other
            # encodings. (See PR #3538)
            try:
                reason = self.reason.decode('utf-8')
            except UnicodeDecodeError:
                reason = self.reason.decode('iso-8859-1')
        else:
            reason = self.reason

        if 400 <= self.status_code < 500:
            http_error_msg = u'%s Client Error: %s for url: %s' % (self.status_code, reason, self.url)

        elif 500 <= self.status_code < 600:
            http_error_msg = u'%s Server Error: %s for url: %s' % (self.status_code, reason, self.url)

        if http_error_msg:
>           raise HTTPError(http_error_msg, response=self)
E           requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/requests/models.py:943: HTTPError

The above exception was the direct cause of the following exception:

    def test_extrasum_slice():
        l = Loader()
>       f =  l.check_vp_output_file('ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv')

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:918: in f
    raise saved_exception from e
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:902: in f
    return orig(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <validphys.loader.FallbackLoader object at 0x7fdcc4428290>, filename = 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'
extra_paths = ('.', PosixPath('/home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache'))

    def check_vp_output_file(self, filename, extra_paths=('.',)):
        """Find a file in the vp-cache folder, or (with higher priority) in
        the ``extra_paths``."""
        try:
            vpcache = self._vp_cache()
        except KeyError as e:
            log.warning("Entry validphys_cache_path expected but not found "
                     "in the nnprofile.")
        else:
            extra_paths = (*extra_paths, vpcache)

        finder = filefinder.FallbackFinder(extra_paths)
        try:
            path, name = finder.find(filename)
        except FileNotFoundError as e:
>           raise LoadFailedError(f"Could not find '{filename}'") from e
E           validphys.loader.LoadFailedError: Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:520: LoadFailedError
------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------
[INFO]: Creating validphys cache directory: /home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache
[INFO]: Creating validphys cache directory: /home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache
[INFO]: Creating validphys cache directory: /home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache
[INFO]: Could not find a resource (vp_output_file): Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'. Attempting to download it.
[INFO]: Could not find a resource (vp_output_file): Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'. Attempting to download it.
[INFO]: Could not find a resource (vp_output_file): Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'. Attempting to download it.
[ERROR]: Could not access the validphys reports page because the authentification is not provided. Please, update your ~/.netrc file to contain the following:

machine vp.nnpdf.science
    login nnpdf
    password <PASSWORD>

[ERROR]: Could not access the validphys reports page because the authentification is not provided. Please, update your ~/.netrc file to contain the following:

machine vp.nnpdf.science
    login nnpdf
    password <PASSWORD>

[ERROR]: Could not access the validphys reports page because the authentification is not provided. Please, update your ~/.netrc file to contain the following:

machine vp.nnpdf.science
    login nnpdf
    password <PASSWORD>

[ERROR]: There was a problem with the connection: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv
[ERROR]: There was a problem with the connection: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv
[ERROR]: There was a problem with the connection: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv
-------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------
INFO     validphys.loader:loader.py:109 Creating validphys cache directory: /home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache
INFO     validphys.loader:loader.py:905 Could not find a resource (vp_output_file): Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'. Attempting to download it.
ERROR    validphys.loader:loader.py:879 Could not access the validphys reports page because the authentification is not provided. Please, update your ~/.netrc file to contain the following:

machine vp.nnpdf.science
    login nnpdf
    password <PASSWORD>

ERROR    validphys.loader:loader.py:917 There was a problem with the connection: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv
============================================================================================= warnings summary =============================================================================================
  /home/tommy/miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_closuretest.py:12: PytestCollectionWarning: cannot collect test class 'TestResult' because it has a __init__ constructor (from: miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_closuretest.py)
    class TestResult:

  /home/tommy/miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tableloader.py:89: FutureWarning: inplace is deprecated and will be removed in a future version.
    cols.set_levels(new_levels, inplace=True, level=0)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
========================================================================================= short test summary info ==========================================================================================
FAILED miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_tableloader.py::test_extrasum_slice - validphys.loader.LoadFailedError: Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables...
=========================================================================== 1 failed, 94 passed, 3 warnings in 196.76s (0:03:16) ===========================================================================
Thanks for using LHAPDF 6.3.0. Please make sure to cite the paper:
  Eur.Phys.J. C75 (2015) 3, 132  (http://arxiv.org/abs/1412.7420)
tgiani commented 3 years ago

works for me. For validphys I get

=============================================================================== 95 passed, 11 warnings in 173.22s (0:02:53) ================================================================================

and for n3fit

========================================================================== 38 passed, 1 skipped, 2 warnings in 275.25s (0:04:35) ===========================================================================
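The earlier 401 failure came from missing credentials for vp.nnpdf.science in `~/.netrc`. A quick way to check for that before running the tests could look like the sketch below, which uses the standard-library `netrc` module; `has_credentials` is a hypothetical helper and not part of validphys.

```python
import netrc
from pathlib import Path


def has_credentials(host, netrc_path=None):
    """Return True if the netrc file holds a login and password for host."""
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    if not path.exists():
        return False
    try:
        # authenticators() returns (login, account, password) or None.
        entry = netrc.netrc(str(path)).authenticators(host)
    except netrc.NetrcParseError:
        return False
    return entry is not None and bool(entry[0]) and bool(entry[2])
```

Calling `has_credentials("vp.nnpdf.science")` only confirms that an entry exists; it does not verify that the password is the right one.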
scarlehoff commented 3 years ago

Works also for me.

scarlehoff commented 3 years ago

That's for TF 2.2.

That said, inspecting the environment I see that mkl is (once again) the default and that the version has been bumped to 2.4.1 as recently as 3 days ago. Before running any final fits let me ensure that the eigen and the mkl version are doing the same thing this time.

Edit: even if they work the same I would be more comfortable at this point with the eigen build.

RoyStegeman commented 3 years ago

Yes that's why I removed it, since that link I gave wasn't any good. But looking at the dependencies of the tf 2.4.1 pypi package, it also shows gast==0.3.3 as a requirement.


scarlehoff commented 3 years ago

It seems this MKL version has the famous memory growth bug, so at least that needs to be changed, sigh.

It seems there were a few packages in that situation for 2.3. It would be nice of the Conda maintainers to test the effect of their choices: https://github.com/AnacondaRecipes/tensorflow_recipes/issues/27 ...

Personally I would prefer to fix the version of all packages to the ones Roy linked.

RoyStegeman commented 3 years ago

It seems this MKL version has the famous memory growth bug, so at least that needs to be changed, sigh.

Conda also provides the eigen version, so that could just be changed in the environment file.

scarlehoff commented 3 years ago


Could you create the environment with:

tensorflow==2.4.1=eigen_py37h3da6045_0 gast==0.3.3 opt_einsum==3.3.0 # this one is from conda forge

Sorry I could only test it now!

Zaharid commented 3 years ago

I thought we had TF eigen?! Should have looked into it.

scarlehoff commented 3 years ago

No, at some point eigen was given priority and I asked whether it would be like that also in the future and mistakenly took the lack of response as a confirmation. My bad. I guess if we don't pin it the version is basically random.
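Since the variant is not guaranteed unless it is pinned, one could check which build ended up in an exported environment. This is a rough sketch that assumes the variant (for example `eigen` or `mkl`) is the leading token of the conda build string, as in `tensorflow=2.3.0=eigen_py37h189e6a2_0`; the helper names are hypothetical.

```python
import re

# Matches 'name=version=build' and also 'name==version=build'.
SPEC_RE = re.compile(
    r"^(?P<name>[^=\s]+)=+(?P<version>[^=\s]+)=(?P<build>[^=\s]+)$"
)


def build_variant(spec):
    """Return the leading token of the build string of a fully pinned spec."""
    match = SPEC_RE.match(spec.strip())
    if match is None:
        raise ValueError(f"not a fully pinned spec: {spec!r}")
    return match.group("build").split("_", 1)[0]


def require_variant(spec_lines, package, wanted):
    """Return True if `package` is listed and its build variant is `wanted`."""
    for line in spec_lines:
        # The trailing '=' avoids matching e.g. 'tensorflow-base'.
        if line.strip().startswith(package + "="):
            return build_variant(line) == wanted
    return False
```

For packages whose build string carries no variant, the leading token is just the Python tag and hash, so the result is only meaningful for packages like tensorflow or blas.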

Zaharid commented 3 years ago

Could someone make a PR adding the right constraints so it is not possible to get broken packages?

RoyStegeman commented 3 years ago

Could you create the environment with:

tensorflow==2.4.1=eigen_py37h3da6045_0 gast==0.3.3 opt_einsum==3.3.0 # this one is from conda forge

Won't conda complain if we try to create an env with both tensorflow==2.4.1 and gast==0.3.3? Until now I used the pip version of tensorflow inside the conda env, but in this case that's not an ideal solution.

scarlehoff commented 3 years ago

Don't know. Travis will tell me https://github.com/NNPDF/nnpdf/pull/1143

Zaharid commented 3 years ago

@RoyStegeman why do you need that specific version of gast? Conda doesn't seem to like it...

scarlehoff commented 3 years ago

TensorFlow asks for it and looking through their commits it seems that using a newer version of gast does require some changes... they might not change anything for us but if it can be pinned it would be best.

RoyStegeman commented 3 years ago

Yes exactly what Juan says, tf 2.4.1 asks for that specific version of gast. I don't know if having a different version will affect the behaviour in our case, but because I don't know I would prefer the version that tensorflow asks.

Zaharid commented 3 years ago

@scarlehoff @RoyStegeman so just to be clear, you are claiming that the conda metadata is wrong in giving me gast 0.4.0?

RoyStegeman commented 3 years ago

Yes, it's nicely listed in the issue that Juan shared: https://github.com/AnacondaRecipes/tensorflow_recipes/issues/27

The pip and conda dependencies are conflicting. Of course that's about tf2.3 but the issue still exists for tf2.4.1.

scarlehoff commented 3 years ago

@scarlehoff @RoyStegeman so just to be clear, you are claiming that the conda metadata is wrong in giving me gast 0.4.0?

Rather, that the conda maintainers have decided to remove all pinnings in the last few days. Maybe they have tested everything, but I don't trust them: https://github.com/AnacondaRecipes/tensorflow_recipes/commit/08852b432c586d5539f40c8a3dff443fc55e022e

Zaharid commented 3 years ago

Well, to be fair, the pinnings of the pip version are crazy (pinning scipy down to the minor version?!). Do you have positive evidence that these are wrong?

scarlehoff commented 3 years ago

Scipy is for tests. But it is pinned only to the major version if I understood it correctly.

Zaharid commented 3 years ago

I am looking at this thing https://github.com/AnacondaRecipes/tensorflow_recipes/issues/27

RoyStegeman commented 3 years ago

Oh right, no, in reality for scipy it's not == but ~=: https://github.com/tensorflow/tensorflow/blob/85c8b2a817f95a3e979ecd1ed95bff1dc1335cff/tensorflow/tools/pip_package/setup.py#L129

We use the ~= syntax to pin packages to the latest major.minor release accepting all other patches on top of that.
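The `~=` operator is the "compatible release" clause of PEP 440. Below is a simplified sketch of what it accepts, assuming purely numeric versions (no pre-releases, epochs or local labels). `is_compatible` is a hypothetical helper; for real use, `packaging.specifiers` handles the full specification.

```python
def release_tuple(version):
    """Turn '1.4.1' into (1, 4, 1); only numeric segments are supported."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(candidate, spec):
    """Simplified check of whether `candidate` satisfies '~=spec'."""
    base = release_tuple(spec)
    cand = release_tuple(candidate)
    if len(base) < 2:
        raise ValueError("'~=' needs at least two release segments")
    # Pad with zeros so that 1.5 and 1.5.0 compare as equal.
    width = max(len(base), len(cand))
    base_padded = base + (0,) * (width - len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    # Everything except the last segment of the spec must match exactly,
    # and the candidate must not be older than the spec.
    prefix = base[:-1]
    return cand_padded >= base_padded and cand_padded[: len(prefix)] == prefix
```

With a three-segment spec such as `~=1.4.1`, only the patch number may increase; with a two-segment spec such as `~=1.4`, the minor number may increase as well.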

Zaharid commented 3 years ago

Right, I see. Hmm, given the alternatives, my preferred solution would be to just ignore that requirement and use the conda version of gast. Is that known to actually break something we use? Also it is curious that this one commit https://github.com/serge-sans-paille/gast/commit/8d3c5de1064ab59a884e830752c71655b60a6d30 in an 80 commit project can cause so much havoc apparently. Not clear to me what "breaking the ecosystem" means. Is that something other than some project that google abandoned so they could go and play foosball?

Zaharid commented 3 years ago

Also sorry that I didn't wait for explicit confirmation from any of you.

RoyStegeman commented 3 years ago

Well, based on Juan's PR, conda's tensorflow 2.3.0 now does accept gast==0.3.3, so I guess that issue we linked is outdated. It's probably still worth checking whether any other packages disagree between conda and pip when building the environment with the version settings of that PR, but at least the gast conflict is solved.

Zaharid commented 3 years ago

Save the environment below as env.yaml (note that it malfunctions if the extension is different from yaml):

name: nn4deploy
channels:
  - https://packages.nnpdf.science/private
  - https://packages.nnpdf.science/public
  - defaults
  - conda-forge
dependencies:
  - _libgcc_mutex=0.1=main
  - _tflow_select=2.3.0=eigen
  - absl-py=0.12.0=py37h06a4308_0
  - aiohttp=3.7.4=py37h27cfd23_1
  - alabaster=0.7.12=py37_0
  - apfel=
  - astunparse=1.6.3=py_0
  - async-timeout=3.0.1=py37h06a4308_0
  - attrs=20.3.0=pyhd3eb1b0_0
  - babel=2.9.0=pyhd3eb1b0_0
  - blas=1.0=mkl
  - blessings=1.7=py37h06a4308_1002
  - blinker=1.4=py37h06a4308_0
  - brotlipy=0.7.0=py37h27cfd23_1003
  - bzip2=1.0.8=h7b6447c_0
  - c-ares=1.17.1=h27cfd23_0
  - ca-certificates=2021.4.13=h06a4308_1
  - cachetools=4.2.1=pyhd3eb1b0_0
  - certifi=2020.12.5=py37h06a4308_0
  - cffi=1.14.5=py37h261ae71_0
  - chardet=3.0.4=py37h06a4308_1003
  - click=7.1.2=pyhd3eb1b0_0
  - cloudpickle=1.6.0=py_0
  - colorama=0.4.4=pyhd3eb1b0_0
  - commonmark=0.9.1=py_0
  - coverage=5.5=py37h27cfd23_2
  - cryptography=3.3.1=py37h3c74f83_1
  - curio=0.9+git.49=py37_0
  - cycler=0.10.0=py37_0
  - cython=0.29.22=py37h2531618_0
  - dbus=1.13.18=hb2f20db_0
  - decorator=4.4.2=pyhd3eb1b0_0
  - docutils=0.16=py37_1
  - expat=2.2.10=he6710b0_2
  - fontconfig=2.13.1=h6c09931_0
  - freetype=2.10.4=h5ab3b9f_0
  - future=0.18.2=py37_1
  - gast=0.3.3=py_0
  - glib=2.67.4=h36276a3_1
  - google-auth=1.27.1=pyhd3eb1b0_0
  - google-auth-oauthlib=0.4.3=pyhd3eb1b0_0
  - google-pasta=0.2.0=py_0
  - grpcio=1.36.1=py37h2157cd5_1
  - gsl=2.4=h14c3975_4
  - gst-plugins-base=1.14.0=h8213a91_2
  - gstreamer=1.14.0=h28cd5cc_2
  - h5py=2.10.0=py37hd6299e0_1
  - hdf5=1.10.6=hb1b8bf9_0
  - hyperopt=0.2.5=pyh9f0ad1d_0
  - icu=58.2=he6710b0_3
  - idna=2.10=pyhd3eb1b0_0
  - imagesize=1.2.0=pyhd3eb1b0_0
  - importlib-metadata=2.0.0=py_1
  - intel-openmp=2020.2=254
  - jinja2=2.11.3=pyhd3eb1b0_0
  - jpeg=9b=h024ee3a_2
  - keras-preprocessing=1.1.2=pyhd3eb1b0_0
  - kiwisolver=1.3.1=py37h2531618_0
  - lcms2=2.11=h396b838_0
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - lhapdf=6.3.0=py37h6bb024c_1
  - libarchive=3.4.2=h62408e4_0
  - libedit=3.1.20191231=h14c3975_1
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.37=hbc83047_0
  - libprotobuf=3.14.0=h8c45485_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - libtiff=4.1.0=h2733197_1
  - libuuid=1.0.3=h1bed415_2
  - libxcb=1.14=h7b6447c_0
  - libxml2=2.9.10=hb55368b_3
  - lz4-c=1.9.3=h2531618_0
  - markdown=3.3.4=py37h06a4308_0
  - markupsafe=1.1.1=py37h14c3975_1
  - matplotlib=3.3.4=py37h06a4308_0
  - matplotlib-base=3.3.4=py37h62a2d02_0
  - mkl=2020.2=256
  - mkl-service=2.3.0=py37he8ac12f_0
  - mkl_fft=1.3.0=py37h54f3939_0
  - mkl_random=1.1.1=py37h0573a6f_0
  - multidict=5.1.0=py37h27cfd23_2
  - ncurses=6.2=he6710b0_1
  - networkx=2.5=py_0
  - nnpdf=
  - numpy=1.18.5=py37ha1c710e_0
  - numpy-base=1.18.5=py37hde5b4d6_0
  - oauthlib=3.1.0=py_0
  - olefile=0.46=py37_0
  - openssl=1.1.1k=h27cfd23_0
  - opt_einsum=3.1.0=py_0
  - packaging=20.9=pyhd3eb1b0_0
  - pandas=1.2.3=py37ha9443f7_0
  - pandoc=2.11=hb0f4dca_0
  - pcre=8.44=he6710b0_0
  - pillow=8.1.2=py37he98fc37_0
  - pip=21.0.1=py37h06a4308_0
  - pkg-config=0.29.2=h1bed415_8
  - prompt-toolkit=3.0.8=py_0
  - prompt_toolkit=3.0.8=0
  - protobuf=3.14.0=py37h2531618_1
  - psutil=5.8.0=py37h27cfd23_1
  - pyasn1=0.4.8=py_0
  - pyasn1-modules=0.2.8=py_0
  - pycparser=2.20=py_2
  - pygments=2.8.1=pyhd3eb1b0_0
  - pyjwt=1.7.1=py37_0
  - pymongo=3.11.3=py37h2531618_0
  - pyopenssl=20.0.1=pyhd3eb1b0_1
  - pyparsing=2.4.7=pyhd3eb1b0_0
  - pyqt=5.9.2=py37h05f1152_2
  - pysocks=1.7.1=py37_1
  - python=3.7.10=hdb3f193_0
  - python-dateutil=2.8.1=pyhd3eb1b0_0
  - pytz=2021.1=pyhd3eb1b0_0
  - qt=5.9.7=h5867ecd_1
  - readline=8.1=h27cfd23_0
  - recommonmark=0.6.0=py_0
  - reportengine=0.30.1=py_0
  - requests=2.25.1=pyhd3eb1b0_0
  - requests-oauthlib=1.3.0=py_0
  - rsa=4.7.2=pyhd3eb1b0_1
  - ruamel_yaml=0.15.87=py37h7b6447c_1
  - scipy=1.4.1=py37h0b6359f_0
  - seaborn=0.11.1=pyhd3eb1b0_0
  - setuptools=52.0.0=py37h06a4308_0
  - sip=4.19.8=py37hf484d3e_0
  - six=1.15.0=py37h06a4308_0
  - snowballstemmer=2.1.0=pyhd3eb1b0_0
  - sphinx=3.5.2=pyhd3eb1b0_0
  - sphinx_rtd_theme=0.5.1=pyhd3deb0d_0
  - sphinxcontrib-applehelp=1.0.2=pyhd3eb1b0_0
  - sphinxcontrib-devhelp=1.0.2=pyhd3eb1b0_0
  - sphinxcontrib-htmlhelp=1.0.3=pyhd3eb1b0_0
  - sphinxcontrib-jsmath=1.0.1=pyhd3eb1b0_0
  - sphinxcontrib-qthelp=1.0.3=pyhd3eb1b0_0
  - sphinxcontrib-serializinghtml=1.1.4=pyhd3eb1b0_0
  - sqlite=3.35.4=hdfb4753_0
  - tensorboard=2.4.0=pyhc547734_0
  - tensorboard-plugin-wit=1.6.0=py_0
  - tensorflow=2.3.0=eigen_py37h189e6a2_0
  - tensorflow-base=2.3.0=eigen_py37h3b305d7_0
  - tensorflow-estimator=2.3.0=pyheb71bc4_0
  - termcolor=1.1.0=py37h06a4308_1
  - tk=8.6.10=hbc83047_0
  - tornado=6.1=py37h27cfd23_0
  - tqdm=4.56.0=pyhd3eb1b0_0
  - typing-extensions=
  - typing_extensions=
  - urllib3=1.26.3=pyhd3eb1b0_0
  - wcwidth=0.2.5=py_0
  - werkzeug=1.0.1=pyhd3eb1b0_0
  - wheel=0.36.2=pyhd3eb1b0_0
  - wrapt=1.12.1=py37h7b6447c_1
  - xz=5.2.5=h7b6447c_0
  - yaml=0.2.5=h7b6447c_0
  - yaml-cpp=0.6.0=h6bb024c_4
  - yarl=1.6.3=py37h27cfd23_0
  - zipp=3.4.0=pyhd3eb1b0_0
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.4.5=h9ceee32_0
  - pip:
    - validphys==4.0
prefix: /home/zah/anaconda3/envs/nn4deploy

and install with

conda env create --force --file env.yaml

Run fits with the environment activated (i.e. conda activate nn4deploy).
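
Before creating the environment it may be worth checking that every conda entry in the file is fully pinned, since a line such as `nnpdf=` with nothing after the equals sign leaves the version open. The following is a minimal sketch, assuming entries have the `name=version=build` form produced by `conda env export`; `parse_pin` and `unpinned` are hypothetical helpers, not part of conda or nnpdf, and the nested `pip:` entries (which use `==`) are not handled.

```python
# Hypothetical helpers (not part of conda or nnpdf): flag conda export
# entries that are missing a version or a build string.

def parse_pin(line):
    """Split a '  - name=version=build' line into (name, version, build)."""
    text = line.strip()
    if text.startswith("- "):
        text = text[2:]
    name, _, rest = text.partition("=")
    version, _, build = rest.partition("=")
    return name, version, build


def unpinned(lines):
    """Return the names of entries whose version or build is empty."""
    missing = []
    for line in lines:
        name, version, build = parse_pin(line)
        if not version or not build:
            missing.append(name)
    return missing


# Three entries copied from the listing above.
sample = [
    "  - numpy=1.18.5=py37ha1c710e_0",
    "  - nnpdf=",
    "  - scipy=1.4.1=py37h0b6359f_0",
]
print(unpinned(sample))  # ['nnpdf']
```

Any name reported here would need its version restored from the original attachment before the file is used.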

RoyStegeman commented 3 years ago

Actually, I think I would go with this: environment.txt

Zaharid commented 3 years ago

Ok, that makes more sense.

On 12 Mar 2021, at 01:15, Roy Stegeman @.***> wrote:

 Actually, I think I would go with this: environment.txt

Zaharid commented 3 years ago

I understand that we have mostly tested TF 2.3 so I'd rather go with that. @scarlehoff ?

RoyStegeman commented 3 years ago

By the way, what do we do about the names of the fits? Some of the NNPDF40_[...] names already exist on the server; do we overwrite those? Or should we for now keep a naming convention with the date in the name, to make sure the names are unique, and rename them only shortly before they are sent to lhapdf?

Also, there are no runcards yet for the LO and NLO fits. For some reason I thought this was because new theories had to be generated for them, but perhaps I misunderstood? I'm asking because @Zaharid mentioned during the meeting that they had not been run because we did not have an agreed-upon environment yet.
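
To illustrate the date-in-the-name option, here is a rough sketch of how unique temporary names could be generated. Everything in it is an assumption for illustration only: `dated_fit_name` is a hypothetical helper, `NNPDF40_example` is a placeholder and not a real fit name, and the `_YYMMDD` suffix is just one possible convention.

```python
from datetime import date


def dated_fit_name(base, day, taken):
    """Return base_YYMMDD, adding a numeric suffix if that name is taken."""
    stem = f"{base}_{day:%y%m%d}"
    candidate = stem
    counter = 1
    while candidate in taken:
        counter += 1
        candidate = f"{stem}_{counter}"
    return candidate


# Placeholder names standing in for what is already on the server.
existing = {"NNPDF40_example", "NNPDF40_example_210312"}
print(dated_fit_name("NNPDF40_example", date(2021, 3, 12), existing))
# NNPDF40_example_210312_2
```

The final rename would then only consist of dropping the suffix once the fits are ready to be sent.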

wilsonmr commented 3 years ago

By the way, what do we do about the names of the fits? Some of the NNPDF40_[...] names already exist on the server

The existing fits should be renamed imo rather than fully overwritten, although I'm not sure how many people want to use these fits. For sure the bugged fits shouldn't keep those names.

Zaharid commented 3 years ago

By the way, what do we do about the names of the fits? Some of the NNPDF40_[...] names already exist on the server; do we overwrite those? Or should we for now keep a naming convention with the date in the name, to make sure the names are unique, and rename them only shortly before they are sent to lhapdf?

This but unironically


We should delete any grids with 4.0 and wait until the very final moment to rename the grids.

Also, there are no runcards yet for the LO and NLO fits. For some reason I thought this was because new theories had to be generated for them, but perhaps I misunderstood? I'm asking because @Zaharid mentioned during the meeting that they had not been run because we did not have an agreed-upon environment yet.

I said that the fits should not be run right now but that was not specifically tied to NLO.

enocera commented 3 years ago

@RoyStegeman LO and NLO theories have already been generated; they should be correctly listed in theory.db, see IDs 208-210 and 212-214.

RoyStegeman commented 3 years ago

@enocera thanks for pointing that out, I didn't realise they had already been generated.

Zaharid commented 3 years ago

Ok, I have updated my comment https://github.com/NNPDF/nnpdf/issues/1126#issuecomment-797117499 with Roy's environment. Note that the env file has to be saved with a yaml extension for it to work properly. If you have old "nn4deploy" environments, delete them first with conda env remove -n nn4deploy. Remember to actually activate the environment before running fits.
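
Since forgetting to activate the environment is an easy mistake, a small guard at the start of a driver script could catch it. This is only a sketch, assuming the environment is called nn4deploy as above and relying on the `CONDA_DEFAULT_ENV` variable that conda sets on activation; `check_env` is a hypothetical helper and not something validphys or n3fit provides.

```python
import os

EXPECTED_ENV = "nn4deploy"  # name used in this thread; adjust if yours differs


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def check_env(environ=None, expected=EXPECTED_ENV):
    """Raise RuntimeError unless the expected environment is active."""
    active = active_conda_env(environ)
    if active != expected:
        raise RuntimeError(f"expected conda env {expected!r}, found {active!r}")
    return active


# Demonstration with an explicit mapping instead of the real os.environ.
print(check_env({"CONDA_DEFAULT_ENV": "nn4deploy"}))  # nn4deploy
```

Calling `check_env()` with no arguments inspects the real process environment and fails loudly if, for example, only the base environment is active.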

scarlehoff commented 3 years ago

Should we update the bootstrap repository by adding a production script that installs this environment directly? (It should also be linked in the readme here.)

Zaharid commented 3 years ago

I have updated the environment here: https://github.com/NNPDF/nnpdf/issues/1126#issuecomment-797117499 and in the wiki. The only difference should be the versions of the nnpdf package (4.0.1 now) and of sqlite.
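
Anyone who wants to confirm which pins changed between two exported files can compare them entry by entry. Below is a minimal sketch, assuming `name=version=build` lines as in the listing; `pins` and `changed` are hypothetical helpers, the sample data uses a made-up package `examplepkg` rather than the real files, and build strings are ignored on purpose so that only version changes are reported.

```python
def pins(lines):
    """Map package name -> version for 'name=version=build' entries."""
    result = {}
    for line in lines:
        text = line.strip()
        if text.startswith("- "):
            text = text[2:]
        name, _, rest = text.partition("=")
        result[name] = rest.partition("=")[0]
    return result


def changed(old_lines, new_lines):
    """Return {name: (old_version, new_version)} for pins that differ."""
    old, new = pins(old_lines), pins(new_lines)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Illustrative data only; 'examplepkg' is not a real dependency.
old_env = ["  - numpy=1.18.5=py37ha1c710e_0", "  - examplepkg=1.0=0"]
new_env = ["  - numpy=1.18.5=py37ha1c710e_0", "  - examplepkg=1.1=0"]
print(changed(old_env, new_env))  # {'examplepkg': ('1.0', '1.1')}
```

A package present in only one of the two files shows up with `None` on the other side.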

Zaharid commented 3 years ago

The NLO fits should now work using this environment and the various runcards in #675 and the wiki.

All LHAPDF fits need to be redone with the latest environment.

Zaharid commented 3 years ago

@RoyStegeman @scarlehoff Could you confirm that the TensorFlow-related versions in the existing environments are still good? Is there anything that must be bumped?