cgat-developers / cgat-flow

cgat-flow repository
MIT License
13 stars 9 forks source link

Travis tests failing #88

Closed Acribbs closed 5 years ago

Acribbs commented 5 years ago

@AndreasHeger it seems the tests are failing due to a kill issue

The command "travis_wait 30 ./install-devel.sh --install --test --use-repo ." exited with 137.

I assume this is because the conda installation is taking a lot longer than previously?

sebastian-luna-valero commented 5 years ago

Conda's dependency resolution algorithm gets stuck at this step:

conda env update --name cgat-flow --file env-cgat-apps.yml

Normally this happens when you have a large number of dependencies (like we still do) without pinning their versions. The number of combinations of different versions of the packages to satisfy the list of dependencies is so high that the solver does not know what to do. You can see that in action by appending -vv to the conda command as follows:

conda env update --name cgat-flow --file env-cgat-apps.yml -vv

My personal (brute force) approach in the past has been to pin all the packages to the versions that have proved to pass the tests. That way: 1) the dependency resolution algorithm finishes quickly, and 2) we enable reproducibility by releasing together a version of the code with all the versions of the dependencies that pass the tests with that code.

sebastian-luna-valero commented 5 years ago

I have managed to make some progress by updating env-cgat-apps.yml like:

diff env-cgat-apps.yml.orig env-cgat-apps.yml
32c32
< - scipy
---
> - scipy==0.19
53c53
< 
---
> - nomkl

Is the code now ready to work with scipy >= 1.0? We had issues in the past.

On the other hand, once install-devel.sh finishes, the final cgat-flowenvironment does not contain the expected versions of neither Python (which goes back to 2.7) nor R (which goes back to 3.2.2). More precisely:

   defaults::_r-mutex-1.0.0-anacondar_1
    conda-forge::libxslt-1.1.32-h4785a14_1002
    conda-forge::python-2.7.15-h938d71a_1006
    bioconda::alignlib-lite-0.3-py27h09b0a5c_1
    conda-forge::asn1crypto-0.24.0-py27_1003
    conda-forge::atomicwrites-1.3.0-py_0
    conda-forge::attrs-18.2.0-py_0
    conda-forge::backports-1.0-py_2
    conda-forge::backports_abc-0.5-py_1
    conda-forge::certifi-2018.11.29-py27_1000
    conda-forge::chardet-3.0.4-py27_1003
    conda-forge::cython-0.29.5-py27hf484d3e_0
    conda-forge::enum34-1.1.6-py27_1001
    conda-forge::funcsigs-1.0.2-py_3
    conda-forge::functools32-3.2.3.2-py_3
    conda-forge::futures-3.2.0-py27_1000
    conda-forge::greenlet-0.4.13-py27_0
    conda-forge::httplib2-0.12.1-py27_0
    conda-forge::idna-2.8-py27_1000
    conda-forge::ipaddress-1.0.22-py_1
    conda-forge::keepalive-0.5-py_1
    conda-forge::kiwisolver-1.0.1-py27h6bb024c_1002
    conda-forge::lxml-4.3.1-py27h23eabaa_0
    conda-forge::markupsafe-1.1.0-py27h14c3975_1000
    conda-forge::msgpack-python-0.6.1-py27h6bb024c_0
    conda-forge::numpy-1.13.3-py27_blas_openblas_200
    conda-forge::olefile-0.46-py_0
    conda-forge::pep8-1.7.1-py_0
    conda-forge::pluggy-0.8.1-py_0
    conda-forge::py-1.7.0-py_0
    conda-forge::pyasn1-0.4.4-py_1
    conda-forge::pycparser-2.19-py_0
    conda-forge::pyparsing-2.3.1-py_0
    conda-forge::pysocks-1.6.8-py27_1002
    conda-forge::python-drmaa-0.7.6-py27_0
    conda-forge::python-lzo-1.12-py27hb300a8c_1000
    conda-forge::pytz-2018.9-py_0
    conda-forge::pyyaml-3.13-py27h14c3975_1001
    bioconda::quicksect-0.2.0-py27h470a237_1
    bioconda::ruffus-2.8.1-py_0
    conda-forge::scandir-1.9.0-py27h14c3975_1000
    conda-forge::sip-4.18.1-py27hf484d3e_1000
    conda-forge::six-1.12.0-py27_1000
    conda-forge::sortedcontainers-2.1.0-py_0
    conda-forge::sqlalchemy-1.2.18-py27h14c3975_0
    conda-forge::subprocess32-3.5.3-py27h14c3975_0
    bioconda::toposort-1.4-py27_0
    conda-forge::webencodings-0.5.1-py_1
    conda-forge::cffi-1.12.1-py27h9745a5d_0
    conda-forge::cycler-0.10.0-py_1
    bioconda::gevent-1.0.2-py27_0
    conda-forge::gobject-introspection-1.56.1-py27h2da5eee_1002
    conda-forge::html5lib-1.0.1-py_0
    conda-forge::isodate-0.6.0-py_1
    conda-forge::mmtf-python-1.1.2-py_0
    conda-forge::more-itertools-4.3.0-py27_1000
    conda-forge::pathlib2-2.3.3-py27_1000
    conda-forge::pillow-5.4.1-py27h00a061d_1000
    bioconda::pybigwig-0.3.12-py27hdfb72b2_0
    conda-forge::python-dateutil-2.8.0-py_0
    conda-forge::scipy-0.19.0-np113py27_blas_openblas_202
    conda-forge::setuptools-40.8.0-py27_0
    conda-forge::singledispatch-3.4.0.3-py27_1000
    conda-forge::sparqlwrapper-1.8.2-py27_1000
    conda-forge::backports.functools_lru_cache-1.5-py_1
    conda-forge::bcrypt-3.1.4-py27h470a237_0
    conda-forge::cryptography-2.5-py27hb7f436b_1
    conda-forge::jinja2-2.10-py_1
    conda-forge::nose-1.3.7-py27_1002
    conda-forge::pandas-0.24.1-py27hf484d3e_0
    conda-forge::patsy-0.5.1-py_0
    bioconda::pynacl-0.3.0-py27_0
    bioconda::pysam-0.15.2-py27h1671916_1
    conda-forge::pytest-4.3.0-py27_0
    conda-forge::rdflib-4.2.2-py27_1000
    conda-forge::reportlab-3.5.13-py27hbd3ef63_1000
    conda-forge::scikit-learn-0.19.1-py27_blas_openblas_200
    conda-forge::tornado-5.1.1-py27h14c3975_1000
    conda-forge::wheel-0.33.1-py27_0
    bioconda::biopython-1.68-py27_0
    conda-forge::matplotlib-base-2.2.3-py27h60b886d_1
    conda-forge::paramiko-2.1.2-py27_0
    conda-forge::pip-19.0.2-py27_0
    bioconda::pybedtools-0.8.0-py27he860b03_1
    conda-forge::pyopenssl-19.0.0-py27_0
    defaults::r-base-3.2.2-0
    conda-forge::soupsieve-1.8-py27_0
    conda-forge::statsmodels-0.9.0-py27h3010b51_1000
    conda-forge::beautifulsoup4-4.7.1-py27_1001
    conda-forge::urllib3-1.24.1-py27_1000
    conda-forge::pyqt-5.6.0-py27h13b7fb3_1008
    conda-forge::requests-2.21.0-py27_1000
    bioconda::biothings_client-0.2.0-py_0
    bioconda::intermine-1.11.0-py_0
    conda-forge::matplotlib-2.2.3-py27h8a2030e_1
    bioconda::mygene-3.1.0-py_0
    conda-forge::seaborn-0.9.0-py_0
    defaults::r-3.2.2-0
    defaults::r-boot-1.3_17-r3.2.2_0a
    defaults::r-cluster-2.0.3-r3.2.2_0a
    defaults::r-codetools-0.2_14-r3.2.2_0a
    defaults::r-foreign-0.8_66-r3.2.2_0a
    defaults::r-kernsmooth-2.23_15-r3.2.2_0a
    defaults::r-lattice-0.20_33-r3.2.2_0a
    defaults::r-mass-7.3_45-r3.2.2_0a
    defaults::r-nnet-7.3_11-r3.2.2_0a
    defaults::r-rpart-4.1_10-r3.2.2_0a
    defaults::r-spatial-7.3_11-r3.2.2_0a
    defaults::r-survival-2.38_3-r3.2.2_0a
    bioconda::rpy2-2.7.8-py27r3.2.2_1
    defaults::r-class-7.3_14-r3.2.2_0a
    defaults::r-matrix-1.2_2-r3.2.2_0a
    defaults::r-nlme-3.1_122-r3.2.2_0a
    defaults::r-mgcv-1.8_9-r3.2.2_0a
    defaults::r-recommended-3.2.2-r3.2.2_0
Acribbs commented 5 years ago

Hi @sebastian-luna-valero

Thanks for your help. Im not sure if the code has been tested for scipy >=1.0, but when im back in the office I can test. Can you remember what the issues were?

I really didn't like the idea of unpinning environments in the first place (I think @Charlie-George also agrees with me) because conda does some weird unexpected things when installing very large environments. I haven't tested the devel script yet (its on my list of things to do) but I can have a look when I get back.

sebastian-luna-valero commented 5 years ago

Thanks @Acribbs

I don't remember exactly the issues with scipy >= 1.0 but that could be one of the reasons why the output in https://travis-ci.org/cgat-developers/cgat-apps does not match the reference results.

I see why unpinning all dependencies in cgat-core make sense since we want it to be as stand-alone as possible. However, it might still be appropriate to pin cgat-apps and cgat-flow dependencies. I will wait for @AndreasHeger feedback and then co-operate on what we agree upon.

Best regards, Sebastian

AndreasHeger commented 5 years ago

Thanks, @Acribbs and @sebastian-luna-valero . Generally I think we should unpin. I think there are three reasons:

  1. As part of CI we want to find out if our latest code works with the latest released version of 3rd party software. The pinning comes later - if all tests pass, you pin all the dependencies so you can deploy exactly the software versions you tested.

  2. If you pin everything, your software becomes stale very quickly. Especially in science there is generally the requirement to using the latest software with the best models and bugfixes (I am not talking about reproducibility here). Would you pick up a workflow that makes you use a version of tool XYZ that is 2 years old even if newer releases are available? At the very least, a reviewer would require you to validate your results with the latest version as soon as you talk about power/sensitivity/etc.

  3. If you pin everything you make your software a silo that users can not integrate into their own environments. That is why cgat-core and cgat-apps definitely should not be pinned.

Sometimes there is a need to pin, for example to work around bugs in particular releases of software. But then the pinning should be based on exclusion, i.e., tool!=1.2, instead of tool=1.1

I think it is worth taking a step back to see what the issue is, namely: we can't install our cgat-flow software environment on Travis for testing because of a 30 minute time-out. This could be

  1. because the Travis time-out is too low for cgat-flow,
  2. because conda is too slow,
  3. because our install is bloated.

About these:

  1. Travis is a free service and if we need more time for our tests we need to test elsewhere.
  2. There are no alternatives to conda if you want to work in 'user-space' only. Also, conda is the tool that the cgat-flow audience uses. But we all know that conda - at the moment - does not scale very well.
  3. We have cleaned up a lot of unnecessary dependencies, but it is in the nature of cgat-flow to be dependency heavy. Maybe we just have to accept that installation will take a while.

To conclude, I am thinking that pinning to make sure the install finishes in a pre-defined amount of time just so that it runs on travis is not the right path. Imagine we pin and we get it to 29 mins. What if we need to add another test? Or another dependency?

Thus, I am thinking we should (A) keep everything unpinned and (B) forget about Travis for cgat-flow and rather use our jenkins setup.

That being said, there might be optimizations in the install script. For example, we update the environment multiple times - this could probably be stream-lined.

Acribbs commented 5 years ago

Hi @sebastian-luna-valero ,

Looks like scipy is giving problems, maybe we use Andreas suggestion and use scipy != 1.0.

@AndreasHeger, you have convinced me that its a good idea to unpin then. However, I am a bit worried about the fact that sebastian found that python and R versions were very old. Should we not at least pin these two packages?

I think the primary reason for the build taking so long is the Conda SAT solver (which conda are trying to fix - https://github.com/conda/conda/issues/7700) and the fact that we have quite heavy dependancies. What are your opinions on using cycleCI for cgat-flow instead of Travis?, as you seem (from what I can see at the moment) to be limited by 1000 hours per month rather than per job. Also, for flow we probably only need to test for linux as the pipelines are too heavy weight for OSX.

The jenkins setup won't integrate with GitHub though will it?

sebastian-luna-valero commented 5 years ago

Hi,

I am looking at a potential fix to the installation issue in https://github.com/cgat-developers/cgat-flow/pull/89

Currently the Linux build works but the OS X doesn't. Are we now aiming at supporting OS X builds for cgat-flow as well? That's a real challenge!

Although I am happy to follow @AndreasHeger's approach, I would also like to explain my point of view regarding the pinned vs. unpinned environment.

I agree that the good thing about having an unpinned environment is that you will install by default the latest versions of all the dependencies in your environment. However, that also has a cost:

From my past experience, the time and effort required to solve those issues is not negligible.

On the other hand, having pinned environments provides predictable installations. Regarding the issues mentioned before:

The underlying issue is that none of us has a full-time position to maintain code in a way that installs without issues and is able to keep up with the latest available dependencies out there. We just use a best endeavour approach.

Regarding the Travis vs. Jenkins discussion:

Again, this is only my personal opinion but I am happy to proceed with the unpinned approach.

Best regards, Sebastian

sebastian-luna-valero commented 5 years ago

Conda solver slowdown FAQ and recommendations: https://github.com/bioconda/bioconda-recipes/issues/13774

Charlie-George commented 5 years ago

Thanks @sebastian-luna-valerohttps://github.com/sebastian-luna-valero, thats really helpful!

One thing that I'd find really helpful for solving conda issues is being able to access the most recent version of the enviroment that worked correctly and all tests passed, is there an easy way to track this via travis?


From: Sebastian Luna-Valero notifications@github.com Sent: 25 February 2019 07:29:12 To: cgat-developers/cgat-flow Cc: Charlotte George; Mention Subject: Re: [cgat-developers/cgat-flow] Travis tests failing (#88)

Conda solver slowdown FAQ and recommendations: bioconda/bioconda-recipes#13774https://github.com/bioconda/bioconda-recipes/issues/13774

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/cgat-developers/cgat-flow/issues/88#issuecomment-466900501, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AIUc4SWUz8EEg4GetgjCDzVPGvwuke-Nks5vQ5DIgaJpZM4ahybe.

sebastian-luna-valero commented 5 years ago

That used to be the case:

Please note that I have used an old commit for this example, but I think the master branch should always have that functionality by default. The same should happen in our Travis setup.

Best regards, Sebastian

Acribbs commented 5 years ago

Thanks @sebastian-luna-valero for sharing the conda issue. Its really helpful and im going to try out their new solver.

We arnt really expecting to support cgat-flow in OSX so I was wondering if tests should be disabled as the flow is only tested on jenkins using linux build anyway.

I see there are lots of talk of using envronment.yml files and pin packages and thats what I have done for showcase. In the current form I dont think we will be able to get any travis tests passing using the unpinned versions as the solver just takes too long.

I really like the idea that was proposed in the conda issue of using meta channels where only the last 1-2 years of packages as I suspect this would speed up the installation by quite a lot by giving the solver less work.

sebastian-luna-valero commented 5 years ago

Hi,

I have managed to get the travis tests passing for cgat-flow on both Linux and OS X: https://travis-ci.org/cgat-developers/cgat-flow/builds

However, there are a couple of tests failing for cgat-apps: https://github.com/cgat-developers/cgat-apps/issues/33

I would be grateful if someone could have a look before merging.

Best regards, Sebastian

Acribbs commented 5 years ago

Hi @sebastian-luna-valero ,

Thanks so much! So changing the priority increased the resolve time. I think im going to pay closer attention to my channel priority in future.

I have had a look over and it looks good, however I remember discussing with andreas about removing xtrace and if I remember rightly he wanted it in. Although I wasn't keen because the output looks rather messy.

BW, Adam