justinalsing / dlmmc

Dynamical linear modeling (DLM) regression code for analysis of atmospheric time-series data.
MIT License

Linux installation instruction incomplete #6

Closed taqtiqa-mark closed 5 years ago

taqtiqa-mark commented 5 years ago

Blocking openjournals/joss-reviews/issues/1157

Likely related to #3 - in that the current instructions appear to assume a lot, and those assumptions are undisclosed.

The project readme claims Linux is supported.
No instructions are provided. What follows assumes Ubuntu 16.04.6 (LTS - Xenial).

Instructions, and terminal commands that worked-for-me(TM) on a DigitalOcean droplet....

You can skip step (1) if you have a base VM installed with Ubuntu 16.04.6 (LTS - Xenial)

  1. Start with a clean installation of Ubuntu 16.04.6, by either:
    1. Following these instructions
    2. Download Ubuntu 16.04.6 VirtualBox image, extract, import and start the VM.
    3. Launch an Ubuntu 16.04.6 VM using DigitalOcean (referral-link) or some such cloud computing service provider. If you do sign up to DO, please sign up using the referral link - it's some (small) compensation for preparing these instructions for you ;) NOTE: If you use DigitalOcean (referral-link), the attached ZIP file contains a Vagrantfile set up to launch a DO droplet - see the assumptions list at the start of the Vagrantfile:
      1. Vagrant+Chef+DO: dlmmc-digiticalocean-chef_zero.zip HTH?
  2. Start the VM, and connect to the VM terminal - or launch a terminal if using the Linux Desktop interface.
  3. Execute the following:
    root@default:~# sudo apt -y install apt-transport-https ca-certificates curl software-properties-common
    root@default:~# sudo apt -y install python3=3.5.1-3
    root@default:~# sudo apt -y install python3-pip=8.1.1-2ubuntu0.4
    root@default:~# pip3 install --upgrade pip
    root@default:~# sudo apt -y install python3-dev=3.5.1-3
    root@default:~# sudo adduser dlmmc

    Now, switch to use the newly created user account dlmmc:

    root@default:~# su - dlmmc
    dlmmc@default:~$ pip3 install numpy scipy matplotlib netCDF4 pystan ipython[all]
    dlmmc@default:~$ git clone https://github.com/justinalsing/dlmmc.git
    dlmmc@default:~$ pushd dlmmc
    dlmmc@default:~$ python3 compile_stan_models.py >compile_stan_models.log 2>&1
    dlmmc@default:~$ jupyter-nbconvert --to notebook --execute --ExecutePreprocessor.timeout=100000 dlm_tutorial.ipynb >dlm_tutorial.log 2>&1

Given you are not using std test suites, please also add, to the installation validation instructions, what a successful run looks like.

For the above instructions:

Gradient evaluation took 0.023274 seconds
1000 transitions using 10 leapfrog steps per transition would take 232.74 seconds.
Adjust your expectations accordingly!

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue: Exception: multiply: B[12] is -nan, but must not be nan! (in 'unknown file name' at line 159)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue: Exception: multiply: B[12] is -nan, but must not be nan! (in 'unknown file name' at line 159)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Iteration: 1 / 3000 [ 0%] (Warmup)
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue: Exception: multiply: B[12] is -nan, but must not be nan! (in 'unknown file name' at line 159)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue: Exception: multiply: B[12] is -nan, but must not be nan! (in 'unknown file name' at line 159)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Iteration: 300 / 3000 [ 10%] (Warmup)
Iteration: 600 / 3000 [ 20%] (Warmup)
Iteration: 900 / 3000 [ 30%] (Warmup)
Iteration: 1001 / 3000 [ 33%] (Sampling)
Iteration: 1300 / 3000 [ 43%] (Sampling)
Iteration: 1600 / 3000 [ 53%] (Sampling)
Iteration: 1900 / 3000 [ 63%] (Sampling)
Iteration: 2200 / 3000 [ 73%] (Sampling)
Iteration: 2500 / 3000 [ 83%] (Sampling)
Iteration: 2800 / 3000 [ 93%] (Sampling)
Iteration: 3000 / 3000 [100%] (Sampling)

Elapsed Time: 267.023 seconds (Warm-up)
              420.726 seconds (Sampling)
              687.749 seconds (Total)

[NbConvertApp] Writing 688383 bytes to dlm_tutorial.nbconvert.ipynb
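Since there is no standard test suite, one way to make "what a successful run looks like" checkable is to scan the captured log for the two markers visible above: the final sampling iteration and the nbconvert "Writing" line. The following is only a sketch; the function name `run_looks_successful` is hypothetical, and treating these two markers as sufficient evidence of success is an assumption, not something the project defines.

```python
import re

# Markers copied from the sample log above. Treating them as sufficient
# evidence of a successful run is an assumption of this sketch.
FINAL_ITERATION = re.compile(
    r"Iteration:\s*(\d+)\s*/\s*(\d+)\s*\[\s*100%\]\s*\(Sampling\)"
)
NOTEBOOK_WRITTEN = re.compile(r"\[NbConvertApp\] Writing \d+ bytes to \S+")


def run_looks_successful(log_text):
    """Return True if sampling reached 100% and nbconvert wrote its output."""
    match = FINAL_ITERATION.search(log_text)
    sampling_finished = match is not None and match.group(1) == match.group(2)
    return sampling_finished and NOTEBOOK_WRITTEN.search(log_text) is not None


if __name__ == "__main__":
    # Hypothetical usage: dlm_tutorial.log is the file written by the
    # jupyter-nbconvert command in the instructions above.
    with open("dlm_tutorial.log") as handle:
        print(run_looks_successful(handle.read()))
```

The sporadic "Informational Message" warnings are deliberately ignored here, since the log itself says they are harmless when they occur only occasionally.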

taqtiqa-mark commented 5 years ago

If you do sign up to DigitalOcean (referral-link), please sign up using the referral link - it's some (small) compensation for preparing these instructions for you ;)

dlmmc-digiticalocean-chef_zero.zip

justinalsing commented 5 years ago

I'm not sure I agree with this issue - you say that "no instructions are provided" for linux, but from your discussion above you successfully installed the code using exactly the installation instructions currently provided in the README, ie,

pip3 install numpy scipy matplotlib netCDF4 pystan
python3 compile_stan_models.py

(although I note you used pip rather than conda, which is less reliable for pystan as has been discussed in issue #3)

So it seems that the installation instructions are working fine; I'm not sure what more needs to be done here?

taqtiqa-mark commented 5 years ago

Hmm,

using exactly the installation instructions currently provided in the README, ie,

I don't believe that is a reasonable characterization. The readme now refers to Conda, which is well and good, but your comment claims pip, and on that flavour of Linux you need pip3, etc.; otherwise you are using Python 2.x.
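To illustrate the pip versus pip3 point: a quick way to see which interpreter a given command actually reaches is to run a check like the one below with it. This is a minimal sketch; the helper name `is_python3` is hypothetical and not part of dlmmc.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3 or later."""
    return tuple(version_info)[0] >= 3


if __name__ == "__main__":
    # Shows which interpreter is running this script and its major version.
    print("Interpreter:", sys.executable)
    print("Python 3 or later:", is_python3())
```

Running it once with `python` and once with `python3` shows whether the two commands resolve to different interpreters on a given system.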

You'll also note you omitted ipython[all], which is required to run the example notebook to test that the installation is all good - this I gleaned from your comment in the JOSS repository.
However, instructions for running the example (pseudo test suite) are not mentioned in the README.md...

\<Hint> In this issue you have one set of OS instructions that are detailed enough to be reproducible; just follow them using equivalent conda (conda3?) commands and packages on a fresh installation, then cut&paste......\</Hint>

justinalsing commented 5 years ago

I think much of this has now been covered in issue #8

However, I have now included an additional file in the repo - INSTALL.md - where I go through the steps of validating the install instructions and showing what a successful installation looks like. Check out INSTALL.md, or alternatively I copy-paste the contents of that file here for convenience:


Installation validation

Here we show what a successful installation looks like, following the installation instructions in the README

For the purposes of validation, let's create and activate a virtual environment using conda so we can do a clean install (for testing purposes - if you are trying to install yourself you do not need to do this):

justinalsing$ conda create -n dlmmc python=3.7 anaconda
justinalsing$ conda activate dlmmc

Now let's install the dependencies following the install instructions in the README:

justinalsing$ conda install netCDF4 pystan
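Before compiling, it can be worth confirming that the dependencies are importable from the active environment. The sketch below is not part of the repository: the helper `missing_modules` is hypothetical, and the import names are assumptions (in particular, `pystan` is assumed to be the import name, as the `INFO:pystan` log lines in this thread suggest).

```python
import importlib.util

# Import names assumed for the packages named in this thread. An import
# name can differ from the name used to install the package.
REQUIRED_MODULES = ["numpy", "scipy", "matplotlib", "netCDF4", "pystan"]


def missing_modules(module_names):
    """Return the names that cannot be located on the current import path."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

An empty result only shows that the modules can be located; it does not prove that pystan can compile models, which is what the next step exercises.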

Compile the DLM models following instructions in README:

justinalsing$ python3 compile_stan_models.py

INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_1769d29906593e8f6fa11e816b642cff NOW.
INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_323f0530039bc4ac2c22bb5250e1d6c1 NOW.
INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_c3ff00cf2253f51bed2b150f31119693 NOW.
INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_b9cb9e0eb2389c8a6e3078345a6a1dd4 NOW.

Note that you might get some additional deep-copy warnings from pystan during compilation - these are not a problem (so long as compilation completes without errors).

Finally let's execute the dlm_tutorial.ipynb notebook to check everything worked correctly:

justinalsing$ jupyter-nbconvert --to notebook --execute --ExecutePreprocessor.timeout=100000 dlm_tutorial.ipynb
[NbConvertApp] Converting notebook dlm_tutorial.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python3

Gradient evaluation took 0.026314 seconds
1000 transitions using 10 leapfrog steps per transition would take 263.14 seconds.
Adjust your expectations accordingly!

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: multiply: B[12] is nan, but must not be nan!  (in 'unknown file name' at line 159)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: multiply: B[12] is nan, but must not be nan!  (in 'unknown file name' at line 159)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Iteration:    1 / 3000 [  0%]  (Warmup)
Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: multiply: B[12] is nan, but must not be nan!  (in 'unknown file name' at line 159)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Exception: multiply: B[12] is nan, but must not be nan!  (in 'unknown file name' at line 159)

If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Iteration:  300 / 3000 [ 10%]  (Warmup)
Iteration:  600 / 3000 [ 20%]  (Warmup)
Iteration:  900 / 3000 [ 30%]  (Warmup)
Iteration: 1001 / 3000 [ 33%]  (Sampling)
Iteration: 1300 / 3000 [ 43%]  (Sampling)
Iteration: 1600 / 3000 [ 53%]  (Sampling)
Iteration: 1900 / 3000 [ 63%]  (Sampling)
Iteration: 2200 / 3000 [ 73%]  (Sampling)
Iteration: 2500 / 3000 [ 83%]  (Sampling)
Iteration: 2800 / 3000 [ 93%]  (Sampling)
Iteration: 3000 / 3000 [100%]  (Sampling)

 Elapsed Time: 180.043 seconds (Warm-up)
               372.797 seconds (Sampling)
               552.839 seconds (Total)

[NbConvertApp] Writing 687553 bytes to dlm_tutorial.nbconvert.ipynb

The notebook completed without error! You're now ready to start using the code.
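As a side note on reading the sampler output: the "would take ... seconds" figure near the top of the log is consistent with simple arithmetic, namely the time for one gradient evaluation multiplied by the leapfrog steps per transition and by the number of transitions. A minimal sketch of that arithmetic follows; the function name `estimated_seconds` is hypothetical.

```python
def estimated_seconds(gradient_seconds, leapfrog_steps, transitions):
    """Rough run-time estimate: one gradient evaluation per leapfrog step."""
    return gradient_seconds * leapfrog_steps * transitions


if __name__ == "__main__":
    # Values taken from the log above: 0.026314 s per gradient evaluation,
    # 10 leapfrog steps per transition, 1000 transitions.
    print(round(estimated_seconds(0.026314, 10, 1000), 2))
```

With the values from the log above this gives 263.14 seconds per 1000 transitions. It is a rough guide only, and the actual elapsed time reported at the end of the run (552.839 seconds for 3000 iterations) differs from it.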