ceholden / yatsm

Yet Another Time Series Model
https://yatsm.readthedocs.org/en/latest/
MIT License

GEO: glmnet-python version change #68

Closed valpasq closed 8 years ago

valpasq commented 8 years ago

Was running YATSM for p012r031, everything was going fine until the 67th job, when I suddenly got the following error message:

09:26:06:DEBUG:66:config_parser.convert_config:Predicting using "GLMNET_LassoCV" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/glmnet_LassoCV_n50.pkl)
Traceback (most recent call last):
  File "/usr3/graduate/valpasq/venv/bin/yatsm", line 8, in <module>
    load_entry_point('yatsm==0.5.5', 'console_scripts', 'yatsm')()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 50, in line
    cfg = parse_config_file(config)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 145, in parse_config_file
    return convert_config(cfg)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 69, in convert_config
    cfg[pred_method]['pickle'])
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 150, in _unpickle_predictor
    reg = joblib.load(pickle)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 425, in load
    obj = unpickler.load()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1124, in find_class
    __import__(module)
ImportError: No module named elastic_net_cv

I tried changing to GLMNET_Lasso20, then got this message:

09:47:50:DEBUG:66:config_parser.convert_config:Predicting using "GLMNET_Lasso20" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/glmnet_Lasso20.pkl)
Traceback (most recent call last):
  File "/usr3/graduate/valpasq/venv/bin/yatsm", line 8, in <module>
    load_entry_point('yatsm==0.5.5', 'console_scripts', 'yatsm')()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 50, in line
    cfg = parse_config_file(config)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 145, in parse_config_file
    return convert_config(cfg)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 69, in convert_config
    cfg[pred_method]['pickle'])
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 150, in _unpickle_predictor
    reg = joblib.load(pickle)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 425, in load
    obj = unpickler.load()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1083, in load_newobj
    obj = cls.__new__(cls, *args)
TypeError: function.__new__(X): X is not a type object (function)

Oddly enough, when I tried the sklearn Lasso20, things seem to be running alright, though I do get a warning about convergence:

10:03:31:DEBUG:66:config_parser.convert_config:Predicting using "Lasso20" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/sklearn_Lasso20.pkl)
10:03:31:DEBUG:93:cache.test_cache:Attempt reading in from cache directory?: True
10:03:31:DEBUG:95:cache.test_cache:Attempt writing to cache directory?: True
10:03:31:INFO:81:line.line:Job 0 of 5 - using config file /projectnb/landsat/projects/Massachusetts/p012r031/p012r031_config.yaml
10:03:31:DEBUG:96:line.line:Responsible for lines: [ 0 5 10 ..., 7140 7145 7150]
10:03:31:DEBUG:125:line.line:Already processed line 0
10:03:31:DEBUG:125:line.line:Already processed line 5
10:03:31:DEBUG:125:line.line:Already processed line 10
10:03:31:DEBUG:125:line.line:Already processed line 15
10:03:31:DEBUG:125:line.line:Already processed line 20
10:03:31:DEBUG:125:line.line:Already processed line 25
10:03:31:DEBUG:125:line.line:Already processed line 30
10:03:31:DEBUG:125:line.line:Already processed line 35
10:03:31:DEBUG:125:line.line:Already processed line 40
10:03:31:DEBUG:125:line.line:Already processed line 45
10:03:31:DEBUG:125:line.line:Already processed line 50
10:03:31:DEBUG:125:line.line:Already processed line 55
10:03:31:DEBUG:125:line.line:Already processed line 60
10:03:31:DEBUG:125:line.line:Already processed line 65
10:03:31:DEBUG:128:line.line:Running line 70
10:03:32:DEBUG:158:reader.read_line:Read in Y from cache file
/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/linear_model/coordinate_descent.py:444: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations
  ConvergenceWarning)
10:07:41:DEBUG:192:line.line: Saving YATSM output to /projectnb/landsat/projects/Massachusetts/p012r031/images/YATSM/yatsm_r70.npz
10:07:41:DEBUG:199:line.line:Line 70 took 249.888620138s to run
10:07:41:DEBUG:128:line.line:Running line 75
10:07:42:DEBUG:158:reader.read_line:Read in Y from cache file

Did something change with the pickles? I do find it really strange that my first 60+ jobs ran fine before I started getting errors, which makes me think this is not something to do with my copy of YATSM (since I didn't do a pull or anything that should change those files). Since the last lines of the errors have to do with File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py" I'm wondering if this maybe has something to do with site packages?

Any insight would be much appreciated--was hoping to run all 5 MA scenes this week.

PS - My log files are a mess, but the first few runs are in /projectnb/landsat/projects/Massachusetts/p012r031/images/, more recent runs in /projectnb/landsat/projects/Massachusetts/p012r031.

parevalo commented 8 years ago

I was JUST checking my logs after a training run and got exactly the same second error you got:

TypeError: function.__new__(X): X is not a type object (function) 

I was just about to start attempting to trace the problem. My config and log files:

/projectnb/landsat/projects/Colombia/images/008059/Results/FIT1/859_FIT1.yaml
/projectnb/landsat/projects/Colombia/images/008059/Results/FIT1/train_859.o1031220

ceholden commented 8 years ago

I changed the installation of glmnet-python in the python/2.7.5 module location because Damien needs the old version and he doesn't quite know how to use Python. Sorry for the lack of notice -- I thought you all would be isolated from it, but I guess all of your virtual environments used the --system-site-packages option so it pointed to the system copy.

Please install glmnet-python into your virtual environments as follows:

pip install git+https://github.com/ceholden/glmnet-python.git

Paging @bullocke as well!
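
For anyone who lands here from the same ImportError: a pickle does not contain the class itself, only the name of the module and class to import at load time, so swapping the installed package underneath an existing pickle can break loading. The snippet below is a minimal sketch of that mechanism, assuming Python 3 (the cluster in this thread runs 2.7); the module name `hypothetical_old_layout` and the class `Model` are made up for illustration and have nothing to do with the real glmnet-python layout.

```python
import pickle
import sys
import types


def make_fake_module(name):
    """Register a throwaway module that holds one picklable class."""
    module = types.ModuleType(name)
    # The class records `name` as its home module; pickle stores that string.
    model_cls = type("Model", (object,), {"__module__": name})
    module.Model = model_cls
    sys.modules[name] = module
    return module


def unpickle_after_removal(name="hypothetical_old_layout"):
    """Pickle an object, drop its module, then try to load the pickle again."""
    module = make_fake_module(name)
    payload = pickle.dumps(module.Model())
    # Simulate the package being replaced by a version without this module.
    del sys.modules[name]
    try:
        pickle.loads(payload)
    except ImportError as err:
        return type(err).__name__, str(err)
    return None


if __name__ == "__main__":
    print(unpickle_after_removal())
```

The load fails inside pickle's class lookup, the same place the tracebacks above end, which is why the error points at pickle.py even though pickle itself is fine.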

parevalo commented 8 years ago

I still think there is something wrong with the pickle site package, because I am only running the training script and that doesn't have anything to do with glmnet-python (as far as I know). Still, I installed glmnet-python as you instructed and I'm still getting the error:

  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 425, in load
    obj = unpickler.load()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1083, in load_newobj
    obj = cls.__new__(cls, *args)
TypeError: function.__new__(X): X is not a type object (function)

valpasq commented 8 years ago

Okay...just to make sure I'm following, I went to /usr3/graduate/valpasq/venv/ and used the pip install command you provided above. Was that the correct location? I also tried in /usr3/graduate/valpasq/venv/bin/. Nothing seems to have changed in either location. Should I be seeing glmnet-python somewhere?

Also, maybe a dumb question, but does anything need to change in my source file? Current version below:

#!/bin/bash
# Modules    
module load python/2.7.5
module load gdal/1.11.1
module load qgis/2.6.1
module load R/R-3.1.1
# Python virtualenv for `yatsm`
source $HOME/venv/bin/activate

parevalo commented 8 years ago

You just need to activate the venv and then run the pip install command. Then you can use pip show to see where it is installed, along with its requirements and additional info.

On Tue, Dec 1, 2015 at 10:56 AM, Valerie Pasquarella <notifications@github.com> wrote:

Okay...just to make sure I'm following, I went to /usr3/graduate/valpasq/venv/ and used the pip install command you provided above. Was that the correct location? I also tried in /usr3/graduate/valpasq/venv/bin/. Nothing seems to have changed in either location. Should I be seeing glmnet-python somewhere?

Also, maybe a dumb question, but does anything need to change in my source file? Current version below:

#!/bin/bash
# Modules
module load python/2.7.5
module load gdal/1.11.1
module load qgis/2.6.1
module load R/R-3.1.1
# Python virtualenv for `yatsm`
source $HOME/venv/bin/activate

ceholden commented 8 years ago

@valpasq You don't need to be in any specific location to install a Python package. To check if it's installed, do this in Python:

import glmnet
glmnet.__file__

you should see something in /usr3/graduate/....
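
The same check can be done without actually importing the package, which is handy when the import itself is what fails. This is only a sketch and assumes Python 3.4 or newer, where `importlib.util.find_spec` is available; on the 2.7 interpreter used in this thread, the interactive `glmnet.__file__` check above is the way to go. The helper names `module_origin` and `is_under` are invented for this example.

```python
import importlib.util
import os
import sys


def module_origin(name):
    """Return the file a top-level module would load from, or None if missing."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_under(path, prefix):
    """True when `path` sits inside the directory tree rooted at `prefix`."""
    path = os.path.abspath(path)
    prefix = os.path.abspath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


if __name__ == "__main__":
    origin = module_origin("glmnet")
    print("glmnet resolves to:", origin)
    if origin is not None:
        # sys.prefix points at the virtualenv when one is active.
        print("inside active environment:", is_under(origin, sys.prefix))
```

If the second line prints False while a virtualenv is active, the system copy is still the one being picked up.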

@parevalo I have no idea what you mean. Can you provide me more information? Also please make sure you are pointing to the right glmnet by following the instructions I gave Val.

parevalo commented 8 years ago

I think you didn't see my entire reply because it got formatted in a very strange way and it collapsed all the information I provided, so here it is again. I am trying to run yatsm train to create the classification cache for a scene. When I try to run the script I get one of the errors that Val described on the first post, even after installing glmnet in my virtual environment as you described:

Traceback (most recent call last):
  File "/projectnb/landsat/users/parevalo/yatsm5_venv/bin/yatsm", line 8, in <module>
    load_entry_point('yatsm==0.5.4', 'console_scripts', 'yatsm')()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
    return f(get_current_context(), *args, **kwargs)

My config file and train script:

/projectnb/landsat/projects/Colombia/images/008059/Results/FIT1/859_FIT1.yaml
/projectnb/landsat/projects/Colombia/images/008059/Results/FIT1/Train_cache.sh

valpasq commented 8 years ago

I am not getting glmnet from /usr3/graduate/.... Here's what I see:

>>> import glmnet
>>> glmnet.__file__
'/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/glmnet/__init__.pyc'

I am wondering if this had to do with the modules.sh files I use as my source. I know this includes module load python/2.7.5 as well as the activation of the virtual environment. Does loading the module mean I'm loading the site packages as well?

I tried removing module load python/2.7.5 from the source file, thinking this might load python via the /venv/bin/activate/:

# Modules    
#module load python/2.7.5
module load gdal/1.11.1
module load qgis/2.6.1
module load R/R-3.1.1
# Python virtualenv for `yatsm`
source $HOME/venv/bin/activate

But then when I try to do the pip install, I get the following error:

(venv)[valpasq@geo ~]$ pip install git+https://github.com/ceholden/glmnet-python.git
/usr3/graduate/valpasq/venv/bin/python: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory
(venv)[valpasq@geo ~]$ 

So sorry for the ongoing questions, but I am still really confused on how to access the virtual environment copy of glmnet. Do I need to change my source file (/usr3/graduate/valpasq/modules.sh)? Or am I missing something in the pip install?

ceholden commented 8 years ago

OK so I'm not going to try to explain every bit of the confusion or the problem in full detail over the internet, because that would probably only add to the confusion. Suffice to say there are many, many Google results for something like "I hate pip" or "I hate virtualenv" (not that I think that they're bad). Maybe some afternoon we can sit with some beer and I can explain how Python finds your modules...

The problem is with the python/2.7.5 module because it unnecessarily defines PYTHONPATH. Defining this throws the /project/earth/packages/.../site-packages above your virtualenv site-packages, thus ruining our ability to override a package by installing it into our virtualenv. I've submitted a ticket to IT asking them to remove this export since it's unnecessary and ruinous to us.
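
The ordering effect is easy to reproduce in isolation. The sketch below assumes Python 3 and uses two temporary directories standing in for the system and virtualenv site-packages; `demo_pkg` and the directory names are hypothetical. It asks the import machinery which copy it would pick for a given search order, which is all that PYTHONPATH changes.

```python
import importlib.machinery
import os
import tempfile


def write_module(directory, name):
    """Create `<directory>/<name>.py` containing a single constant."""
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, name + ".py"), "w") as handle:
        handle.write("WHERE = %r\n" % os.path.basename(directory))


def winner(name, search_path):
    """Return the name of the directory whose copy of `name` would be imported."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    if spec is None:
        return None
    return os.path.basename(os.path.dirname(spec.origin))


def demo():
    with tempfile.TemporaryDirectory() as root:
        system_dir = os.path.join(root, "system-site-packages")
        venv_dir = os.path.join(root, "venv-site-packages")
        for directory in (system_dir, venv_dir):
            write_module(directory, "demo_pkg")
        # A PYTHONPATH entry is searched first, so the system copy wins.
        shadowed = winner("demo_pkg", [system_dir, venv_dir])
        # Without that leading entry, the virtualenv copy is found.
        clean = winner("demo_pkg", [venv_dir])
    return shadowed, clean


if __name__ == "__main__":
    print(demo())
```

The first match on the search path wins, so installing a newer package into the virtualenv has no effect while an earlier entry still provides the same name.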

SOLUTION:

We need to unset PYTHONPATH or clear python/2.7.5 related paths, but we also need it for GDAL on the cluster since it's installed in a different place.

> module purge
> module load python/2.7.5
> unset PYTHONPATH
> module load gdal/1.11.1
> source ~/venvs/yatsm/bin/activate
(yatsm) > python -c "import glmnet; print(glmnet.__file__)"
/usr3/graduate/ceholden/venvs/yatsm/lib/python2.7/site-packages/glmnet/__init__.pyc

@valpasq keep that environment set up script just as is. You need to leave Python in there, but add in unset PYTHONPATH right after it as shown above.

As a quick PSA/FYI for the future, it looks like using ipython over python re-orders your sys.path somehow. I was testing the path order and location of glmnet using ipython, so it was fooling me.
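
A small guard along these lines could go at the top of a job script to catch a stray PYTHONPATH before a long run starts. It is a sketch under assumptions, not part of YATSM: the function names are invented, and it simply treats any PYTHONPATH entry outside `sys.prefix` as suspicious. Run it with plain python, given the ipython caveat above.

```python
import os
import sys


def pythonpath_entries(environ=None):
    """Split PYTHONPATH into its non-empty entries (empty list when unset)."""
    environ = os.environ if environ is None else environ
    raw = environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def shadowing_entries(environ=None, prefix=None):
    """Return PYTHONPATH entries that lie outside the active environment."""
    prefix = os.path.abspath(sys.prefix if prefix is None else prefix)
    outside = []
    for entry in pythonpath_entries(environ):
        full = os.path.abspath(entry)
        if os.path.commonpath([full, prefix]) != prefix:
            outside.append(entry)
    return outside


if __name__ == "__main__":
    for entry in shadowing_entries():
        sys.stderr.write("PYTHONPATH entry ahead of the virtualenv: %s\n" % entry)
```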

valpasq commented 8 years ago

Ah ha! I've been googling the heck out of virtual environments, went through the process of creating a new environment with virtualenv using the --no-site-packages option, but was still getting the site packages path for glmnet, which seemed off....I turned up this StackOverflow post and was JUST about to ask you about possible confusion in the PYTHONPATH!

Thanks for the specifics on how to fix. I updated my environment set up script as suggested, and it looks like I'm getting glmnet from virtual environment:

[valpasq@geo ~]$ source ~/modules.sh 
(venv)[valpasq@geo ~]$ python
Python 2.7.5 (default, Jun 24 2013, 23:27:32) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import glmnet
>>> glmnet.__file__
'/usr3/graduate/valpasq/venv/lib/python2.7/site-packages/glmnet/__init__.pyc'
>>> 

Will start up my YATSM runs again, hopefully all is well, will let you know if anything else creeps up.

Thanks again!!! It was a good day to learn about virtual environments :)

ceholden commented 8 years ago

Thanks for the link to that SO thread, Val. My Google-fu wasn't as strong as yours today.

ceholden commented 8 years ago

IT replied and we now have a new module you can load that DOESN'T export PYTHONPATH:

python/2.7.5_nopath

Awaiting confirmation from Paulo before closing.

valpasq commented 8 years ago

I just tried using module load python/2.7.5_nopath and the path for glmnet appears to be correct.

But something is still not right. Still getting the same error that started all of this when I try to run YATSM:

13:39:42:DEBUG:66:config_parser.convert_config:Predicting using "GLMNET_LassoCV" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/glmnet_LassoCV_n50.pkl)
Traceback (most recent call last):
  File "/usr3/graduate/valpasq/venv/bin/yatsm", line 8, in <module>
    load_entry_point('yatsm==0.5.6-beta', 'console_scripts', 'yatsm')()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 50, in line
    cfg = parse_config_file(config)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 145, in parse_config_file
    return convert_config(cfg)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 69, in convert_config
    cfg[pred_method]['pickle'])
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 150, in _unpickle_predictor
    reg = joblib.load(pickle)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 425, in load
    obj = unpickler.load()
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1124, in find_class
    __import__(module)
ImportError: No module named elastic_net_cv

I did notice I was running yatsm==0.5.6-beta. Tried to roll back to yatsm==0.5.4, still getting the exact same error.

Thoughts?

ceholden commented 8 years ago

The version of YATSM doesn't matter in this issue. I can't reproduce the error using your virtual environment and using your config file (only altered to change the output directory).

Can you try ensuring that there is actually such a module? You can also just try loading the pickle directly in the console:

> module purge
> module load python/2.7.5_nopath
> source /usr3/graduate/valpasq/venv/bin/activate
> python
Python 2.7.5 (default, Jun 24 2013, 23:27:32) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import glmnet; print(glmnet.__file__)
/usr3/graduate/ceholden/venvs/yatsm/lib/python2.7/site-packages/glmnet/__init__.pyc
>>> glmnet.elastic_net_cv
<module 'glmnet.elastic_net_cv' from '/usr3/graduate/ceholden/venvs/yatsm/lib/python2.7/site-packages/glmnet/elastic_net_cv.pyc'>
>>> pkl = '/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/glmnet_LassoCV_n50.pkl'
>>> import sklearn.externals.joblib as jl
>>> jl.load(pkl)
<glmnet.elastic_net_cv.LassoCV object at 0x7f12171e2d50>

This set of commands goes all the way from no Python being loaded, so if it doesn't work I might have no clue what's going on.

When I ran yatsm line against your config with your virtual environment, it passed that point without issue. I did, however, get an error in the _cyprep module because you have yatsm==0.5.4 installed and that had a bug where it couldn't use a list of min/max values. Upgrade to yatsm==0.5.5 or just pull the latest for the beta of v0.5.6 if you like.

Maybe you had both Python modules loaded?

valpasq commented 8 years ago

Stupid mistake on my part!

I have a shell script for running yatsm line, and I was loading the regular python/2.7.5 module instead of the updated "nopath" version at the start of this script.

Took out the duplicate module loading, problem solved!

Thanks for your patience...I definitely learned a lot about virtual environments and modules today!
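
A duplicate module load like this can be spotted from inside Python too. The sketch below assumes the cluster's module system exports the colon-separated `LOADEDMODULES` variable, as Environment Modules does; whether that holds on this particular cluster is not confirmed in the thread, and the helper names are made up.

```python
import os


def loaded_modules(environ=None):
    """Return the entries of LOADEDMODULES as a list (empty when unset)."""
    environ = os.environ if environ is None else environ
    raw = environ.get("LOADEDMODULES", "")
    return [item for item in raw.split(":") if item]


def versions_loaded(name, environ=None):
    """Return every loaded module whose name part (before '/') equals `name`."""
    return [m for m in loaded_modules(environ) if m.split("/", 1)[0] == name]


if __name__ == "__main__":
    pythons = versions_loaded("python")
    if len(pythons) > 1:
        print("More than one python module is loaded:", ", ".join(pythons))
```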

parevalo commented 8 years ago

I changed my python to the new "nopath" module, made sure that all of the required packages were installed in my virtual environment (some were being called from the global env previously), and ran the script again, and everything is working nicely now. Thank you, Chris.
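
One way to find packages that are still coming from the global environment is to look at what has already been imported. This is a rough sketch assuming Python 3 and the usual `site-packages` directory naming; `third_party_outside` is a hypothetical helper, not something YATSM provides.

```python
import os
import sys


def third_party_outside(prefix=None, modules=None):
    """List top-level imported packages found in a site-packages outside `prefix`."""
    prefix = os.path.abspath(sys.prefix if prefix is None else prefix)
    modules = sys.modules if modules is None else modules
    found = set()
    for name, module in list(modules.items()):
        path = getattr(module, "__file__", None)
        if not path or "." in name:
            continue  # skip built-ins, namespace packages and submodules
        path = os.path.abspath(path)
        in_site_packages = "site-packages" in path.split(os.sep)
        if in_site_packages and os.path.commonpath([path, prefix]) != prefix:
            found.add(name)
    return sorted(found)


if __name__ == "__main__":
    # Call this after the imports of interest have run.
    for name in third_party_outside():
        print("loaded from outside the virtualenv:", name)
```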

ceholden commented 8 years ago

Sweet!