I was JUST checking my logs after a training run and got exactly the same second error you got:
TypeError: function.__new__(X): X is not a type object (function)
I was just about to start attempting to trace the problem. My config and log files:
/projectnb/landsat/projects/Colombia/images/008059/Results/FIT1/859_FIT1.yaml /projectnb/landsat/projects/Colombia/images/008059/Results/FIT1/train_859.o1031220
I changed the installation of glmnet-python in the python/2.7.5 module location because Damien needs the old version and he doesn't quite know how to use Python. Sorry for the lack of notice -- I thought you all would be isolated from it, but I guess all of your virtual environments used the --system-site-packages option so it pointed to the system copy.
Please install glmnet-python into your virtual environments as follows:
pip install git+https://github.com/ceholden/glmnet-python.git
Paging @bullocke as well!
I still think there is something wrong with the pickle site package, because I am only running the training script and that doesn't have anything to do with glmnet-python (as far as I know). Still, I installed glmnet-python as you instructed and I'm still getting the error:
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 425, in load
obj = unpickler.load()
File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1083, in load_newobj
obj = cls.__new__(cls, *args)
TypeError: function.__new__(X): X is not a type object (function)
Okay...just to make sure I'm following, I went to /usr3/graduate/valpasq/venv/ and used the pip install command you provided above. Was that the correct location? I also tried in /usr3/graduate/valpasq/venv/bin/. Nothing seems to have changed in either location. Should I be seeing glmnet-python somewhere?
Also, maybe a dumb question, but does anything need to change in my source file? Current version below:
#!/bin/bash
# Modules
module load python/2.7.5
module load gdal/1.11.1
module load qgis/2.6.1
module load R/R-3.1.1
# Python virtualenv for `yatsm`
source $HOME/venv/bin/activate
You just need to activate the venv and then use the pip installation command. Then you can use pip show
@valpasq You don't need to be in any specific location to install a Python package. To check if it's installed, do this in Python:
import glmnet
glmnet.__file__
You should see something in /usr3/graduate/...
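A slightly fuller version of that check, as a sketch only: the helper below (the name `where_is` is made up for illustration) reports the file a module was imported from and whether that file sits under the active environment's prefix. It is written for Python 3, whereas this thread uses Python 2.7, and `json` stands in for `glmnet` so the snippet runs anywhere.

```python
import importlib
import os
import sys


def where_is(module_name):
    """Return (path, inside_env) for an importable module.

    `inside_env` is True when the module's file lives under sys.prefix,
    which points at the virtualenv while one is active.
    """
    module = importlib.import_module(module_name)
    path = os.path.realpath(module.__file__)
    prefix = os.path.realpath(sys.prefix)
    inside_env = path.startswith(prefix + os.sep)
    return path, inside_env


if __name__ == "__main__":
    # Substitute "glmnet" for "json" to check the package from this thread.
    path, inside_env = where_is("json")
    print(path)
    print("inside current environment:", inside_env)
```

Standard-library modules such as `json` normally live outside a virtualenv, so the interesting case is a pip-installed package like `glmnet`: if `inside_env` comes back False for it, the import is being served from somewhere other than the active environment.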
@parevalo I have no idea what you mean. Can you provide me more information? Also please make sure you are pointing to the right glmnet by following the instructions I gave Val.
I think you didn't see my entire reply because it got formatted in a very strange way and it collapsed all the information I provided, so here it is again. I am trying to run yatsm train to create the classification cache for a scene. When I try to run the script I get one of the errors that Val described in the first post, even after installing glmnet in my virtual environment as you described:
Traceback (most recent call last):
File "/projectnb/landsat/users/parevalo/yatsm5_venv/bin/yatsm", line 8, in <module>
load_entry_point('yatsm==0.5.4', 'console_scripts', 'yatsm')()
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
return self.main(*args, **kwargs)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
rv = self.invoke(ctx)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
return callback(*args, **kwargs)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
return f(get_current_context(), *args, **kwargs)
My config file and train script: /projectnb/landsat/projects/Colombia/images/008059/Results/FIT1/859_FIT1.yaml /projectnb/landsat/projects/Colombia/images/008059/Results/FIT1/Train_cache.sh
I am not getting glmnet from /usr3/graduate/... Here's what I see:
>>> import glmnet
>>> glmnet.__file__
'/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/glmnet/__init__.pyc'
I am wondering if this has to do with the modules.sh file I use as my source. I know this includes module load python/2.7.5 as well as the activation of the virtual environment. Does loading the module mean I'm loading the site packages as well?
I tried removing module load python/2.7.5 from the source file, thinking this might load python via venv/bin/activate:
# Modules
#module load python/2.7.5
module load gdal/1.11.1
module load qgis/2.6.1
module load R/R-3.1.1
# Python virtualenv for `yatsm`
source $HOME/venv/bin/activate
But then when I try to do the pip install, I get the following error:
(venv)[valpasq@geo ~]$ pip install git+https://github.com/ceholden/glmnet-python.git
/usr3/graduate/valpasq/venv/bin/python: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory
(venv)[valpasq@geo ~]$
So sorry for the ongoing questions, but I am still really confused about how to access the virtual environment copy of glmnet. Do I need to change my source file (/usr3/graduate/valpasq/modules.sh)? Or am I missing something in the pip install?
OK so I'm not going to try to explain over the internet every bit of confusion or problem in full detail because it would probably add more confusion. Suffice to say there are many, many Google results for something like "I hate pip" or "I hate virtualenv" (not that I think that they're bad). Maybe some afternoon we can sit with some beer and I can explain how Python finds your modules...
The problem is with the python/2.7.5 module because it unnecessarily defines PYTHONPATH. Defining this puts /project/earth/packages/.../site-packages above your virtualenv site-packages, thus ruining our ability to override a package by installing it into our virtualenv. I've submitted a ticket to IT asking them to remove this export since it's unnecessary and ruinous to us.
We need to unset PYTHONPATH or clear the python/2.7.5 related paths, but we also need it for GDAL on the cluster since it's installed in a different place. The order that works is: load Python, unset PYTHONPATH, then load GDAL:
> module purge
> module load python/2.7.5
> unset PYTHONPATH
> module load gdal/1.11.1
> source ~/venv/yatsm/bin/activate
(yatsm) > python -c "import glmnet; print(glmnet.__file__)"
/usr3/graduate/ceholden/venvs/yatsm/lib/python2.7/site-packages/glmnet/__init__.pyc
@valpasq keep that environment setup script just as is. You need to leave Python in there, but add in unset PYTHONPATH right after it as shown above.
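To see the precedence issue for yourself, here is a rough sketch (Python 3, not the 2.7 used on the cluster). It starts a fresh interpreter with and without PYTHONPATH and compares the resulting sys.path. The helper name `child_sys_path` and the temporary directory standing in for the system site-packages are both made up for the demonstration.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(pythonpath=None):
    """Start a fresh interpreter and return its sys.path as a list."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # same effect as `unset PYTHONPATH`
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    out = subprocess.check_output(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
    )
    return json.loads(out.decode())


if __name__ == "__main__":
    extra = tempfile.mkdtemp()  # stand-in for the system site-packages dir
    with_pp = child_sys_path(extra)
    without_pp = child_sys_path()
    print("position with PYTHONPATH set:", with_pp.index(extra))
    print("present after unsetting:", extra in without_pp)
```

Entries from PYTHONPATH land ahead of every site-packages directory, including the virtualenv's, which is why a package installed into the virtualenv could not shadow the system copy.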
As a quick PSA/FYI for the future, it looks like using ipython over python re-orders your sys.path somehow. I was testing the path order and location of glmnet using ipython, so it was fooling me.
Ah ha! I've been googling the heck out of virtual environments, went through the process of creating a new environment with virtualenv using the --no-site-packages option, but was still getting the site packages path for glmnet, which seemed off....I turned up this StackOverflow post and was JUST about to ask you about possible confusion in the PYTHONPATH!
Thanks for the specifics on how to fix. I updated my environment set up script as suggested, and it looks like I'm getting glmnet from virtual environment:
[valpasq@geo ~]$ source ~/modules.sh
(venv)[valpasq@geo ~]$ python
Python 2.7.5 (default, Jun 24 2013, 23:27:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import glmnet
>>> glmnet.__file__
'/usr3/graduate/valpasq/venv/lib/python2.7/site-packages/glmnet/__init__.pyc'
>>>
Will start up my YATSM runs again, hopefully all is well, will let you know if anything else creeps up.
Thanks again!!! It was a good day to learn about virtual environments :)
Thanks for the link to that SO thread, Val. My Google-fu wasn't as strong as yours today.
IT replied and we now have a new module you can load that DOESN'T export PYTHONPATH: python/2.7.5_nopath
Awaiting confirmation from Paulo before closing.
I just tried using module load python/2.7.5_nopath and the path for glmnet appears to be correct.
But something is still not right. Still getting the same error that started all of this when I try to run YATSM:
13:39:42:DEBUG:66:config_parser.convert_config:Predicting using "GLMNET_LassoCV" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/glmnet_LassoCV_n50.pkl)
Traceback (most recent call last):
File "/usr3/graduate/valpasq/venv/bin/yatsm", line 8, in <module>
load_entry_point('yatsm==0.5.6-beta', 'console_scripts', 'yatsm')()
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 700, in __call__
return self.main(*args, **kwargs)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 680, in main
rv = self.invoke(ctx)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 873, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/core.py", line 508, in invoke
return callback(*args, **kwargs)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/click/decorators.py", line 16, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 50, in line
cfg = parse_config_file(config)
File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 145, in parse_config_file
return convert_config(cfg)
File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 69, in convert_config
cfg[pred_method]['pickle'])
File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/config_parser.py", line 150, in _unpickle_predictor
reg = joblib.load(pickle)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 425, in load
obj = unpickler.load()
File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1090, in load_global
klass = self.find_class(module, name)
File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py", line 1124, in find_class
__import__(module)
ImportError: No module named elastic_net_cv
I did notice I was running yatsm==0.5.6-beta. Tried to roll back to yatsm==0.5.4, still getting the exact same error.
Thoughts?
The version of YATSM doesn't matter in this issue. I can't reproduce the error using your virtual environment and using your config file (only altered to change the output directory).
Can you try ensuring that there is actually such a module? You can also just try loading the pickle directly in the console:
> module purge
> module load python/2.7.5_nopath
> source /usr3/graduate/valpasq/venv/bin/activate
> python
Python 2.7.5 (default, Jun 24 2013, 23:27:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import glmnet; print(glmnet.__file__)
/usr3/graduate/ceholden/venvs/yatsm/lib/python2.7/site-packages/glmnet/__init__.pyc
>>> glmnet.elastic_net_cv
<module 'glmnet.elastic_net_cv' from '/usr3/graduate/ceholden/venvs/yatsm/lib/python2.7/site-packages/glmnet/elastic_net_cv.pyc'>
>>> pkl = '/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/glmnet_LassoCV_n50.pkl'
>>> import sklearn.externals.joblib as jl
>>> jl.load(pkl)
<glmnet.elastic_net_cv.LassoCV object at 0x7f12171e2d50>
This set of commands goes all the way from no Python being loaded, so if it doesn't work I might have no clue what's going on.
When I ran yatsm line against your config with your virtual environment, it passed that point without issue. I did, however, get an error in the _cyprep module because you have yatsm==0.5.4 installed and that had a bug where it couldn't use a list of min/max values. Upgrade to yatsm==0.5.5 or just pull the latest for the beta of v0.5.6 if you like.
Maybe you had both Python modules loaded?
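For background on why the traceback ends in an ImportError: a pickle records the module path and class name of each object rather than the class definition itself, so loading it re-imports that module in whatever environment is active. The sketch below (Python 3) shows this with a throwaway module. `demo_model` and `LassoLike` are hypothetical stand-ins, not the real glmnet.elastic_net_cv or LassoCV.

```python
import importlib
import os
import pickle
import sys
import tempfile


def pickle_then_hide_module():
    """Pickle an object, then make its defining module unimportable.

    Returns the pickled bytes. `demo_model` and `LassoLike` are hypothetical
    names standing in for glmnet.elastic_net_cv and LassoCV.
    """
    module_dir = tempfile.mkdtemp()
    with open(os.path.join(module_dir, "demo_model.py"), "w") as fh:
        fh.write("class LassoLike(object):\n    alpha = 1.0\n")

    sys.path.insert(0, module_dir)
    importlib.invalidate_caches()
    demo_model = importlib.import_module("demo_model")
    payload = pickle.dumps(demo_model.LassoLike())

    # Simulate an environment where the module is not installed.
    sys.path.remove(module_dir)
    del sys.modules["demo_model"]
    importlib.invalidate_caches()
    return payload


if __name__ == "__main__":
    payload = pickle_then_hide_module()
    print(b"demo_model" in payload)  # the module name is stored in the pickle
    try:
        pickle.loads(payload)
    except ImportError as exc:
        print("unpickling failed:", exc)
```

So if the glmnet that Python finds first lacks the module the pickle was written against, loading fails even though nothing in the calling script imports glmnet directly.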
Stupid mistake on my part!
I have a shell script for running yatsm line, and I was loading the regular python/2.7.5 module instead of the updated "nopath" version at the start of this script.
Took out the duplicate module loading, problem solved!
Thanks for your patience...I definitely learned a lot about virtual environments and modules today!
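One way to catch this kind of slip early is a small preflight check at the top of a job. This is only a sketch: the function name `environment_problems` is invented here, and it relies on the VIRTUAL_ENV variable that virtualenv's activate script exports. On this cluster the GDAL module may add to PYTHONPATH on purpose, so treat the first message as a prompt to look, not as a hard failure.

```python
import os


def environment_problems(environ):
    """Return a list of human-readable problems found in `environ`."""
    problems = []
    pythonpath = environ.get("PYTHONPATH", "")
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    if entries:
        problems.append(
            "PYTHONPATH is set and takes precedence over the virtualenv: "
            + ", ".join(entries)
        )
    if not environ.get("VIRTUAL_ENV"):
        problems.append("VIRTUAL_ENV is not set; no virtualenv appears to be active")
    return problems


if __name__ == "__main__":
    for problem in environment_problems(os.environ):
        print("WARNING:", problem)
```

Taking the environment as an argument, instead of reading os.environ inside the function, keeps the check easy to try against made-up values.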
I changed my python to the new "nopath", made sure that all of the required packages were installed in my virtual environment (some were being called from the global env previously) and ran the script again, and everything is working nicely now. Thank you Chris.
Sweet!
Was running YATSM for p012r031, everything was going fine until the 67th job, when I suddenly got the following error message:
I tried changing to GLMNET_Lasso20, then got this message:
Oddly enough, when I tried the sklearn Lasso20, things seem to be running alright, though I do get a warning about convergence:
10:03:31:DEBUG:66:config_parser.convert_config:Predicting using "Lasso20" pickle specified from configuration file (/usr3/graduate/valpasq/Documents/yatsm/yatsm/regression/pickles/sklearn_Lasso20.pkl)
10:03:31:DEBUG:93:cache.test_cache:Attempt reading in from cache directory?: True
10:03:31:DEBUG:95:cache.test_cache:Attempt writing to cache directory?: True
10:03:31:INFO:81:line.line:Job 0 of 5 - using config file /projectnb/landsat/projects/Massachusetts/p012r031/p012r031_config.yaml
10:03:31:DEBUG:96:line.line:Responsible for lines: [ 0 5 10 ..., 7140 7145 7150]
10:03:31:DEBUG:125:line.line:Already processed line 0
10:03:31:DEBUG:125:line.line:Already processed line 5
10:03:31:DEBUG:125:line.line:Already processed line 10
10:03:31:DEBUG:125:line.line:Already processed line 15
10:03:31:DEBUG:125:line.line:Already processed line 20
10:03:31:DEBUG:125:line.line:Already processed line 25
10:03:31:DEBUG:125:line.line:Already processed line 30
10:03:31:DEBUG:125:line.line:Already processed line 35
10:03:31:DEBUG:125:line.line:Already processed line 40
10:03:31:DEBUG:125:line.line:Already processed line 45
10:03:31:DEBUG:125:line.line:Already processed line 50
10:03:31:DEBUG:125:line.line:Already processed line 55
10:03:31:DEBUG:125:line.line:Already processed line 60
10:03:31:DEBUG:125:line.line:Already processed line 65
10:03:31:DEBUG:128:line.line:Running line 70
10:03:32:DEBUG:158:reader.read_line:Read in Y from cache file
/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/sklearn/linear_model/coordinate_descent.py:444: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations
ConvergenceWarning)
10:07:41:DEBUG:192:line.line: Saving YATSM output to /projectnb/landsat/projects/Massachusetts/p012r031/images/YATSM/yatsm_r70.npz
10:07:41:DEBUG:199:line.line:Line 70 took 249.888620138s to run
10:07:41:DEBUG:128:line.line:Running line 75
10:07:42:DEBUG:158:reader.read_line:Read in Y from cache file
Did something change with the pickles? I do find it really strange that my first 60+ jobs ran fine before I started getting errors, which makes me think this is not something to do with my copy of YATSM (since I didn't do a pull or anything that should change those files). Since the last lines of the errors have to do with File "/project/earth/packages/Python-2.7.5/lib/python2.7/pickle.py" I'm wondering if this maybe has something to do with site packages?
Any insight would be much appreciated--was hoping to run all 5 MA scenes this week,
PS - My log files are a mess, but the first few runs are in /projectnb/landsat/projects/Massachusetts/p012r031/images/, more recent runs in /projectnb/landsat/projects/Massachusetts/p012r031.