kaz-Anova / StackNet

StackNet is a computational, scalable and analytical Meta modelling framework
MIT License

Exception in thread "main" java.lang.reflect.InvocationTargetException #25

Closed ahbon123 closed 7 years ago

ahbon123 commented 7 years ago

I tried the StackNet example from CMD on Windows and ran into the following problem. @kaz-Anova, or anyone else, could you give me tips on how to fix it? Thanks a lot.

C:\Users\User>java -jar StackNet.jar train task=classification sparse=false has_head=false model=model train_file=train_iris.csv test_file=test_iris.csv test_target=true params=params.txt verbose=true threads=4 metric=logloss stackdata=false
parameter name : task value : classification
parameter name : sparse value : false
parameter name : has_head value : false
parameter name : model value : model
parameter name : train_file value : train_iris.csv
parameter name : test_file value : test_iris.csv
parameter name : test_target value : true
parameter name : params value : params.txt
parameter name : verbose value : true
parameter name : threads value : 4
parameter name : metric value : logloss
parameter name : stackdata value : false
Completed: 4.04 %
Completed: 8.08 %
Completed: 12.12 %
Completed: 16.16 %
Completed: 20.20 %
Completed: 24.24 %
Completed: 28.28 %
Completed: 32.32 %
Completed: 36.36 %
Completed: 40.40 %
Completed: 44.44 %
Completed: 48.48 %
Completed: 52.53 %
Completed: 56.57 %
Completed: 60.61 %
Completed: 64.65 %
Completed: 68.69 %
Completed: 72.73 %
Completed: 76.77 %
Completed: 80.81 %
Completed: 84.85 %
Completed: 88.89 %
Completed: 92.93 %
Completed: 96.97 %
Loaded File: train_iris.csv
Total rows in the file: 99
Total columns in the file: 5
Weighted variable : -1 counts: 0
Int Id variable : -1 str id: -1 counts: 0
Target Variables : 1 values : [0]
Actual columns number : 4
Number of Skipped rows : 0
Actual Rows (removing the skipped ones) : 99
Loaded dense train data with 99 and columns 4
loaded data in : 0.100000
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.IllegalStateException: File params.txt failed to import at bufferreader params.txt (系统找不到指定的文件。 [The system cannot find the file specified.])
    at io.input.StackNet_Configuration(input.java:1650)
    at stacknetrun.runstacknet.main(runstacknet.java:441)

goldentom42 commented 7 years ago

Hi ahbon123, I commented on your issue in https://github.com/kaz-Anova/StackNet/issues/5. I am copying my comment here so that we can follow up in a separate issue.

From the stack trace, I would assume StackNet is unable to locate the params.txt file. Is the file in the same directory you launch StackNet from?
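A relative name such as params.txt is resolved against the directory the `java` command is started from. The following is a minimal sketch of a check for that, assuming the file names from the command in the report; the helper `missing_files` is hypothetical and not part of StackNet.

```python
from pathlib import Path


def missing_files(names, base_dir="."):
    """Return the names that do not exist as regular files under base_dir."""
    base = Path(base_dir)
    return [name for name in names if not (base / name).is_file()]


if __name__ == "__main__":
    # File names taken from the command line in the original report.
    wanted = ["StackNet.jar", "params.txt", "train_iris.csv", "test_iris.csv"]
    print("Working directory:", Path.cwd())
    for name in missing_files(wanted):
        print("Not found here:", name)
```

Running it from the same prompt used for `java -jar StackNet.jar ...` lists any of the four files that the current directory does not contain.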

ahbon123 commented 7 years ago

Thank you for helping me, @goldentom42. Yes, I created params.txt in the same directory as the iris train and test files:

`RandomForestClassifier bootsrap:false max_tree_size:-1 cut_off_subsample:1.0 feature_subselection:1.0 rounding:6 estimators:100 offset:0.00001 max_depth:6 max_features:0.4 min_leaf:2.0 min_split:5.0 Objective:ENTROPY row_subsample:0.95 seed:1 threads:1 bags:1 verbose:false

XgboostClassifier booster:gbtree num_round:1000 eta:0.005 max_leaves:0 gamma:1. max_depth:5 min_child_weight:1.0 subsample:0.9 colsample_bytree:0.7 colsample_bylevel:1.0 lambda:1.0 alpha:1.0 seed:1 threads:1 bags:1 verbose:false

NaiveBayesClassifier usescale:True Shrinkage:0.1 seed:1 threads:1 verbose:false`

goldentom42 commented 7 years ago

This looks strange. I cannot reproduce the error with the same param file, even though there is an error in the RandomForestClassifier line: bootsrap should be written bootstrap (I just updated my faulty comment in issue #5).

Also, I'm wondering, would it be possible to get a plain English error message instead of ideograms (or a translation)? I believe that would help pinpoint the issue.

Finally could you test without XgboostClassifier?

I must admit I'm a bit puzzled by this problem... Thanks.

ahbon123 commented 7 years ago

Well, I have fixed the typos, but I get the same error.

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.IllegalStateException: File params.txt failed to import at bufferreader params.txt (系统找不到指定的文件。 [The system cannot find the file specified.])
    at io.input.StackNet_Configuration(input.java:1650)
    at stacknetrun.runstacknet.main(runstacknet.java:441)
    ... 5 more

kaz-Anova commented 7 years ago

@ahbon123 can you send your params file and a link to iris.csv (or the file itself) to kazanovassoftware@gmail.com? There might be some character inside params.txt that is illegal for the encoding. I will try to check that if you send me the files.
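One way to look for such characters before sending the file is to scan its raw bytes. This is a minimal sketch, not StackNet code: the helpers `find_non_ascii` and `has_utf8_bom` are hypothetical names, and the thread does not establish whether StackNet's reader actually fails on non-ASCII bytes or on a byte order mark.

```python
def find_non_ascii(data: bytes):
    """Return (line_number, column, byte_value) for each non-ASCII byte in data."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits


def has_utf8_bom(data: bytes) -> bool:
    """True when the data starts with the UTF-8 byte order mark."""
    return data.startswith(b"\xef\xbb\xbf")


if __name__ == "__main__":
    # Read the file as raw bytes so no decoding step can hide a problem.
    with open("params.txt", "rb") as handle:
        raw = handle.read()
    print("UTF-8 BOM present:", has_utf8_bom(raw))
    for line_no, col, byte in find_non_ascii(raw):
        print(f"line {line_no}, column {col}: byte 0x{byte:02x}")
```

An empty report only rules out non-ASCII bytes; it says nothing about other causes such as the file not being where the program looks for it.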

ahbon123 commented 7 years ago

Thanks a lot, @kaz-Anova. I've sent you an email, please check.

kaz-Anova commented 7 years ago

It runs with no problem for me... iris_logloss

  1. Can you tell me your Java version?
  2. Make sure you have the latest StackNet.jar from the repository.
  3. Can you try removing xgboost and rerunning?

ahbon123 commented 7 years ago

The Iris example works well now, thanks. I'm trying a new example, and a model-fitting error occurred.

C:\Users\User\StackNet>java -Xmx12048m -jar StackNet.jar train task=regression sparse=true has_head=false output_name=datasettwo model=model2 pred_file=pred2.csv train_file=dataset2_train.txt test_file=dataset2_test.txt test_target=false params=dataset2_params.txt verbose=true threads=1 metric=mae stackdata=false seed=1 folds=4 bins=3 parameter name : task value : regression parameter name : sparse value : true parameter name : has_head value : false parameter name : output_name value : datasettwo parameter name : model value : model2 parameter name : pred_file value : pred2.csv parameter name : train_file value : dataset2_train.txt parameter name : test_file value : dataset2_test.txt parameter name : test_target value : false parameter name : params value : dataset2_params.txt parameter name : verbose value : true parameter name : threads value : 1 parameter name : metric value : mae parameter name : stackdata value : false parameter name : seed value : 1 parameter name : folds value : 4 parameter name : bins value : 3 [4793209, 88528] Loaded File: dataset2_train.txt Total rows in the file: 88528 Total columns in the file: undetrmined-Sparse Number of elements : 4793209 The filedataset2_train.txt was loaded successfully with : Rows : 88528 Columns (excluding target) : 1 Delimeter was : Loaded sparse train data with 88528 and columns 58 loaded data in : 7.199000 Binning parameters [-0.0131, <=-0.0131] [0.0247, <=0.0247] [0.4187, <=0.4187] Level: 1 dimensionality: 12 Starting cross validation Fitting model : 0 mae : 0.05345934807415479 Fitting model : 1 mae : 0.0531896123303922 Fitting model : 2 mae : 0.053078546424724905 Fitting model : 3 mae : 0.053564587280493355 Fitting model : 4 mae : 0.05400600373147023 Fitting model : 5 mae : 0.05392384110435197 Fitting model : 6 mae : 0.05410711789920872 Fitting model : 7 mae : 0.054059103067013434 Fitting model : 8 mae : 0.05390186942529881 Fitting model : 9 Exception in thread "main" 
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.IllegalStateException: Tree is not fitted
    at ml.Bagging.scoringhelperbagv2.<init>(scoringhelperbagv2.java:109)
    at ml.Bagging.BaggingRegressor.predict2d(BaggingRegressor.java:669)
    at ml.Bagging.BaggingRegressor.predict_proba(BaggingRegressor.java:1875)
    at ml.stacknet.StackNetRegressor.fit(StackNetRegressor.java:3065)
    at stacknetrun.runstacknet.main(runstacknet.java:522)
    ... 5 more

kaz-Anova commented 7 years ago

What is the 10th model in your parameter file? Can you paste it here? Add verbose:true to that model to see what causes the failure.

ahbon123 commented 7 years ago

Here it is. Thanks! dataset2_params.txt

kaz-Anova commented 7 years ago

You need to install Python, and it must be found when you type python on the command line (see here). Sklearn needs to be version 0.18 or higher.

If you don't have Python on PATH, you could also install Anaconda (Python 2.7) from here and make certain you select the option to add Python to PATH when the menu appears.
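Both conditions can be checked from a script. The following is a minimal sketch, assuming (as the comment above implies) that StackNet starts whatever a bare `python` command resolves to; the helper `module_version` is a hypothetical name.

```python
import shutil
import subprocess


def module_version(python_exe, module):
    """Ask the given interpreter for a module's version; None if the import fails."""
    code = (
        "import importlib; "
        f"m = importlib.import_module({module!r}); "
        "print(getattr(m, '__version__', 'unknown'))"
    )
    result = subprocess.run(
        [python_exe, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # shutil.which reports the executable a bare `python` command would run.
    exe = shutil.which("python")
    if exe is None:
        print("No python found on PATH")
    else:
        print("python resolves to:", exe)
        print("sklearn version seen by it:", module_version(exe, "sklearn"))
```

The check runs the import inside the interpreter found on PATH rather than in the interpreter executing the script, since the two can differ when several Python installations are present.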

ahbon123 commented 7 years ago

Again, thanks for your help! I installed Python 3 in StackNet\lib\python; scipy and sklearn are already installed in Anaconda3's lib. I think there may be a problem with the 9th model in params.txt on my computer. Does that make sense?

`C:\Users\User\StackNet\lib\python>python -V
Python 3.6.2

C:\Users\User\StackNet\lib\python>pip install scipy
Requirement already satisfied: scipy in d:\anaconda3\lib\site-packages

C:\Users\User\StackNet\lib\python>pip install sklearn
Requirement already satisfied: sklearn in d:\anaconda3\lib\site-packages
Requirement already satisfied: scikit-learn in d:\anaconda3\lib\site-packages (from sklearn)

C:\Users\User\StackNet\lib\python>python
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:14:34) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

`

verbose:true, results C:\Users\User\StackNet>java -Xmx12048m -jar StackNet.jar train task=regression sparse=true has_head=false output_name=datasettwo model=model2 pred_file=pred2.csv train_file=dataset2_train.txt test_file=dataset2_test.txt test_target=false params=dataset2_params.txt verbose=true threads=1 metric=mae stackdata=false seed=1 folds=4 bins=3 parameter name : task value : regression parameter name : sparse value : true parameter name : has_head value : false parameter name : output_name value : datasettwo parameter name : model value : model2 parameter name : pred_file value : pred2.csv parameter name : train_file value : dataset2_train.txt parameter name : test_file value : dataset2_test.txt parameter name : test_target value : false parameter name : params value : dataset2_params.txt parameter name : verbose value : true parameter name : threads value : 1 parameter name : metric value : mae parameter name : stackdata value : false parameter name : seed value : 1 parameter name : folds value : 4 parameter name : bins value : 3 [4793209, 88528] Loaded File: dataset2_train.txt Total rows in the file: 88528 Total columns in the file: undetrmined-Sparse Number of elements : 4793209 The filedataset2_train.txt was loaded successfully with : Rows : 88528 Columns (excluding target) : 1 Delimeter was : Loaded sparse train data with 88528 and columns 58 loaded data in : 7.033000 Binning parameters [-0.0131, <=-0.0131] [0.0247, <=0.0247] [0.4187, <=0.4187] Level: 1 dimensionality: 12 Starting cross validation Fitting model : 0 mae : 0.05345934807415479 Fitting model : 1 mae : 0.0531896123303922 Fitting model : 2 mae : 0.053078546424724905 Fitting model : 3 mae : 0.053564587280493355 Fitting model : 4 mae : 0.05400600373147023 Fitting model : 5 mae : 0.05393126411126578 Fitting model : 6 mae : 0.05410711789920872 Fitting model : 7 mae : 0.054059103067013434 Fitting model : 8 iteration: 1 iteration: 2 iteration: 3 iteration: 4 iteration: 5 iteration: 6 iteration: 
7 iteration: 8 iteration: 9 iteration: 10 iteration: 11 iteration: 12 iteration: 13 iteration: 14 iteration: 15 iteration: 16 iteration: 17 iteration: 18 iteration: 19 iteration: 20 iteration: 21 iteration: 22 iteration: 23 iteration: 24 iteration: 25 iteration: 26 iteration: 27 iteration: 28 iteration: 29 iteration: 30 iteration: 31 iteration: 32 iteration: 33 iteration: 34 iteration: 35 iteration: 36 iteration: 37 iteration: 38 iteration: 39 iteration: 40 iteration: 41 iteration: 42 iteration: 43 iteration: 44 iteration: 45 iteration: 46 iteration: 47 iteration: 48 iteration: 49 iteration: 50 mae : 0.05390186942529881 Fitting model : 9 Exception in thread "main" java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58) Caused by: java.lang.IllegalStateException: Tree is not fitted at ml.Bagging.scoringhelperbagv2.<init>(scoringhelperbagv2.java:109) at ml.Bagging.BaggingRegressor.predict2d(BaggingRegressor.java:669) at ml.Bagging.BaggingRegressor.predict_proba(BaggingRegressor.java:1875) at ml.stacknet.StackNetRegressor.fit(StackNetRegressor.java:3065) at stacknetrun.runstacknet.main(runstacknet.java:522) ... 5 more

verbose:true, result C:\Users\User\StackNet>java -Xmx12048m -jar StackNet.jar train task=regression sparse=true has_head=false output_name=datasettwo model=model2 pred_file=pred2.csv train_file=dataset2_train.txt test_file=dataset2_test.txt test_target=false params=dataset2_params.txt verbose=true threads=1 metric=mae stackdata=false seed=1 folds=4 bins=3 parameter name : task value : regression parameter name : sparse value : true parameter name : has_head value : false parameter name : output_name value : datasettwo parameter name : model value : model2 parameter name : pred_file value : pred2.csv parameter name : train_file value : dataset2_train.txt parameter name : test_file value : dataset2_test.txt parameter name : test_target value : false parameter name : params value : dataset2_params.txt parameter name : verbose value : true parameter name : threads value : 1 parameter name : metric value : mae parameter name : stackdata value : false parameter name : seed value : 1 parameter name : folds value : 4 parameter name : bins value : 3 [4793209, 88528] Loaded File: dataset2_train.txt Total rows in the file: 88528 Total columns in the file: undetrmined-Sparse Number of elements : 4793209 The filedataset2_train.txt was loaded successfully with : Rows : 88528 Columns (excluding target) : 1 Delimeter was : Loaded sparse train data with 88528 and columns 58 loaded data in : 7.086000 Binning parameters [-0.0131, <=-0.0131] [0.0247, <=0.0247] [0.4187, <=0.4187] Level: 1 dimensionality: 12 Starting cross validation Fitting model : 0 mae : 0.05345934807415479 Fitting model : 1 mae : 0.0531896123303922 Fitting model : 2 mae : 0.053078546424724905 Fitting model : 3 mae : 0.053564587280493355 Fitting model : 4 mae : 0.05400600373147023 Fitting model : 5 mae : 0.05394752263547967 Fitting model : 6 mae : 0.05410711789920872 Fitting model : 7 mae : 0.054059103067013434 Fitting model : 8 mae : 0.05390186942529881 Fitting model : 9 Exception in thread "main" 
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.IllegalStateException: Tree is not fitted
    at ml.Bagging.scoringhelperbagv2.<init>(scoringhelperbagv2.java:109)
    at ml.Bagging.BaggingRegressor.predict2d(BaggingRegressor.java:669)
    at ml.Bagging.BaggingRegressor.predict_proba(BaggingRegressor.java:1875)
    at ml.stacknet.StackNetRegressor.fit(StackNetRegressor.java:3065)
    at stacknetrun.runstacknet.main(runstacknet.java:522)
    ... 5 more

kaz-Anova commented 7 years ago

You don't need to install Python inside lib.

Can you open a terminal in any location and type this?

python, then press ENTER, and tell me if it is recognised (or what it prints). If it is not, you need to make certain Python is in PATH. PATH is an environment variable.

You may find some of the threads here useful.

ahbon123 commented 7 years ago

After setting verbose:true on the 10th model, here is the result:

C:\Users\User\StackNet>java -Xmx12048m -jar StackNet.jar train task=regression sparse=true has_head=false output_name=datasettwo model=model2 pred_file=pred2.csv train_file=dataset2_train.txt test_file=dataset2_test.txt test_target=false params=dataset2_params.txt verbose=true threads=1 metric=mae stackdata=false seed=1 folds=4 bins=3
parameter name : task value : regression
parameter name : sparse value : true
parameter name : has_head value : false
parameter name : output_name value : datasettwo
parameter name : model value : model2
parameter name : pred_file value : pred2.csv
parameter name : train_file value : dataset2_train.txt
parameter name : test_file value : dataset2_test.txt
parameter name : test_target value : false
parameter name : params value : dataset2_params.txt
parameter name : verbose value : true
parameter name : threads value : 1
parameter name : metric value : mae
parameter name : stackdata value : false
parameter name : seed value : 1
parameter name : folds value : 4
parameter name : bins value : 3
[4793209, 88528]
Loaded File: dataset2_train.txt
Total rows in the file: 88528
Total columns in the file: undetrmined-Sparse
Number of elements : 4793209
The filedataset2_train.txt was loaded successfully with :
Rows : 88528
Columns (excluding target) : 1
Delimeter was :
Loaded sparse train data with 88528 and columns 58
loaded data in : 7.333000
Binning parameters
[-0.0131, <=-0.0131]
[0.0247, <=0.0247]
[0.4187, <=0.4187]
Level: 1 dimensionality: 12
Starting cross validation
Fitting model : 0
mae : 0.05345934807415479
Fitting model : 1
mae : 0.0531896123303922
Fitting model : 2
mae : 0.053078546424724905
Fitting model : 3
mae : 0.053564587280493355
Fitting model : 4
mae : 0.05400600373147023
Fitting model : 5
mae : 0.053925048983178896
Fitting model : 6
mae : 0.05410711789920872
Fitting model : 7
mae : 0.054059103067013434
Fitting model : 8
mae : 0.05390186942529881
Fitting model : 9
Traceback (most recent call last):
  File "lib\python\SklearnRandomForestRegressor.py", line 50, in <module>
    from sklearn.ensemble import RandomForestRegressor
ImportError: No module named sklearn.ensemble
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.IllegalStateException: Tree is not fitted
    at ml.Bagging.scoringhelperbagv2.<init>(scoringhelperbagv2.java:109)
    at ml.Bagging.BaggingRegressor.predict2d(BaggingRegressor.java:669)
    at ml.Bagging.BaggingRegressor.predict_proba(BaggingRegressor.java:1875)
    at ml.stacknet.StackNetRegressor.fit(StackNetRegressor.java:3065)
    at stacknetrun.runstacknet.main(runstacknet.java:522)
    ... 5 more

kaz-Anova commented 7 years ago

What version of sklearn do you have?

Do this in a terminal:

  1. Type python and press ENTER
  2. Type import sklearn and press ENTER
  3. Type print(sklearn.__version__) and press ENTER

Tell me the results.
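The same check can be written as a short script that also compares the result against the 0.18 minimum mentioned earlier in the thread. This is a minimal sketch; `version_tuple` and `meets_minimum` are hypothetical helper names, and only the leading numeric part of the version string is compared.

```python
import re


def version_tuple(version: str):
    """Extract the leading numeric components, e.g. '0.19.1' -> (0, 19, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum=(0, 18)) -> bool:
    """True when the version is at least the given minimum."""
    return version_tuple(version) >= minimum


if __name__ == "__main__":
    try:
        import sklearn
    except ImportError as exc:
        # This is the situation reported in the traceback above.
        print("sklearn cannot be imported by this interpreter:", exc)
    else:
        status = "OK" if meets_minimum(sklearn.__version__) else "too old"
        print("sklearn", sklearn.__version__, status)
```

It has to be run with the same `python` that StackNet would start, otherwise it reports on a different installation.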

ahbon123 commented 7 years ago

There is no such module under my Python 2.7 (there is under Anaconda3), so I tried to install sklearn for Python 2.7. NumPy installed successfully, but SciPy and sklearn did not. Here is what happened. It looks very complicated, or maybe I was going in the wrong direction.

`C:\Python27\Scripts>python

Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:53:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> import sklearn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\sklearn\__init__.py", line 134, in <module>
    from .base import clone
  File "C:\Python27\lib\site-packages\sklearn\base.py", line 10, in <module>
    from scipy import sparse
ImportError: No module named scipy

>>> import numpy

>>> import scipy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named scipy

>>> import sklearn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\sklearn\__init__.py", line 133, in <module>
    from . import __check_build
ImportError: cannot import name __check_build `

I tried to install scipy:

`C:\Python27\Scripts>pip install scipy
Collecting scipy
Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x0000000004CE0BE0>, 'Connection to pypi.python.org timed out. (connect timeout=15)')': /simple/scipy/
Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x0000000004CE09E8>, 'Connection to pypi.python.org timed out. (connect timeout=15)')': /simple/scipy/
Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x0000000004CE0CC0>, 'Connection to pypi.python.org timed out. (connect timeout=15)')': /simple/scipy/
Using cached scipy-0.19.1.tar.gz
Installing collected packages: scipy
Running setup.py install for scipy ... error
Complete output from command c:\python27\python.exe -u -c "import setuptools, tokenize;__file__='c:\users\zhaod\appdata\local\temp\pip-build-plklmp\scipy\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record c:\users\zhaod\appdata\local\temp\pip-kgznzp-record\install-record.txt --single-version-externally-managed --compile:

Note: if you need reliable uninstall behavior, then install
with pip instead of using `setup.py install`:

  - `pip install .`       (from a git repo or downloaded source
                           release)
  - `pip install scipy`   (last SciPy release on PyPI)

lapack_opt_info:
lapack_mkl_info:
  libraries mkl_rt not found in ['c:\\python27\\lib', 'C:\\', 'c:\\python27\\libs']
  NOT AVAILABLE

openblas_lapack_info:
  libraries openblas not found in ['c:\\python27\\lib', 'C:\\', 'c:\\python27\\libs']
  NOT AVAILABLE

atlas_3_10_threads_info:
Setting PTATLAS=ATLAS
c:\python27\lib\site-packages\numpy\distutils\system_info.py:1051: UserWarning: Specified path C:\projects\numpy-wheels\windows-wheel-builder\atlas-builds\atlas-3.11.38-sse2-64\lib is invalid.
  pre_dirs = system_info.get_paths(self, section, key)
<class 'numpy.distutils.system_info.atlas_3_10_threads_info'>
  NOT AVAILABLE

atlas_3_10_info:
<class 'numpy.distutils.system_info.atlas_3_10_info'>
  NOT AVAILABLE

atlas_threads_info:
Setting PTATLAS=ATLAS
<class 'numpy.distutils.system_info.atlas_threads_info'>
  NOT AVAILABLE

atlas_info:
<class 'numpy.distutils.system_info.atlas_info'>
  NOT AVAILABLE

c:\python27\lib\site-packages\numpy\distutils\system_info.py:572: UserWarning:
    Atlas (http://math-atlas.sourceforge.net/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [atlas]) or by setting
    the ATLAS environment variable.
  self.calc_info()
lapack_info:
  libraries lapack not found in ['c:\\python27\\lib', 'C:\\', 'c:\\python27\\libs']
  NOT AVAILABLE

c:\python27\lib\site-packages\numpy\distutils\system_info.py:572: UserWarning:
    Lapack (http://www.netlib.org/lapack/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [lapack]) or by setting
    the LAPACK environment variable.
  self.calc_info()
lapack_src_info:
  NOT AVAILABLE

c:\python27\lib\site-packages\numpy\distutils\system_info.py:572: UserWarning:
    Lapack (http://www.netlib.org/lapack/) sources not found.
    Directories to search for the sources can be specified in the
    numpy/distutils/site.cfg file (section [lapack_src]) or by setting
    the LAPACK_SRC environment variable.
  self.calc_info()
  NOT AVAILABLE

Running from scipy source directory.
non-existing path in 'scipy\\integrate': 'quadpack.h'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\users\User\appdata\local\temp\pip-build-plklmp\scipy\setup.py", line 416, in <module>
    setup_package()
  File "c:\users\User\appdata\local\temp\pip-build-plklmp\scipy\setup.py", line 412, in setup_package
    setup(**metadata)
  File "c:\python27\lib\site-packages\numpy\distutils\core.py", line 135, in setup
    config = configuration()
  File "c:\users\User\appdata\local\temp\pip-build-plklmp\scipy\setup.py", line 336, in configuration
    config.add_subpackage('scipy')
  File "c:\python27\lib\site-packages\numpy\distutils\misc_util.py", line 1029, in add_subpackage
    caller_level = 2)
  File "c:\python27\lib\site-packages\numpy\distutils\misc_util.py", line 998, in get_subpackage
    caller_level = caller_level + 1)
  File "c:\python27\lib\site-packages\numpy\distutils\misc_util.py", line 935, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "scipy\setup.py", line 15, in configuration
    config.add_subpackage('linalg')
  File "c:\python27\lib\site-packages\numpy\distutils\misc_util.py", line 1029, in add_subpackage
    caller_level = 2)
  File "c:\python27\lib\site-packages\numpy\distutils\misc_util.py", line 998, in get_subpackage
    caller_level = caller_level + 1)
  File "c:\python27\lib\site-packages\numpy\distutils\misc_util.py", line 935, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "scipy\linalg\setup.py", line 20, in configuration
    raise NotFoundError('no lapack/blas resources found')
numpy.distutils.system_info.NotFoundError: no lapack/blas resources found

----------------------------------------

Command "c:\python27\python.exe -u -c "import setuptools, tokenize;__file__='c:\users\zhaod\appdata\local\temp\pip-build-plklmp\scipy\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record c:\users\User\appdata\local\temp\pip-kgznzp-record\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in c:\users\User\appdata\local\temp\pip-build-plklmp\scipy`

When I downloaded scipy and tried to install it:

`C:\Users\User\Desktop\scipy-0.19.1\scipy-0.19.1>python setup.py install

Note: if you need reliable uninstall behavior, then install with pip instead of using setup.py install:

lapack_opt_info: lapack_mkl_info: libraries mkl_rt not found in ['C:\Python27\lib', 'C:\', 'C:\Python27\libs'] NOT AVAILABLE

openblas_lapack_info: libraries openblas not found in ['C:\Python27\lib', 'C:\', 'C:\Python27\libs'] NOT AVAILABLE

atlas_3_10_threads_info: Setting PTATLAS=ATLAS C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1051: UserWarning: Specified path C:\projects\numpy-wheels\windows-wheel-builder\atlas-builds\atlas-3.11.38-sse2-64\lib is invalid. pre_dirs = system_info.get_paths(self, section, key) <class 'numpy.distutils.system_info.atlas_3_10_threads_info'> NOT AVAILABLE

atlas_3_10_info: <class 'numpy.distutils.system_info.atlas_3_10_info'> NOT AVAILABLE

atlas_threads_info: Setting PTATLAS=ATLAS <class 'numpy.distutils.system_info.atlas_threads_info'> NOT AVAILABLE

atlas_info: <class 'numpy.distutils.system_info.atlas_info'> NOT AVAILABLE

C:\Python27\lib\site-packages\numpy\distutils\system_info.py:572: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. self.calc_info() lapack_info: libraries lapack not found in ['C:\Python27\lib', 'C:\', 'C:\Python27\libs'] NOT AVAILABLE

C:\Python27\lib\site-packages\numpy\distutils\system_info.py:572: UserWarning: Lapack (http://www.netlib.org/lapack/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [lapack]) or by setting the LAPACK environment variable. self.calc_info() lapack_src_info: NOT AVAILABLE

C:\Python27\lib\site-packages\numpy\distutils\system_info.py:572: UserWarning: Lapack (http://www.netlib.org/lapack/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [lapack_src]) or by setting the LAPACK_SRC environment variable. self.calc_info() NOT AVAILABLE

Running from scipy source directory.
non-existing path in 'scipy\\integrate': 'quadpack.h'
Traceback (most recent call last):
  File "setup.py", line 416, in <module>
    setup_package()
  File "setup.py", line 412, in setup_package
    setup(**metadata)
  File "C:\Python27\lib\site-packages\numpy\distutils\core.py", line 135, in setup
    config = configuration()
  File "setup.py", line 336, in configuration
    config.add_subpackage('scipy')
  File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 1029, in add_subpackage
    caller_level = 2)
  File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 998, in get_subpackage
    caller_level = caller_level + 1)
  File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 935, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "scipy\setup.py", line 15, in configuration
    config.add_subpackage('linalg')
  File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 1029, in add_subpackage
    caller_level = 2)
  File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 998, in get_subpackage
    caller_level = caller_level + 1)
  File "C:\Python27\lib\site-packages\numpy\distutils\misc_util.py", line 935, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "scipy\linalg\setup.py", line 20, in configuration
    raise NotFoundError('no lapack/blas resources found')
numpy.distutils.system_info.NotFoundError: no lapack/blas resources found`

Using easy_install to install scipy

`C:\Python27\Scripts>easy_install scipy
Traceback (most recent call last):
  File "C:\Python27\Scripts\easy_install-script.py", line 11, in <module>
    load_entry_point('setuptools==36.4.0', 'console_scripts', 'easy_install')()
  File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 565, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2631, in load_entry_point
    return ep.load()
  File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2291, in load
    return self.resolve()
  File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2297, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "build\bdist.win-amd64\egg\setuptools\command\easy_install.py", line 47, in <module>
  File "build\bdist.win-amd64\egg\setuptools\sandbox.py", line 15, in <module>
ImportError: No module named py31compat

C:\Python27\Scripts>easy_install py31compat
Traceback (most recent call last):
  File "C:\Python27\Scripts\easy_install-script.py", line 11, in <module>
    load_entry_point('setuptools==36.4.0', 'console_scripts', 'easy_install')()
  File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 565, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2631, in load_entry_point
    return ep.load()
  File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2291, in load
    return self.resolve()
  File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2297, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "build\bdist.win-amd64\egg\setuptools\command\easy_install.py", line 47, in <module>
  File "build\bdist.win-amd64\egg\setuptools\sandbox.py", line 15, in <module>
ImportError: No module named py31compat`

goldentom42 commented 7 years ago

Hi ahbon123, so you have two Python installations, and it looks like Python 2.7 is used by default. How do you know which Python distribution runs when you type `python` in a Windows command prompt? Do you have some sort of alias, or do you change the Python path manually?

The simplest solution (at least for your current StackNet issue) would be to put the Anaconda3 Python path at the front of your Windows PATH variable, so that StackNet uses your Python 3.5 distribution and finds the sklearn modules.
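To confirm which interpreter the PATH actually resolves to, and whether that interpreter can import sklearn, something like the sketch below may help. The helper name `check_interpreter` is hypothetical; it relies only on the standard library (`shutil.which` and `subprocess.run`).

```python
import shutil
import subprocess
import sys


def check_interpreter(command="python", module="sklearn"):
    """Return (resolved_path, can_import) for the first `command` found on PATH."""
    path = shutil.which(command)  # None when nothing on PATH matches
    if path is None:
        return None, False
    # Ask that interpreter itself whether it can import the module.
    result = subprocess.run(
        [path, "-c", "import " + module],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return path, result.returncode == 0


path, ok = check_interpreter("python", "sklearn")
print("this script runs under:", sys.executable)
print("python on PATH:", path)
print("can import sklearn:", ok)
```

If the printed path still points at `C:\Python27`, the PATH change has not taken effect in that command prompt yet.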

ahbon123 commented 7 years ago

Brilliant, thank you so much! I'm trying that now.

goldentom42 commented 7 years ago

Not sure this is that brilliant, as it may cause issues with your Python 2.7 distribution... At least it should make your StackNet script run smoothly. I would recommend using Python environment management tools like virtualenv, which ensure your Python distributions do not conflict.

ahbon123 commented 7 years ago

Thanks for your advice, I'll try it. By the way, it worked well until it stopped here for more than half an hour. Is that normal, or should I restart it from the beginning?

Loaded File: dataset2_test.txt Total rows in the file: 2985217 Total columns in the file: undetrmined-Sparse Number of elements : 161547578 Loaded sparse test data with 2985217 and columns 58 loading test data lasted : 289.819000

kaz-Anova commented 7 years ago

Generally that is OK because it starts scoring the test data (which in this instance is quite big - 3 million rows).

However, I have noticed that it sometimes hangs on Windows if your screen remains idle for a while, and you need to press a key in the terminal (like ENTER) to unblock it. I thought only my computer did this, but it might be yours too (who knows), so try pressing ENTER in case the same thing has happened! Otherwise, you just need to wait.

also @goldentom42 thank you for the assistance here :)

ahbon123 commented 7 years ago

Okay, but unfortunately I had already aborted and restarted the program. I'll see what happens tomorrow morning. Thanks to both of you.

ahbon123 commented 7 years ago

Hi @kaz-Anova, it looks like my desktop doesn't have enough memory (it says `Exception in thread "Thread-22" java.lang.OutOfMemoryError: Java heap space`). In this case, do you have any idea what I should do, other than switching to another computer? Thanks!

goldentom42 commented 7 years ago

Hi ahbon123,

How much memory do you have on your PC ?

Usually the default Java maximum heap size is a quarter of your overall memory. You can check your default settings on Windows with: `java -XX:+PrintFlagsFinal -version | findstr /i "HeapSize PermSize ThreadStackSize"` (taken from this discussion on stackoverflow)

You can change the max heap size with the `-Xmx` option: `java -Xmx8096m -jar StackNet.jar ...`, which sets the maximum heap size to 8096 MB (the `m` suffix stands for megabytes).
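As a rough sketch of the two points above, the snippet below estimates the default heap from the "quarter of memory" rule of thumb and builds a launch command with an explicit `-Xmx`. The helper names `default_heap_mb` and `java_command` are hypothetical, and the StackNet arguments in the example are just a subset of those used earlier in this thread.

```python
def default_heap_mb(total_ram_mb):
    """Rule-of-thumb default max heap: a quarter of physical memory."""
    return total_ram_mb // 4


def java_command(heap_mb, jar="StackNet.jar", stacknet_args=()):
    """Build the argument list for launching the jar with a fixed max heap."""
    if heap_mb <= 0:
        raise ValueError("heap_mb must be positive")
    # -Xmx has to come before -jar: anything after the jar name is passed
    # to the program itself instead of the JVM.
    return ["java", "-Xmx%dm" % heap_mb, "-jar", jar] + list(stacknet_args)


print("default heap on an 8192 MB machine:", default_heap_mb(8192), "MB")
cmd = java_command(8096, stacknet_args=["train", "task=classification", "params=params.txt"])
print(" ".join(cmd))
```

The list form can be passed directly to `subprocess.run` if you want to launch StackNet from a script.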

There's an example here

Hope this helps.

kaz-Anova commented 7 years ago

As an additional suggestion, try to split train and predict into 2 tasks: first do the train, then the predict (test). Also set all `threads:1`. Consider removing FastRGF, as it was occupying a lot of memory.

ahbon123 commented 7 years ago

OK, thanks a lot @kaz-Anova @goldentom42. I'm going to try your method. By the way, my desktop has 8 GB of RAM.

ahbon123 commented 7 years ago

Hi @kaz-Anova @goldentom42, may I ask how you choose the models for each layer? Should we choose them by trial and error? Is it necessary for the models to be diverse?

goldentom42 commented 7 years ago

Hi ahbon123, I believe 2 layers are enough if you do not use restacking. The 1st layer is about creating meta features that are as diverse as possible, so yes, you need diversity both in the features used and in the model types (trees, linear, NN...). Models in the first layer should not overfit too much. My experience shows that if they do, the 2nd layer will focus on certain models' predictions and won't use the others, which usually leads to big overfitting. This is my take on it, but I have limited experience... @kaz-Anova is much more experienced and skilled ;-)
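One rough way to gauge how diverse first-layer models are (not something prescribed in this thread, just a common heuristic) is to compare their out-of-fold predictions pairwise: models whose predictions are almost perfectly correlated add little new information for the 2nd layer. The sketch below uses plain Python; the model names and prediction values are made up purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long prediction vectors."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two vectors of the same length (>= 2)")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("constant predictions have no defined correlation")
    return cov / sqrt(vx * vy)


def pairwise_correlations(predictions):
    """Map each pair of model names to the correlation of their predictions."""
    return {
        (a, b): pearson(predictions[a], predictions[b])
        for a, b in combinations(sorted(predictions), 2)
    }


# Illustrative out-of-fold predictions (invented numbers, not real results).
oof = {
    "lightgbm": [0.10, 0.40, 0.35, 0.80, 0.65],
    "xgboost": [0.12, 0.38, 0.30, 0.85, 0.60],
    "linear": [0.30, 0.20, 0.50, 0.60, 0.90],
}
for pair, r in pairwise_correlations(oof).items():
    print(pair, round(r, 3))
```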

ahbon123 commented 7 years ago

Thanks @goldentom42! It looks like that's not the case for kaz-Anova's example, because he uses 4 LightGBM and 2 XGBoost models. Additionally, there is no CatBoost in the algorithm pool; otherwise we could add it and see what happens.

ahbon123 commented 7 years ago

Hello @kaz-Anova @goldentom42, sorry to bother you, but I've run into new bugs. In general it says it can't find some .pred files. May I ask how to deal with it? Thanks.

goldentom42 commented 7 years ago

As far as kaz-Anova's example is concerned, the 4 LightGBM models have different parameters, particularly objectives and min_data_in_leaf (the latter means different levels of regularization), which is a sufficient level of diversity. For the second problem, I am not too sure about the NegativeArraySizeException...

goldentom42 commented 7 years ago

Sorry, I had not noticed the number of elements: 161547578 is way too big. There must be something wrong in your param file or in the pred file...

ahbon123 commented 7 years ago

OK, got it! Thanks @goldentom42. I don't know why the number of elements is so big; I'll restart from the beginning.

ahbon123 commented 7 years ago

Hello @goldentom42 @kaz-Anova, have you met this issue before? h2o has been installed in Anaconda3. Thanks.

kaz-Anova commented 7 years ago

Hey @ahbon123, it is as the message says: for tweedie regression, you cannot have negative values in your target variable. In other words, you cannot use this for Zillow unless you change your target to have only positive values. You could try other objective values like auto or gaussian.
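A quick pre-check along these lines can catch the problem before a long run starts. This is only a sketch: the helper name `negative_targets` is hypothetical, and the target values are invented for illustration.

```python
def negative_targets(targets):
    """Return (row_index, value) pairs for targets a tweedie objective would reject."""
    return [(i, y) for i, y in enumerate(targets) if y < 0]


# Invented example target column containing two negative values.
targets = [0.027, -0.168, 0.0, 0.005, -0.004]
bad = negative_targets(targets)
if bad:
    print("tweedie is not usable as-is; negative targets at rows:", [i for i, _ in bad])
else:
    print("no negative targets found")
```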

see this .

ahbon123 commented 7 years ago

I'm glad the hero of the data science arena is back. Thanks @kaz-Anova for creating StackNet for data players. I am trying to use almost all the regressor algorithms in StackNet to create a multi-layer model; do you have any suggestions regarding my approach? Or maybe I should choose several models based on AUCs? In my experience, it is really time consuming to run the code if there are too many models across multiple layers. Sometimes it just stalls while fitting models; sometimes it reports errors after 1 or 2 hours.

kaz-Anova commented 7 years ago

That is why you need to run only one model at a time, to ensure it runs properly (and to monitor CV performance).

That is my advice. As for model selection, the more the better :) . Your approach is similar to what I do too!

Also, since this discussion has now moved away from the original problem, would you mind closing the issue and maybe opening another one with a specific question? I think this way others can benefit too :)

Manish-Aman commented 5 years ago

Hi, just change this line in the zookeeper.service file:

`Environment="KAFKA_ARGS=-javaagent:/home/ec2-user/prometheus/jmx_prometheus_javaagent-0.3.1.jar=8080:/home/ec2-user/prometheus/kafka-0-8-2.yml"`

to the one below, and the issue is resolved:

`Environment="KAFKA_OPTS=-javaagent:/home/ec2-user/prometheus/jmx_prometheus_javaagent-0.3.1.jar=8080:/home/ec2-user/prometheus/zookeeper.yml"`