RGF-team / rgf

Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

Model learning result is not found in C:\Users\hp\temp\rgf. This is rgf_python error. #69

Closed ghost closed 6 years ago

ghost commented 6 years ago

Hello,

I have read the previous thread about the same error, but it does not seem to solve my problem: in that case the dataset included strings, whereas mine contains only numbers. Could you please let me know what the problem could be?

Much appreciated!

skf = StratifiedKFold(n_splits = kfold, random_state=1)
for i, (train_index, test_index) in enumerate(skf.split(X, y)):
    X_train, X_eval = X[train_index], X[test_index]
    y_train, y_eval = y[train_index], y[test_index]

    rgf_model = RGFClassifier(max_leaf=400,
                    algorithm="RGF_Sib",
                    test_interval=100,
                    verbose=True).fit( X_train, y_train)
    pred = rgf_model.predict_proba(X_eval)[:,1]
    print( "Gini = ", eval_gini(y_eval, pred) )

and

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-17-b27ba3506d06> in <module>()
     12                     test_interval=100,
     13                     verbose=True).fit( X_train, y_train)
---> 14     pred = rgf_model.predict_proba(X_eval)[:,1]
     15     print( "Gini = ", eval_gini(y_eval, pred) )

C:\Anaconda3\lib\site-packages\rgf\sklearn.py in predict_proba(self, X)
    644                              % (self._n_features, n_features))
    645         if self._n_classes == 2:
--> 646             y = self._estimators[0].predict_proba(X)
    647             y = _sigmoid(y)
    648             y = np.c_[y, 1 - y]

C:\Anaconda3\lib\site-packages\rgf\sklearn.py in predict_proba(self, X)
    796         if not model_files:
    797             raise Exception('Model learning result is not found in {0}. '
--> 798                             'This is rgf_python error.'.format(_TEMP_PATH))
    799         latest_model_loc = sorted(model_files, reverse=True)[0]
    800 

Exception: Model learning result is not found in C:\Users\hp\temp\rgf. This is rgf_python error.
StrikerRUS commented 6 years ago

Hi @mike-m123 !

Please provide more information about your OS, the package versions you use, and your data (if it isn't confidential).

I cannot reproduce the error with a toy dataset:

from rgf.sklearn import RGFClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold

iris = load_iris()
X = iris.data
y = iris.target
kfold = 4

def eval_gini(y_true, y_pred):
    return 42

# Your piece of script here
skf = StratifiedKFold(n_splits = kfold, random_state=1)
for i, (train_index, test_index) in enumerate(skf.split(X, y)):
    X_train, X_eval = X[train_index], X[test_index]
    y_train, y_eval = y[train_index], y[test_index]

    rgf_model = RGFClassifier(max_leaf=400,
                    algorithm="RGF_Sib",
                    test_interval=100,
                    verbose=True).fit( X_train, y_train)
    pred = rgf_model.predict_proba(X_eval)[:,1]
    print( "Gini = ", eval_gini(y_eval, pred) )

Everything works OK:

"predict": 
   model_fn=D:\rgf\temp\5ce6d727-bc24-417c-be89-3bcb2a4f0f1a4.model-04
   test_x_fn=D:\rgf\temp\5ce6d727-bc24-417c-be89-3bcb2a4f0f1a4.test.data.x
   prediction_fn=D:\rgf\temp\5ce6d727-bc24-417c-be89-3bcb2a4f0f1a4.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:44 2017: Reading test data ... 
Thu Nov 09 12:59:44 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\5ce6d727-bc24-417c-be89-3bcb2a4f0f1a4.predictions.txt: D:\rgf\temp\5ce6d727-bc24-417c-be89-3bcb2a4f0f1a4.model-04,#leaf=400,#tree=200
Thu Nov 09 12:59:44 2017: Done ... 

None
"predict": 
   model_fn=D:\rgf\temp\db7eff00-7dd8-4070-b943-387a5eb1724a5.model-04
   test_x_fn=D:\rgf\temp\db7eff00-7dd8-4070-b943-387a5eb1724a5.test.data.x
   prediction_fn=D:\rgf\temp\db7eff00-7dd8-4070-b943-387a5eb1724a5.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:44 2017: Reading test data ... 
Thu Nov 09 12:59:44 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\db7eff00-7dd8-4070-b943-387a5eb1724a5.predictions.txt: D:\rgf\temp\db7eff00-7dd8-4070-b943-387a5eb1724a5.model-04,#leaf=400,#tree=117
Thu Nov 09 12:59:44 2017: Done ... 

None
"predict": 
   model_fn=D:\rgf\temp\416c3654-ce1c-4c55-9b85-bb74b8fa4ba26.model-04
   test_x_fn=D:\rgf\temp\416c3654-ce1c-4c55-9b85-bb74b8fa4ba26.test.data.x
   prediction_fn=D:\rgf\temp\416c3654-ce1c-4c55-9b85-bb74b8fa4ba26.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:44 2017: Reading test data ... 
Thu Nov 09 12:59:44 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\416c3654-ce1c-4c55-9b85-bb74b8fa4ba26.predictions.txt: D:\rgf\temp\416c3654-ce1c-4c55-9b85-bb74b8fa4ba26.model-04,#leaf=401,#tree=155
Thu Nov 09 12:59:44 2017: Done ... 

None
Gini =  42
"predict": 
   model_fn=D:\rgf\temp\6944c1eb-c19c-459c-a7df-08e84dbcef637.model-04
   test_x_fn=D:\rgf\temp\6944c1eb-c19c-459c-a7df-08e84dbcef637.test.data.x
   prediction_fn=D:\rgf\temp\6944c1eb-c19c-459c-a7df-08e84dbcef637.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:45 2017: Reading test data ... 
Thu Nov 09 12:59:45 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\6944c1eb-c19c-459c-a7df-08e84dbcef637.predictions.txt: D:\rgf\temp\6944c1eb-c19c-459c-a7df-08e84dbcef637.model-04,#leaf=400,#tree=200
Thu Nov 09 12:59:45 2017: Done ... 

None
"predict": 
   model_fn=D:\rgf\temp\16669df4-bed5-4a68-a830-f550d917e0228.model-04
   test_x_fn=D:\rgf\temp\16669df4-bed5-4a68-a830-f550d917e0228.test.data.x
   prediction_fn=D:\rgf\temp\16669df4-bed5-4a68-a830-f550d917e0228.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:45 2017: Reading test data ... 
Thu Nov 09 12:59:45 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\16669df4-bed5-4a68-a830-f550d917e0228.predictions.txt: D:\rgf\temp\16669df4-bed5-4a68-a830-f550d917e0228.model-04,#leaf=400,#tree=124
Thu Nov 09 12:59:45 2017: Done ... 

None
"predict": 
   model_fn=D:\rgf\temp\6eff0c94-1db6-4878-87bb-5e689a3421b89.model-04
   test_x_fn=D:\rgf\temp\6eff0c94-1db6-4878-87bb-5e689a3421b89.test.data.x
   prediction_fn=D:\rgf\temp\6eff0c94-1db6-4878-87bb-5e689a3421b89.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:45 2017: Reading test data ... 
Thu Nov 09 12:59:45 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\6eff0c94-1db6-4878-87bb-5e689a3421b89.predictions.txt: D:\rgf\temp\6eff0c94-1db6-4878-87bb-5e689a3421b89.model-04,#leaf=401,#tree=145
Thu Nov 09 12:59:45 2017: Done ... 

None
Gini =  42
"predict": 
   model_fn=D:\rgf\temp\d45f0bf3-d43d-4858-b382-5e34d2b748bf10.model-04
   test_x_fn=D:\rgf\temp\d45f0bf3-d43d-4858-b382-5e34d2b748bf10.test.data.x
   prediction_fn=D:\rgf\temp\d45f0bf3-d43d-4858-b382-5e34d2b748bf10.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:46 2017: Reading test data ... 
Thu Nov 09 12:59:46 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\d45f0bf3-d43d-4858-b382-5e34d2b748bf10.predictions.txt: D:\rgf\temp\d45f0bf3-d43d-4858-b382-5e34d2b748bf10.model-04,#leaf=400,#tree=200
Thu Nov 09 12:59:46 2017: Done ... 

None
"predict": 
   model_fn=D:\rgf\temp\ea331c3f-2c02-47e9-8a87-f1f0f7394f7211.model-04
   test_x_fn=D:\rgf\temp\ea331c3f-2c02-47e9-8a87-f1f0f7394f7211.test.data.x
   prediction_fn=D:\rgf\temp\ea331c3f-2c02-47e9-8a87-f1f0f7394f7211.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:46 2017: Reading test data ... 
Thu Nov 09 12:59:46 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\ea331c3f-2c02-47e9-8a87-f1f0f7394f7211.predictions.txt: D:\rgf\temp\ea331c3f-2c02-47e9-8a87-f1f0f7394f7211.model-04,#leaf=400,#tree=122
Thu Nov 09 12:59:46 2017: Done ... 

None
"predict": 
   model_fn=D:\rgf\temp\4ad51b25-3e1e-46c4-8148-b796c0208f9312.model-04
   test_x_fn=D:\rgf\temp\4ad51b25-3e1e-46c4-8148-b796c0208f9312.test.data.x
   prediction_fn=D:\rgf\temp\4ad51b25-3e1e-46c4-8148-b796c0208f9312.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:46 2017: Reading test data ... 
Thu Nov 09 12:59:46 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\4ad51b25-3e1e-46c4-8148-b796c0208f9312.predictions.txt: D:\rgf\temp\4ad51b25-3e1e-46c4-8148-b796c0208f9312.model-04,#leaf=401,#tree=173
Thu Nov 09 12:59:46 2017: Done ... 

None
Gini =  42
"predict": 
   model_fn=D:\rgf\temp\9b48cb12-6b77-4719-9d8d-416be6de6f2b13.model-04
   test_x_fn=D:\rgf\temp\9b48cb12-6b77-4719-9d8d-416be6de6f2b13.test.data.x
   prediction_fn=D:\rgf\temp\9b48cb12-6b77-4719-9d8d-416be6de6f2b13.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:47 2017: Reading test data ... 
Thu Nov 09 12:59:47 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\9b48cb12-6b77-4719-9d8d-416be6de6f2b13.predictions.txt: D:\rgf\temp\9b48cb12-6b77-4719-9d8d-416be6de6f2b13.model-04,#leaf=400,#tree=200
Thu Nov 09 12:59:47 2017: Done ... 

None
"predict": 
   model_fn=D:\rgf\temp\afa4fc0b-43ca-4534-9576-3bd9abc23aea14.model-04
   test_x_fn=D:\rgf\temp\afa4fc0b-43ca-4534-9576-3bd9abc23aea14.test.data.x
   prediction_fn=D:\rgf\temp\afa4fc0b-43ca-4534-9576-3bd9abc23aea14.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:47 2017: Reading test data ... 
Thu Nov 09 12:59:47 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\afa4fc0b-43ca-4534-9576-3bd9abc23aea14.predictions.txt: D:\rgf\temp\afa4fc0b-43ca-4534-9576-3bd9abc23aea14.model-04,#leaf=400,#tree=106
Thu Nov 09 12:59:47 2017: Done ... 

None
"predict": 
   model_fn=D:\rgf\temp\9c041810-a234-4ed8-8e97-0a4cf9bdc9b415.model-04
   test_x_fn=D:\rgf\temp\9c041810-a234-4ed8-8e97-0a4cf9bdc9b415.test.data.x
   prediction_fn=D:\rgf\temp\9c041810-a234-4ed8-8e97-0a4cf9bdc9b415.predictions.txt
   Log:ON
--------------------
Thu Nov 09 12:59:47 2017: Reading test data ... 
Thu Nov 09 12:59:47 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\9c041810-a234-4ed8-8e97-0a4cf9bdc9b415.predictions.txt: D:\rgf\temp\9c041810-a234-4ed8-8e97-0a4cf9bdc9b415.model-04,#leaf=400,#tree=130
Thu Nov 09 12:59:47 2017: Done ... 

None
Gini =  42
fukatani commented 6 years ago

I recommend trying the verbose=1 option.

ghost commented 6 years ago

Hello @StrikerRUS, thank you for your prompt response!

Here is the additional information: the OS is Windows 10, the Python version is 3.6, and the package versions are the following.

rgf > 0.2.0.1 rgf-python > 2.0.3

The dataset is from the Porto Seguro competition on Kaggle. Here is the link: https://www.kaggle.com/c/porto-seguro-safe-driver-prediction/data.

Hi @fukatani, thank you for your response. Changing verbose from True to 1 has not changed anything; the same error persists. Please advise.

Ran 0 examples: 0 success, 0 failure, 0 error

None
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-39-e0cd5d6f4e0a> in <module>()
     12                     test_interval=100,
     13                     verbose=1).fit( X_train, y_train)
---> 14     pred = rgf.predict_proba(X_eval)[:,1]
     15     print( "Gini = ", eval_gini(y_eval, pred) )

C:\Anaconda3\lib\site-packages\rgf\sklearn.py in predict_proba(self, X)
    644                              % (self._n_features, n_features))
    645         if self._n_classes == 2:
--> 646             y = self._estimators[0].predict_proba(X)
    647             y = _sigmoid(y)
    648             y = np.c_[y, 1 - y]

C:\Anaconda3\lib\site-packages\rgf\sklearn.py in predict_proba(self, X)
    796         if not model_files:
    797             raise Exception('Model learning result is not found in {0}. '
--> 798                             'This is rgf_python error.'.format(_TEMP_PATH))
    799         latest_model_loc = sorted(model_files, reverse=True)[0]
    800 

Exception: Model learning result is not found in C:\Users\hp\temp\rgf. This is rgf_python error.
fukatani commented 6 years ago

@mike-m123 Thank you for trying verbose. Not only the Python traceback but also the C++ output can be useful for debugging.

Could you paste the whole log from rgf_python? If the log is too long, I would like at least the last 50 lines.

Does @StrikerRUS's script work in your environment? If so, it may be a dataset problem.

StrikerRUS commented 6 years ago

@mike-m123 I have the same OS, Python version and rgf version. But I cannot reproduce the error even with the data you've provided:

import numpy as np
import pandas as pd

from rgf.sklearn import RGFClassifier
from sklearn.model_selection import StratifiedKFold

def eval_gini(y_true, y_pred):
    return 42

kfold = 3

df = pd.read_csv(r'D:\Users\nekit\Downloads\train\train.csv')

df = df[:1000]

print(df.shape)
print(df.columns)

# .values is used instead of as_matrix(), which was removed in pandas 1.0
X = df.drop(['target'], axis=1).values
y = df['target'].values.flatten()

# Your piece of script here
skf = StratifiedKFold(n_splits = kfold, random_state=1)
for i, (train_index, test_index) in enumerate(skf.split(X, y)):
    X_train, X_eval = X[train_index], X[test_index]
    y_train, y_eval = y[train_index], y[test_index]

    rgf_model = RGFClassifier(max_leaf=400,
                    algorithm="RGF_Sib",
                    test_interval=100,
                    verbose=True).fit( X_train, y_train)
    pred = rgf_model.predict_proba(X_eval)[:,1]
    print( "Gini = ", eval_gini(y_eval, pred) )

The output is:

(1000, 59)
Index(['id', 'target', 'ps_ind_01', 'ps_ind_02_cat', 'ps_ind_03',
       'ps_ind_04_cat', 'ps_ind_05_cat', 'ps_ind_06_bin', 'ps_ind_07_bin',
       'ps_ind_08_bin', 'ps_ind_09_bin', 'ps_ind_10_bin', 'ps_ind_11_bin',
       'ps_ind_12_bin', 'ps_ind_13_bin', 'ps_ind_14', 'ps_ind_15',
       'ps_ind_16_bin', 'ps_ind_17_bin', 'ps_ind_18_bin', 'ps_reg_01',
       'ps_reg_02', 'ps_reg_03', 'ps_car_01_cat', 'ps_car_02_cat',
       'ps_car_03_cat', 'ps_car_04_cat', 'ps_car_05_cat', 'ps_car_06_cat',
       'ps_car_07_cat', 'ps_car_08_cat', 'ps_car_09_cat', 'ps_car_10_cat',
       'ps_car_11_cat', 'ps_car_11', 'ps_car_12', 'ps_car_13', 'ps_car_14',
       'ps_car_15', 'ps_calc_01', 'ps_calc_02', 'ps_calc_03', 'ps_calc_04',
       'ps_calc_05', 'ps_calc_06', 'ps_calc_07', 'ps_calc_08', 'ps_calc_09',
       'ps_calc_10', 'ps_calc_11', 'ps_calc_12', 'ps_calc_13', 'ps_calc_14',
       'ps_calc_15_bin', 'ps_calc_16_bin', 'ps_calc_17_bin', 'ps_calc_18_bin',
       'ps_calc_19_bin', 'ps_calc_20_bin'],
      dtype='object')

"train": 
   algorithm=RGF_Sib
   train_x_fn=D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.train.data.x
   train_y_fn=D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.train.data.y
   train_w_fn=D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.train.data.weight
   Log:ON
   model_fn_prefix=D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.model
--------------------
Fri Nov 10 23:37:40 2017: Reading training data ... 
Fri Nov 10 23:37:40 2017: Start ... #train=666
--------------------
Forest-level: 
   loss=Log
   max_leaf_forest=400
   max_tree=200
   opt_interval=100
   test_interval=100
   num_tree_search=1
   Verbose:ON
   memory_policy=Generous
Turning on Force_to_refresh_all
-------------
Training data: 58x666, nonzero_ratio=0.6725; managed as dense data.
-------------
Optimization: 
   loss=Log
   num_iteration_opt=5
   reg_L2=0.1
   opt_stepsize=0.5
   max_delta=1
Tree-level: min_pop=10
Node split: reg_L2=0.1
--------------------
Sum of data point weights = 666
--------------------
Fri Nov 10 23:37:40 2017: Calling optimizer with 18 trees and 100 leaves
Fri Nov 10 23:37:40 2017: Writing model: seq#=1
Fri Nov 10 23:37:40 2017: Calling optimizer with 30 trees and 200 leaves
Fri Nov 10 23:37:40 2017: Writing model: seq#=2
Fri Nov 10 23:37:40 2017: Calling optimizer with 40 trees and 300 leaves
Fri Nov 10 23:37:40 2017: Writing model: seq#=3
Fri Nov 10 23:37:41 2017: AzRgforest: #leaf reached max
Fri Nov 10 23:37:41 2017: Calling optimizer with 51 trees and 400 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=4

Generated 4 model file(s): 
D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.model-01
D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.model-02
D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.model-03
D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.model-04

Fri Nov 10 23:37:41 2017: Done ... 
elapsed: 0.141

None
"predict": 
   model_fn=D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.model-04
   test_x_fn=D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.test.data.x
   prediction_fn=D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.predictions.txt
   Log:ON
--------------------
Fri Nov 10 23:37:41 2017: Reading test data ... 
Fri Nov 10 23:37:41 2017: Predicting ... 
elapsed: 0.015
D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.predictions.txt: D:\rgf\temp\4ea091d2-20d6-44bc-a799-4923976a88021.model-04,#leaf=400,#tree=51
Fri Nov 10 23:37:41 2017: Done ... 

None
Gini =  42
"train": 
   algorithm=RGF_Sib
   train_x_fn=D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.train.data.x
   train_y_fn=D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.train.data.y
   train_w_fn=D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.train.data.weight
   Log:ON
   model_fn_prefix=D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.model
--------------------
Fri Nov 10 23:37:41 2017: Reading training data ... 
Fri Nov 10 23:37:41 2017: Start ... #train=667
--------------------
Forest-level: 
   loss=Log
   max_leaf_forest=400
   max_tree=200
   opt_interval=100
   test_interval=100
   num_tree_search=1
   Verbose:ON
   memory_policy=Generous
Turning on Force_to_refresh_all
-------------
Training data: 58x667, nonzero_ratio=0.6717; managed as dense data.
-------------
Optimization: 
   loss=Log
   num_iteration_opt=5
   reg_L2=0.1
   opt_stepsize=0.5
   max_delta=1
Tree-level: min_pop=10
Node split: reg_L2=0.1
--------------------
Sum of data point weights = 667
--------------------
Fri Nov 10 23:37:41 2017: Calling optimizer with 21 trees and 100 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=1
Fri Nov 10 23:37:41 2017: Calling optimizer with 32 trees and 200 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=2
Fri Nov 10 23:37:41 2017: Calling optimizer with 41 trees and 301 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=3
Fri Nov 10 23:37:41 2017: AzRgforest: #leaf reached max
Fri Nov 10 23:37:41 2017: Calling optimizer with 48 trees and 400 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=4

Generated 4 model file(s): 
D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.model-01
D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.model-02
D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.model-03
D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.model-04

Fri Nov 10 23:37:41 2017: Done ... 
elapsed: 0.152

None
"predict": 
   model_fn=D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.model-04
   test_x_fn=D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.test.data.x
   prediction_fn=D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.predictions.txt
   Log:ON
--------------------
Fri Nov 10 23:37:41 2017: Reading test data ... 
Fri Nov 10 23:37:41 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.predictions.txt: D:\rgf\temp\a8a4d3fd-9b3c-47ce-9967-f3f936a0e58b2.model-04,#leaf=400,#tree=48
Fri Nov 10 23:37:41 2017: Done ... 

None
Gini =  42
"train": 
   algorithm=RGF_Sib
   train_x_fn=D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.train.data.x
   train_y_fn=D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.train.data.y
   train_w_fn=D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.train.data.weight
   Log:ON
   model_fn_prefix=D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.model
--------------------
Fri Nov 10 23:37:41 2017: Reading training data ... 
Fri Nov 10 23:37:41 2017: Start ... #train=667
--------------------
Forest-level: 
   loss=Log
   max_leaf_forest=400
   max_tree=200
   opt_interval=100
   test_interval=100
   num_tree_search=1
   Verbose:ON
   memory_policy=Generous
Turning on Force_to_refresh_all
-------------
Training data: 58x667, nonzero_ratio=0.674; managed as dense data.
-------------
Optimization: 
   loss=Log
   num_iteration_opt=5
   reg_L2=0.1
   opt_stepsize=0.5
   max_delta=1
Tree-level: min_pop=10
Node split: reg_L2=0.1
--------------------
Sum of data point weights = 667
--------------------
Fri Nov 10 23:37:41 2017: Calling optimizer with 21 trees and 100 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=1
Fri Nov 10 23:37:41 2017: Calling optimizer with 32 trees and 200 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=2
Fri Nov 10 23:37:41 2017: Calling optimizer with 42 trees and 300 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=3
Fri Nov 10 23:37:41 2017: AzRgforest: #leaf reached max
Fri Nov 10 23:37:41 2017: Calling optimizer with 53 trees and 400 leaves
Fri Nov 10 23:37:41 2017: Writing model: seq#=4

Generated 4 model file(s): 
D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.model-01
D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.model-02
D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.model-03
D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.model-04

Fri Nov 10 23:37:41 2017: Done ... 
elapsed: 0.148

None
"predict": 
   model_fn=D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.model-04
   test_x_fn=D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.test.data.x
   prediction_fn=D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.predictions.txt
   Log:ON
--------------------
Fri Nov 10 23:37:41 2017: Reading test data ... 
Fri Nov 10 23:37:41 2017: Predicting ... 
elapsed: 0
D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.predictions.txt: D:\rgf\temp\449a94d6-a2ab-426a-9428-3b6a2606c2233.model-04,#leaf=400,#tree=53
Fri Nov 10 23:37:41 2017: Done ... 

None
Gini =  42

Please run my examples and tell me whether the error still occurs. As a workaround, you could rebuild the rgf library from source and replace your version (maybe there were errors during compilation).

StrikerRUS commented 6 years ago

Also, please check whether you have write and execute (wx) rights for C:\Users\hp\temp\rgf.
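
One way to check this from Python, rather than through the Windows security dialog, is to try creating a file in the directory. The sketch below is only an illustration: `can_write_to` is a hypothetical helper, not part of rgf_python, and it assumes that being able to create and delete a temporary file is a good enough proxy for the rights rgf needs.

```python
import os
import tempfile


def can_write_to(directory):
    """Return True if a file can be created and removed in `directory`.

    Hypothetical helper for illustration; not part of rgf_python.
    """
    try:
        # Create the directory if it is missing, then try a real write.
        os.makedirs(directory, exist_ok=True)
        with tempfile.NamedTemporaryFile(dir=directory):
            pass  # the file is deleted again when the block exits
        return True
    except OSError:
        return False


# Replace the argument with the temp directory rgf_python reports,
# e.g. the path shown in the exception message.
print(can_write_to(tempfile.gettempdir()))
```

An actual write attempt is used here instead of `os.access`, since permission queries can be unreliable on Windows.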

StrikerRUS commented 6 years ago

@mike-m123 updates?

ghost commented 6 years ago

@StrikerRUS, apologies for the late reply. I stopped testing this algorithm altogether; apparently it still does not work, and the same error persists.

It seems like a permissions issue. You mentioned the "wx" rights; how can we check or enable them in Windows 10?

StrikerRUS commented 6 years ago

@mike-m123 No problems!

The easiest way is to create a config file and specify in it any folder you like where you have permission to write temp files.

You may specify the actual location of the RGF executable file and the directory for temp files with the corresponding options in the configuration file .rgfrc, which you should create in your home directory. Here is an example of a .rgfrc file:

exe_location=C:/Program Files/RGF/bin/rgf.exe
temp_location=C:/Program Files/RGF/temp

On Windows, you have to create this file in C:\Users\your_folder.
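
As a rough illustration of the steps above, the file could also be generated from Python. This is a minimal sketch: `write_rgfrc` is a hypothetical helper, not an rgf_python function. The option names `exe_location` and `temp_location` and the example paths are taken from this comment, and the demo writes into a throwaway directory so that no real config is overwritten.

```python
import os
import tempfile


def write_rgfrc(exe_location, temp_location, home=None):
    """Write a .rgfrc file with the two options and return its path.

    Hypothetical helper for illustration; not part of rgf_python.
    """
    if home is None:
        home = os.path.expanduser("~")  # the user's home directory
    path = os.path.join(home, ".rgfrc")
    with open(path, "w", encoding="utf-8") as cfg:
        # One key=value pair per line, as in the example above.
        cfg.write("exe_location={}\n".format(exe_location))
        cfg.write("temp_location={}\n".format(temp_location))
    return path


# Demo: write into a temporary directory instead of the real home.
demo_home = tempfile.mkdtemp()
rgfrc_path = write_rgfrc("C:/Program Files/RGF/bin/rgf.exe",
                         "C:/Program Files/RGF/temp",
                         home=demo_home)
with open(rgfrc_path, encoding="utf-8") as cfg:
    rgfrc_text = cfg.read()
print(rgfrc_text)
```

Omitting the `home` argument would place the file in the current user's home directory, which is where this comment says it should live.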

StrikerRUS commented 6 years ago

@mike-m123 Also try to run your code without the prediction step: just fit rgf with the verbose parameter and post the log here.

StrikerRUS commented 6 years ago

@mike-m123 updates?

You may also try installing the new version (2.1.2) with pip.

ghost commented 6 years ago

@StrikerRUS Thanks for the continued support, but I was not able to resolve it.

The paths are specified in the file rgf.rgfrc (it was already there), located in the administrator's home directory, i.e. C:\users\hp\, but the same error persists when we call predict. With just fit (3 folds), no error pops up, but it does NOT process the data either. Here is the output with just fit:

Ran 0 examples: 0 success, 0 failure, 0 error

None
Ran 0 examples: 0 success, 0 failure, 0 error

None
Ran 0 examples: 0 success, 0 failure, 0 error

None 

Here are the exact directory paths specified in rgf.rgfrc

C:\Anaconda3\Lib\site-packages\rgf\rgf.exe
C:\Users\hp\temp\rgf

StrikerRUS commented 6 years ago

@mike-m123 It's very strange, because this is not output from rgf. OK, None may be produced by rgf, but not this line:

Ran 0 examples: 0 success, 0 failure, 0 error

Have you updated rgf_python to 2.1.2 version?

Could you simply run this command:

C:\Anaconda3\Lib\site-packages\rgf\rgf.exe -h

?

BTW, the config file should be named .rgfrc, not rgf.rgfrc.

ghost commented 6 years ago

@StrikerRUS, okay, now I have updated the rgf-python package to 2.1.2, and the following errors have come up:

Traceback (most recent call last):
  File "c:\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Anaconda3\Scripts\rgf.exe\__main__.py", line 5, in <module>
ModuleNotFoundError: No module named 'rgf.core'

None
Traceback (most recent call last):
  File "c:\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Anaconda3\Scripts\rgf.exe\__main__.py", line 5, in <module>
ModuleNotFoundError: No module named 'rgf.core'

None
Traceback (most recent call last):
  File "c:\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Anaconda3\Scripts\rgf.exe\__main__.py", line 5, in <module>
ModuleNotFoundError: No module named 'rgf.core'

None

Here is the output of the command you asked for:

(C:\Anaconda3) C:\Anaconda3\Lib\site-packages\rgf>rgf.exe -h
Arguments: action  parameters
   action: train|predict|train_test|train_predict|output_features
           train      ...    Train and save models to files.
           predict    ...    Apply a model saved by "train" to new data.
           train_test ...    Train and test models.  Optionally models can be
                             saved to files.
           train_predict ... Train models and save predictions on test data to
                             files.  Models can also be saved to files.
           output_features ...
                             Output features generated by tree ensembles.

To get help on parameters, enter rgf.exe action.
For example:  rgf.exe train_test
              rgf.exe train

I have changed the filename to just .rgfrc.

StrikerRUS commented 6 years ago

Please remove this file: C:\Anaconda3\Scripts\rgf.exe
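
The traceback above shows a launcher at C:\Anaconda3\Scripts\rgf.exe running Python code (via runpy) rather than the compiled rgf binary, which suggests a name clash between two files called rgf.exe. A quick way to see every candidate is to scan the PATH directories. The sketch below is an illustration only: `find_executables` is a hypothetical helper, and it assumes the clash comes from files reachable through PATH.

```python
import os


def find_executables(name, search_path=None):
    """List every file called `name` in the directories of `search_path`.

    Hypothetical helper for illustration; `search_path` defaults to PATH.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    matches = []
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # Skip empty PATH entries and anything that is not a regular file.
        if directory and os.path.isfile(candidate):
            matches.append(candidate)
    return matches


# More than one result would point to a possible name clash.
print(find_executables("rgf.exe"))
```

Results are listed in PATH order, so the first entry is the one a bare `rgf.exe` call would normally resolve to.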

ghost commented 6 years ago

Now this error has popped up out of nowhere, right at the start:

from rgf.sklearn import RGFClassifier

MissingSectionHeaderError                 Traceback (most recent call last)
C:\Anaconda3\lib\site-packages\rgf\sklearn.py in _get_paths()
     43             with six.StringIO(cfg.read()) as strIO:
---> 44                 config.readfp(strIO)
     45     except six.moves.configparser.MissingSectionHeaderError:

C:\Anaconda3\lib\configparser.py in readfp(self, fp, filename)
    762         )
--> 763         self.read_file(fp, source=filename)
    764 

C:\Anaconda3\lib\configparser.py in read_file(self, f, source)
    717                 source = '<???>'
--> 718         self._read(f, source)
    719 

C:\Anaconda3\lib\configparser.py in _read(self, fp, fpname)
   1079                 elif cursect is None:
-> 1080                     raise MissingSectionHeaderError(fpname, lineno, line)
   1081                 # an option line?

MissingSectionHeaderError: File contains no section headers.
file: '<???>', line: 4
'C:\\Anaconda3\\Lib\\site-packages\\rgf\\rgf.exe\r\n'

During handling of the above exception, another exception occurred:

DuplicateOptionError                      Traceback (most recent call last)
<ipython-input-2-7de12521e932> in <module>()
     13 
     14 # Regularized Greedy Forest
---> 15 from rgf.sklearn import RGFClassifier     # https://github.com/fukatani/rgf_python
     16 
     17 

C:\Anaconda3\lib\site-packages\rgf\sklearn.py in <module>()
     74 
     75 
---> 76 _DEFAULT_EXE_PATH, _EXE_PATH, _TEMP_PATH = _get_paths()
     77 
     78 

C:\Anaconda3\lib\site-packages\rgf\sklearn.py in _get_paths()
     46         with codecs.open(path, 'r', 'utf-8') as cfg:
     47             with six.StringIO('[glob]\n' + cfg.read()) as strIO:
---> 48                 config.readfp(strIO)
     49     except Exception:
     50         pass

C:\Anaconda3\lib\configparser.py in readfp(self, fp, filename)
    761             DeprecationWarning, stacklevel=2
    762         )
--> 763         self.read_file(fp, source=filename)
    764 
    765     def get(self, section, option, *, raw=False, vars=None, fallback=_UNSET):

C:\Anaconda3\lib\configparser.py in read_file(self, f, source)
    716             except AttributeError:
    717                 source = '<???>'
--> 718         self._read(f, source)
    719 
    720     def read_string(self, string, source='<string>'):

C:\Anaconda3\lib\configparser.py in _read(self, fp, fpname)
   1090                             (sectname, optname) in elements_added):
   1091                             raise DuplicateOptionError(sectname, optname,
-> 1092                                                        fpname, lineno)
   1093                         elements_added.add((sectname, optname))
   1094                         # This check is fine because the OPTCRE cannot

DuplicateOptionError: While reading from '<???>' [line  6]: option 'c' in section 'glob' already exists
StrikerRUS commented 6 years ago

Try removing the .rgfrc file from your HOME directory.

And please attach it here.

ghost commented 6 years ago

It is working, @StrikerRUS. Thank you so much for your help. All I did was specify the paths in the .rgfrc file like this:

exe_location=C:\Anaconda3\Lib\site-packages\rgf\rgf.exe
temp_location=C:\Users\hp\temp\rgf

instead of

C:\Anaconda3\Lib\site-packages\rgf\rgf.exe
C:\Users\hp\temp\rgf
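
Why the bare-path version fails can be reproduced with the standard library alone. The earlier traceback shows the file being read by configparser with a dummy '[glob]' section prepended; the sketch below mirrors that idea but is not rgf_python's actual code, and `parse_rgfrc` is a hypothetical name. Since ':' is a key/value delimiter for configparser, each bare path is read as an option named 'c', and the second one triggers the DuplicateOptionError seen above.

```python
import configparser


def parse_rgfrc(text):
    """Parse .rgfrc-style text by prepending a dummy section header.

    Hypothetical helper mirroring the traceback; not rgf_python code.
    """
    config = configparser.ConfigParser(interpolation=None)
    config.read_string('[glob]\n' + text)
    return dict(config['glob'])


good = (
    "exe_location=C:\\Anaconda3\\Lib\\site-packages\\rgf\\rgf.exe\n"
    "temp_location=C:\\Users\\hp\\temp\\rgf\n"
)
bad = (
    "C:\\Anaconda3\\Lib\\site-packages\\rgf\\rgf.exe\n"
    "C:\\Users\\hp\\temp\\rgf\n"
)

# key=value lines parse into the two expected options.
settings = parse_rgfrc(good)
print(settings["exe_location"])

# Bare paths split at the first ':', so both lines define option 'c'.
try:
    parse_rgfrc(bad)
    bad_error = None
except configparser.DuplicateOptionError as exc:
    bad_error = exc
    print("bare paths rejected, duplicate option:", exc.option)
```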

StrikerRUS commented 6 years ago

@mike-m123 You are welcome! Thanks for your patience. Nice to hear that it is finally working.

ghost commented 6 years ago

@StrikerRUS it is you who were patient; thank you for the consistent reminders!