genfifth / cvopt

A parameter-search and feature-selection module for machine learning, with integrated log management and visualization.
BSD 2-Clause "Simplified" License

Could you create a tutorial on using it without any dataset? #2

Open KOLANICH opened 6 years ago

genfifth commented 6 years ago

What do you mean by "without any dataset"? (For example, generate random numbers at the beginning of the tutorial and apply this library to them?)

KOLANICH commented 6 years ago

I mean as a black box optimizer.

In fact I have a dataset, but I prefer not to use sklearn for cross-validation because it is damn slow. Instead I call the xgboost function that does CV in native code, using a pre-created object holding the data.

genfifth commented 6 years ago

This library is based on the sklearn cross-validation classes, and changing it so that it does not use them sounds difficult.

Do you have sample code for native-code CV? I will think about this change.

KOLANICH commented 6 years ago

Do you have sample code for native-code CV?

Just treat it as a function call. For example, you can try the Rosenbrock function, like this one. Your code passes hyperparameters to a function. The function returns the loss mean and variance. The framework I have built around xgboost assumes that I can pass a grid spec and a function into the optimizer and then get back the hyperparameters corresponding to the minimal loss. Most black-box optimizers and hyperparameter optimizers work this way, and I guess that all hyperparameter optimizers use a black-box optimizer internally somewhere. For me it would be easier if the hyperparameters were passed into the function as a dict rather than an array.
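
To be concrete, here is a minimal sketch of the interface I have in mind; the random search and the names minimize, space, and rosenbrock are illustrative, not taken from any particular library:

import random

def rosenbrock(params):
    # Toy objective: hyperparameters in (as a dict), loss mean and variance out.
    x, y = params["x"], params["y"]
    loss = (1 - x) ** 2 + 100 * (y - x ** 2) ** 2
    return loss, 0.0

def minimize(objective, space, n_iter=1000):
    # The optimizer sees only a space spec and a callable; it knows nothing
    # about datasets, estimators, or cross-validation.
    best_params, best_loss = None, float("inf")
    for _ in range(n_iter):
        params = {name: random.uniform(lo, hi) for name, (lo, hi) in space.items()}
        loss, _variance = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

space = {"x": (-2.0, 2.0), "y": (-2.0, 2.0)}
print(minimize(rosenbrock, space))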

genfifth commented 6 years ago

I can pass a grid spec and a function into the optimizer.

Internally, this library also passes a hyperparameter space and a function into the optimizer. This function takes hyperparameters as input and outputs a cross-validation score. Inside this function, cross-validation is run using a dataset and an estimator given in advance.

Similarly, doesn't your framework pass a hyperparameter space and a function (which internally trains xgboost and returns a prediction score) into the optimizer?

KOLANICH commented 6 years ago

Similarly, doesn't your framework pass a hyperparameter space and a function (which internally trains xgboost and returns a prediction score) into the optimizer?

It does. For example: https://gitlab.com/KOLANICH/UniOpt.py/blob/master/UniOpt/backends/hyperopt.py#L137. I agree that it would be more correct to call that variable spaceSpec rather than gridSpec.
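
Boiled down, the pattern in that backend is plain hyperopt usage, roughly like this (a sketch with a toy objective, not the actual UniOpt code):

from hyperopt import STATUS_OK, fmin, hp, tpe

# The space spec: the optimizer samples dicts such as {"x": ..., "y": ...}.
space = {
    "x": hp.uniform("x", -2.0, 2.0),
    "y": hp.uniform("y", -2.0, 2.0),
}

def objective(params):
    # Any black-box function fits here; nothing about datasets is assumed.
    loss = (1 - params["x"]) ** 2 + 100 * (params["y"] - params["x"] ** 2) ** 2
    return {"loss": loss, "loss_variance": 0.0, "status": STATUS_OK}

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=200)
print(best)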

genfifth commented 6 years ago

Thank you for the example. Your code uses "xgb.cv", which is faster than sklearn-cv. On the other hand, my function (the one passed into the optimizer) depends on sklearn-cv, so I cannot support "xgb.cv" right away. https://github.com/genfifth/cvopt/blob/master/cvopt/model_selection/_base.py#L346

I will think about supporting "xgb.cv" when I have time.
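
If I do add it, I imagine the objective passed to the optimizer would look roughly like this (a rough sketch only, not implemented in cvopt; the data and parameter names are just examples):

import numpy as np
import xgboost as xgb

# A pre-created data object, built once outside the objective.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5)
dtrain = xgb.DMatrix(X, label=y)

def objective(params):
    result = xgb.cv(
        params={"eta": params["eta"],
                "max_depth": int(params["max_depth"]),
                "objective": "reg:squarederror"},
        dtrain=dtrain,
        num_boost_round=50,
        nfold=5,
        metrics="rmse",
    )
    # xgb.cv returns a DataFrame; the last row holds the final fold-averaged scores.
    mean = result["test-rmse-mean"].iloc[-1]
    variance = result["test-rmse-std"].iloc[-1] ** 2
    return mean, variance

print(objective({"eta": 0.1, "max_depth": 3}))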

KOLANICH commented 6 years ago

I will think about supporting "xgb.cv" when I have time.

That's not what I meant; I meant an example of optimizing an arbitrary function. I could look into the source myself to figure out how to do it, but for now I just have no time for that.

genfifth commented 6 years ago

I pass the following function to the optimizer and optimize it.

def f(hyper_parameters):
    # Run sklearn cross-validation here.
    # (The dataset and the estimator are given in advance, via the ".fit" method.)
    ...
    return average_score

As I said before, this is based on the sklearn cross-validation classes. I have thought about it, but it seems difficult to optimize an arbitrary function such as the Rosenbrock function.
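
Filled in with sklearn, the function above looks roughly like this (a simplified sketch, not the actual cvopt internals; the dataset and estimator are just examples):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# The dataset and the estimator are fixed in advance (in cvopt, via the ".fit" method).
X, y = load_iris(return_X_y=True)
estimator = LogisticRegression(max_iter=1000)

def f(params):
    # Apply the candidate hyperparameters and score them by cross-validation.
    scores = cross_val_score(estimator.set_params(**params), X, y, cv=5)
    return scores.mean()

print(f({"C": 0.1}))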

jujubbgg commented 4 years ago

Why do I have this problem? [image]

KOLANICH commented 4 years ago

@jujubbgg, because you are doing everything completely wrong.

  1. you are posting in a completely irrelevant issue
  2. you have ignored the error message describing your problem.
  3. you have posted a screenshot instead of text
  4. you have provided insufficient information needed for diagnostics. The error message implies that the model_selection dir is a package root. So the obvious question is "what's your cwd?" But since the path suggests you haven't installed the package, I can deduce that you have just cd'ed into model_selection and tried to run from there.