KersyTom commented 3 years ago

Describe the issue: When I run the sklearn example on Windows, the trial status in the web UI always stays at "waiting".

Environment:

NNI version:2.3
Training service (local|remote|pai|aml|etc):local
Client OS:windows7
Server OS (for remote mode only):
Python version:3.6.3
PyTorch/TensorFlow version:None
Is conda/virtualenv/venv used?:virtualenv
Is running in Docker?:no

Configuration:

Experiment config (remember to remove secrets!):
Search space: { "model_name":{"_type":"choice","_value":["LinearRegression", "Lars", "Ridge", "ARDRegression"]}, "normalize": {"_type":"choice","_value":["true", "false"]} }

Log message:normal

nnimanager.log:
dispatcher.log:
- state:

nnictl stdout and stderr:nothing

main.py


import nni
from nni.experiment import ExperimentConfig
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn import linear_model
import logging
import numpy as np
from sklearn.metrics import r2_score
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lars
from sklearn.linear_model import ARDRegression

LOG = logging.getLogger('sklearn_regression')

def load_data():
    '''Load dataset, use boston dataset'''
    boston = load_boston()
    X_train, X_test, y_train, y_test = train_test_split(
        boston.data, boston.target, random_state=99, test_size=0.25)

    # normalize data
    ss_X = StandardScaler()
    ss_y = StandardScaler()

    X_train = ss_X.fit_transform(X_train)
    X_test = ss_X.transform(X_test)
    y_train = ss_y.fit_transform(y_train[:, None])[:, 0]
    y_test = ss_y.transform(y_test[:, None])[:, 0]

    return X_train, X_test, y_train, y_test

def get_default_parameters():
    '''get default parameters'''
    params = {'model_name': 'LinearRegression'}
    return params

def get_model(PARAMS):
    '''Get model according to parameters'''
    model_dict = {
        'LinearRegression': LinearRegression(),
        'Ridge': Ridge(),
        'Lars': Lars(),
        'ARDRegression': ARDRegression()
    }
    if not model_dict.get(PARAMS['model_name']):
        LOG.exception('Not supported model!')
        exit(1)

    model = model_dict[PARAMS['model_name']]
    # model.normalize = bool(PARAMS['normalize'])

    return model

def run(X_train, X_test, y_train, y_test, model):
    '''Train model and predict result'''
    model.fit(X_train, y_train)
    predict_y = model.predict(X_test)
    score = r2_score(y_test, predict_y)
    LOG.debug('r2 score: %s', score)
    nni.report_final_result(score)

if __name__ == '__main__':
    X_train, X_test, y_train, y_test = load_data()

    try:
        # get parameters from tuner
        RECEIVED_PARAMS = nni.get_next_parameter()
        LOG.debug(RECEIVED_PARAMS)
        PARAMS = get_default_parameters()
        PARAMS.update(RECEIVED_PARAMS)
        LOG.debug(PARAMS)
        model = get_model(PARAMS)
        run(X_train, X_test, y_train, y_test, model)
    except Exception as exception:
        LOG.exception(exception)
        raise
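A side note on the commented-out line `model.normalize = bool(PARAMS['normalize'])`: the search space above declares the `normalize` choices as the strings "true" and "false", and in Python `bool()` of any non-empty string is `True`, so that line would enable normalization for both choices. This is unrelated to the waiting trial. A minimal sketch of one way to parse such values, where `parse_bool` is a hypothetical helper and not part of NNI or scikit-learn:

```python
def parse_bool(value):
    """Convert a tuner-supplied value such as "true" or "false" into a real bool.

    Hypothetical helper for illustration; it is not an NNI or scikit-learn API.
    """
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise ValueError("cannot interpret %r as a boolean" % (value,))


# bool() only checks that the string is non-empty, so both choices look truthy.
naive = [bool(choice) for choice in ("true", "false")]
parsed = [parse_bool(choice) for choice in ("true", "false")]
```

An alternative that avoids parsing altogether would be to declare the choices as JSON booleans (`[true, false]`) in the search space.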

- conf.yaml

```yaml
searchSpaceFile: search_space.json
trialCommand: python main.py
trialConcurrency: 1
maxTrialNumber: 30
maxExperimentDuration: 1h
tuner:
  name: TPE
  classArgs:
    optimize_mode: maximize
trainingService:  # For other platforms, check mnist-pytorch example
  platform: local
  use_active_gpu: false
```

How to reproduce it?:

QuanluZhang commented 3 years ago

@KersyTom, could you try setting use_active_gpu to true, or removing this field from conf.yaml, to check whether it works?

KersyTom commented 3 years ago

@QuanluZhang, hi Quanlu, thanks for your reply. I have tried both of these configuration changes (my computer does not have a GPU device), but the problem still exists. The following are my screenshots: remove_gpu_config, set_true

QuanluZhang commented 3 years ago

@KersyTom, I cannot reproduce your problem using the code and configuration you provided. My environment is:

NNI version:2.3
Training service (local|remote|pai|aml|etc):local
Client OS:windows10
Server OS (for remote mode only):
Python version:3.8.8
PyTorch/TensorFlow version:None
Is conda/virtualenv/venv used?:conda
Is running in Docker?:no

QuanluZhang commented 3 years ago

@KersyTom, if you are still encountering this problem, please provide the complete log file. From the log you provided above, nnimanager has received the first trial config, so it is strange that this trial is still waiting.

KersyTom commented 3 years ago

@QuanluZhang, my client OS is Windows 7. I just tried my colleague's computer, and the experiment ran successfully there. I'm not sure whether the failure on my machine is caused by Windows 7; I will try again if I can find another Windows 7 computer.

microsoft / nni

window sklearn regression example trial keeps waiting #3883
