Open creater-zq opened 3 years ago
This could have been fixed in this release. Could you try again by installing from the master branch?
Hello @creater-zq, this was fixed in #3634, but this error is not what made your experiment fail. Can you try again by installing from the master branch or waiting for the v2.3 release?
When is the next release? I can't install from source (company proxy + my own limitations). Is there any workaround?
Hello @externalsupplierstaff, v2.3 has been released! You can upgrade NNI and try it.
Describe the issue:
Environment:
Configuration:
Experiment config (remember to remove secrets!):
authorName: default
experimentName: test_nni
trialConcurrency: 1
maxExecDuration: 1d
maxTrialNum: 50
#choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner, GPTuner
  #SMAC (SMAC should be installed through nnictl)
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: minimize
trial:
  command: python train_nni.py
  codeDir: .
  gpuNum: 1
localConfig:
  useActiveGpu: true
  gpuIndices: '3'
Search space:
{
  "start_lr": {"_type": "uniform", "_value": [0.0001, 0.1]},
  "batch_size": {"_type": "choice", "_value": [8, 12, 16, 32]}
}
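For readers unfamiliar with the search space format, here is a minimal sketch of what the two `_type` entries mean: `uniform` draws a float between the two bounds, and `choice` picks one of the listed values. The helper `sample_search_space` is hypothetical and uses plain random sampling; it is not NNI's code and not what the TPE tuner in the config does.

```python
import random

# Search space copied from the report above.
SEARCH_SPACE = {
    "start_lr": {"_type": "uniform", "_value": [0.0001, 0.1]},
    "batch_size": {"_type": "choice", "_value": [8, 12, 16, 32]},
}


def sample_search_space(space, rng):
    """Hypothetical helper: draw one configuration by plain random sampling."""
    params = {}
    for name, spec in space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "uniform":
            low, high = value
            params[name] = rng.uniform(low, high)  # float in [low, high]
        elif kind == "choice":
            params[name] = rng.choice(value)       # one of the listed options
        else:
            raise ValueError(f"unsupported _type: {kind}")
    return params


if __name__ == "__main__":
    # A seeded generator keeps the draw reproducible between runs.
    print(sample_search_space(SEARCH_SPACE, random.Random(0)))
```

The result has the same shape as the `parameters` object in the `TR` command logged in nnimanager.log (a `start_lr` float and a `batch_size` integer).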
Log message:
nnimanager.log:
[2021-05-21 17:11:26] INFO [ 'Datastore initialization done' ]
[2021-05-21 17:11:26] INFO [ 'RestServer start' ]
[2021-05-21 17:11:27] WARNING [ 'Tensorboard may not installed, if you want to use tensorboard, please check if tensorboard installed.' ]
[2021-05-21 17:11:27] INFO [ 'RestServer base port is 2222' ]
[2021-05-21 17:11:27] INFO [ 'Rest server listening on: http://0.0.0.0:2222' ]
[2021-05-21 17:11:30] INFO [ 'Starting experiment: 5TQIR042' ]
[2021-05-21 17:11:30] INFO [ 'Setup training service...' ]
[2021-05-21 17:11:31] INFO [ 'Construct local machine training service.' ]
[2021-05-21 17:11:31] INFO [ 'Setup tuner...' ]
[2021-05-21 17:11:31] INFO [ 'Change NNIManager status from: INITIALIZED to: RUNNING' ]
[2021-05-21 17:11:31] INFO [ 'Add event listeners' ]
[2021-05-21 17:11:31] INFO [ 'Run local machine training service.' ]
[2021-05-21 17:12:01] INFO [ 'NNIManager received command from dispatcher: ID, ' ]
[2021-05-21 17:12:01] INFO [ 'NNIManager received command from dispatcher: TR, {"parameter_id": 0, "parameter_source": "algorithm", "parameters": {"start_lr": 0.006036625924380562, "batch_size": 16}, "parameter_index": 0}' ]
[2021-05-21 17:12:02] INFO [ 'submitTrialJob: form: {"sequenceId":0,"hyperParameters":{"value":"{\"parameter_id\": 0, \"parameter_source\": \"algorithm\", \"parameters\": {\"start_lr\": 0.006036625924380562, \"batch_size\": 16}, \"parameter_index\": 0}","index":0}}' ]
[2021-05-21 17:12:33] INFO [ 'Trial job vdk4F status changed from WAITING to FAILED' ]
dispatcher.log:
[2021-05-21 17:12:01] INFO (nni.runtime.msg_dispatcher_base/MainThread) Dispatcher started
[2021-05-21 17:12:01] INFO (hyperopt.tpe/Thread-1) tpe_transform took 0.001001 seconds
[2021-05-21 17:12:01] INFO (hyperopt.tpe/Thread-1) TPE using 0 trials
[2021-05-21 17:12:33] INFO (hyperopt.tpe/Thread-1) tpe_transform took 0.001993 seconds
[2021-05-21 17:12:33] INFO (hyperopt.tpe/Thread-1) TPE using 0 trials
[2021-05-21 17:13:06] INFO (hyperopt.tpe/Thread-1) tpe_transform took 0.002001 seconds
[2021-05-21 17:13:06] INFO (hyperopt.tpe/Thread-1) TPE using 0 trials
[2021-05-21 17:13:36] INFO (hyperopt.tpe/Thread-1) tpe_transform took 0.002002 seconds
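The key entry in nnimanager.log above is the trial going straight from WAITING to FAILED, while dispatcher.log keeps reporting "TPE using 0 trials". When triaging longer logs, a small script can pull out just the status transitions. This is a rough sketch that assumes the line layout seen in this report; `trial_status_changes` is a hypothetical helper, not part of NNI.

```python
import re

# Matches entries shaped like the ones in this report, e.g.
# [2021-05-21 17:12:33] INFO [ 'Trial job vdk4F status changed from WAITING to FAILED' ]
STATUS_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] \w+ "
    r"\[ 'Trial job (\w+) status changed from (\w+) to (\w+)' \]"
)


def trial_status_changes(log_text):
    """Return (timestamp, trial_id, old_status, new_status) tuples found in log_text."""
    return [match.groups() for match in STATUS_RE.finditer(log_text)]


# Two entries copied from the nnimanager.log excerpt in this issue.
SAMPLE = (
    "[2021-05-21 17:11:31] INFO [ 'Run local machine training service.' ]\n"
    "[2021-05-21 17:12:33] INFO [ 'Trial job vdk4F status changed from WAITING to FAILED' ]\n"
)

if __name__ == "__main__":
    for timestamp, trial_id, old, new in trial_status_changes(SAMPLE):
        print(f"{timestamp}  {trial_id}: {old} -> {new}")
```

Because the pattern is applied with `finditer` over the whole text, it works whether the entries are on separate lines or run together on one line.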
nnictl stdout and stderr:
File "", line 1
'import
^
SyntaxError: EOL while scanning string literal
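The traceback shows Python being handed a program whose entire text is `'import`, that is, a string literal that is opened but never closed. One plausible reading (an assumption, not confirmed in this thread) is that a quoted one-line command was split at its first space before it reached the interpreter. The sketch below only demonstrates that this truncated text on its own produces a SyntaxError on line 1; `syntax_error_for` is a hypothetical helper for illustration.

```python
def syntax_error_for(source):
    """Hypothetical helper: return the SyntaxError raised when compiling source, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err
    return None


# The program text exactly as it appears in the traceback: an unclosed string literal.
truncated = "'import"

if __name__ == "__main__":
    err = syntax_error_for(truncated)
    # The message wording depends on the Python version, so only the type
    # and the line number are printed here.
    print(type(err).__name__, "on line", err.lineno)
    # A complete statement compiles without error.
    print(syntax_error_for("import os"))
```

If the assumption holds, the fix belongs in how the command line is quoted rather than in the trial code itself.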