An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
When I install NNI from the source code on master (using source install.sh), the installed NNI gets version v0.5.2 (it should be something else).
As a result, when I run an experiment on PAI with the docker image "msranni/nni:latest", the experiment fails the version check with the error message "Training service error: NNIManager version is 0.5, TrialKeeper version is 0.6, NNI version does not match!"
Even if I change the docker image to "msranni/nni:v0.5", the experiment still fails, this time with the message "Training service error: Version check failed, didn't get version check response from trialKeeper, please check your NNI version in NNIManager and TrialKeeper!"
Therefore, NNI installed from source cannot be used on PAI.
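To illustrate what I think is happening, here is a minimal sketch. It assumes the check compares only the major.minor part of each version string, which is my guess from the error message and not the actual NNI code. The function names major_minor and versions_match are hypothetical.

```python
import re


def major_minor(version: str) -> str:
    """Reduce a version string such as 'v0.5.2-111-g6cdea19' to 'major.minor'.

    Assumption: the real check looks only at major.minor, as the error
    message ("NNIManager version is 0.5, TrialKeeper version is 0.6") suggests.
    """
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def versions_match(manager_version: str, trial_keeper_version: str) -> bool:
    """Return True when both sides agree on major.minor."""
    return major_minor(manager_version) == major_minor(trial_keeper_version)


# The source install stamps the package with a "git describe"-style string,
# so a build from master is still treated as 0.5 by a check like this one.
print(major_minor("v0.5.2-111-g6cdea19"))
print(versions_match("v0.5.2-111-g6cdea19", "v0.6"))
```

Under this assumption, a build from master can never match an image that reports 0.6, no matter how recent the checkout is.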
I have synced with @SparkSnail: for now we can skip the version check by adding "--debug" when creating the experiment. This is not a reasonable solution, though. Debug mode should be used for debugging, not for skipping the version check (and we haven't mentioned this in our docs, so it may confuse users about what --debug does).
It is very common to modify the source code and reinstall NNI from source, so it is necessary for us to solve this version check conflict.
Suggestions:
(1) Add a --skip_version_check option to skip this conflict (instead of --debug)
(2) Set the version of the master code to a constant that is compatible with all versions
(3) Others?
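Suggestions (1) and (2) could look roughly like the sketch below. This is only an illustration under assumptions, not a patch against NNI: check_version and its skip_version_check parameter are hypothetical names, and using "999.0.0-developing" (the placeholder that install.sh replaces via sed in the log) as the "compatible with everything" constant is just one possible choice.

```python
import re

# Assumption: reuse the placeholder from setup.py / package.json as the
# constant that marks a development build.
DEV_VERSION = "999.0.0-developing"


def _major_minor(version: str) -> str:
    """Reduce 'v0.5.2-111-g6cdea19' or '0.6' to 'major.minor'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def check_version(manager_version: str,
                  trial_keeper_version: str,
                  skip_version_check: bool = False) -> bool:
    """Hypothetical version check covering suggestions (1) and (2)."""
    # Suggestion (1): an explicit opt-out, independent of --debug.
    if skip_version_check:
        return True
    # Suggestion (2): a development build is compatible with any version.
    if DEV_VERSION in (manager_version, trial_keeper_version):
        return True
    # Otherwise fall back to comparing major.minor.
    return _major_minor(manager_version) == _major_minor(trial_keeper_version)
```

With (2), the source install would have to keep the placeholder instead of stamping the package with the git describe output, otherwise the constant never reaches the check.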
BTW, here is the log of the source code installation:
v-shhuan@srgws-06:~/nni_code/nni$ source install.sh
# Extracting Node.js
rm -rf /tmp/v-shhuan/nni-node-linux-x64
mkdir /tmp/v-shhuan/nni-node-linux-x64
tar -xf /tmp/v-shhuan/nni-node-linux-x64.tar.xz -C /tmp/v-shhuan/nni-node-linux-x64 --strip-components 1
mkdir -p /home/v-shhuan/anaconda3/bin
rm -f /home/v-shhuan/anaconda3/bin/node
cp /tmp/v-shhuan/nni-node-linux-x64/bin/node /home/v-shhuan/anaconda3/bin/node
# Extracting Yarn
rm -rf /tmp/v-shhuan/nni-yarn
mkdir /tmp/v-shhuan/nni-yarn
tar -xf /tmp/v-shhuan/nni-yarn.tar.gz -C /tmp/v-shhuan/nni-yarn --strip-components 1
# Building NNI Manager
cd src/nni_manager && PATH=/home/v-shhuan/anaconda3/bin:${PATH} /tmp/v-shhuan/nni-yarn/bin/yarn && PATH=/home/v-shhuan/anaconda3/bin:${PATH} /tmp/v-shhuan/nni-yarn/bin/yarn build
yarn install v1.12.1
[1/5] Validating package.json...
[2/5] Resolving packages...
[3/5] Fetching packages...
[4/5] Linking dependencies...
warning " > express-joi-validator@2.0.0" has unmet peer dependency "joi@6.x.x".
[5/5] Building fresh packages...
success Saved lockfile.
Done in 2.69s.
yarn run v1.12.1
$ tsc
$ cp -rf config ./dist/
Done in 4.56s.
# Building WebUI
cd src/webui && PATH=/home/v-shhuan/anaconda3/bin:${PATH} /tmp/v-shhuan/nni-yarn/bin/yarn && PATH=/home/v-shhuan/anaconda3/bin:${PATH} /tmp/v-shhuan/nni-yarn/bin/yarn build
yarn install v1.12.1
[1/4] Resolving packages...
[2/4] Fetching packages...
info fsevents@1.2.4: The platform "linux" is incompatible with this module.
info "fsevents@1.2.4" is an optional dependency and failed compatibility check. Excluding it from installation.
[3/4] Linking dependencies...
warning "antd > rc-editor-mention@1.1.7" has unmet peer dependency "immutable@^3.7.4".
warning " > echarts-for-react@2.0.14" has unmet peer dependency "prop-types@^15.5.7".
warning " > react-scripts-ts-antd@2.17.0" has incorrect peer dependency "typescript@2.x".
warning "react-scripts-ts-antd > babel-jest@22.4.4" has unmet peer dependency "babel-core@^6.0.0 || ^7.0.0-0".
warning "react-scripts-ts-antd > babel-loader@7.1.5" has unmet peer dependency "babel-core@6".
warning "react-scripts-ts-antd > babel-preset-react-app@3.1.2" has unmet peer dependency "babel-runtime@^6.23.0".
warning "react-scripts-ts-antd > fork-ts-checker-webpack-plugin@0.2.10" has incorrect peer dependency "typescript@^2.1.0".
warning "react-scripts-ts-antd > react-app-rewired@1.6.2" has unmet peer dependency "react-scripts@^1.0.14".
warning "react-scripts-ts-antd > ts-jest@22.0.1" has incorrect peer dependency "typescript@2.x".
[4/4] Building fresh packages...
success Saved lockfile.
Done in 3.11s.
yarn run v1.12.1
$ react-scripts-ts-antd build
Creating an optimized production build...
Starting type checking and linting service...
Using 1 worker with 2048MB memory limit
ts-loader: Using typescript@3.1.3 and /home/v-shhuan/nni_code/nni/src/webui/tsconfig.json
No valid rules have been specified for JavaScript files
Compiled successfully.
File sizes after gzip:
931.63 KB (-309 B) build/static/js/main.e235f4d1.js
41.24 KB build/static/css/main.a73075f2.css
The bundle size is significantly larger than recommended.
Consider reducing it with code splitting: https://goo.gl/9VhYWB
You can also analyze the project dependencies: https://goo.gl/LeUzfb
The project was built assuming it is hosted at the server root.
You can control this with the homepage field in your package.json.
For example, add this to build it for GitHub Pages:
"homepage" : "http://myname.github.io/myapp",
The build folder is ready to be deployed.
You may serve it with a static server:
yarn global add serve
serve -s build
Find out more about deployment here:
http://bit.ly/2vY88Kr
Done in 54.72s.
# Installing Python SDK
sed -ie 's/999.0.0-developing/v0.5.2-111-g6cdea19/' setup.py && python3 -m pip install .
Processing /home/v-shhuan/nni_code/nni
Requirement already satisfied: astor in /home/v-shhuan/.local/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (0.7.1)
Requirement already satisfied: hyperopt in /home/v-shhuan/.local/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (0.1.2)
Requirement already satisfied: json_tricks in /home/v-shhuan/.local/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (3.13.1)
Requirement already satisfied: numpy in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (1.15.4)
Requirement already satisfied: psutil in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (5.4.8)
Requirement already satisfied: pyyaml in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (3.13)
Requirement already satisfied: requests in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (2.21.0)
Requirement already satisfied: scipy in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (1.1.0)
Requirement already satisfied: schema in /home/v-shhuan/.local/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (0.7.0)
Requirement already satisfied: PythonWebHDFS in /home/v-shhuan/.local/lib/python3.7/site-packages (from nni===v0.5.2-111-g6cdea19) (0.2.3)
Requirement already satisfied: pymongo in /home/v-shhuan/.local/lib/python3.7/site-packages (from hyperopt->nni===v0.5.2-111-g6cdea19) (3.7.2)
Requirement already satisfied: networkx in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from hyperopt->nni===v0.5.2-111-g6cdea19) (2.2)
Requirement already satisfied: tqdm in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from hyperopt->nni===v0.5.2-111-g6cdea19) (4.28.1)
Requirement already satisfied: future in /home/v-shhuan/.local/lib/python3.7/site-packages (from hyperopt->nni===v0.5.2-111-g6cdea19) (0.17.1)
Requirement already satisfied: six in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from hyperopt->nni===v0.5.2-111-g6cdea19) (1.12.0)
Requirement already satisfied: certifi>=2017.4.17 in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from requests->nni===v0.5.2-111-g6cdea19) (2018.11.29)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from requests->nni===v0.5.2-111-g6cdea19) (3.0.4)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from requests->nni===v0.5.2-111-g6cdea19) (1.24.1)
Requirement already satisfied: idna<2.9,>=2.5 in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from requests->nni===v0.5.2-111-g6cdea19) (2.8)
Requirement already satisfied: contextlib2==0.5.5 in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from schema->nni===v0.5.2-111-g6cdea19) (0.5.5)
Requirement already satisfied: simplejson in /home/v-shhuan/.local/lib/python3.7/site-packages (from PythonWebHDFS->nni===v0.5.2-111-g6cdea19) (3.16.0)
Requirement already satisfied: decorator>=4.3.0 in /home/v-shhuan/anaconda3/lib/python3.7/site-packages (from networkx->hyperopt->nni===v0.5.2-111-g6cdea19) (4.3.0)
Building wheels for collected packages: nni
Running setup.py bdist_wheel for nni ... done
Stored in directory: /tmp/pip-ephem-wheel-cache-syb9qfsu/wheels/11/21/26/8e0874255a9f897756c89c715bdd54b54ceab968fa217f7f94
Successfully built nni
Installing collected packages: nni
Successfully installed nni-v0.5.2-111-g6cdea19
# Installing NNI Package
rm -rf /home/v-shhuan/anaconda3/nni
cp -r src/nni_manager/dist /home/v-shhuan/anaconda3/nni
cp src/nni_manager/package.json /home/v-shhuan/anaconda3/nni
sed -ie 's/999.0.0-developing/v0.5.2-111-g6cdea19/' /home/v-shhuan/anaconda3/nni/package.json
PATH=/home/v-shhuan/anaconda3/bin:${PATH} /tmp/v-shhuan/nni-yarn/bin/yarn --prod --cwd /home/v-shhuan/anaconda3/nni
yarn install v1.12.1
info No lockfile found.
[1/5] Validating package.json...
[2/5] Resolving packages...
warning express-joi-validator > boom@2.6.1: This version is no longer maintained. Please upgrade to the latest version.
warning express-joi-validator > joi@6.10.1: This version is no longer maintained. Please upgrade to the latest version.
warning express-joi-validator > boom > hoek@2.16.3: This version is no longer maintained. Please upgrade to the latest version.
warning express-joi-validator > joi > hoek@2.16.3: This version is no longer maintained. Please upgrade to the latest version.
warning express-joi-validator > joi > topo@1.1.0: This version is no longer maintained. Please upgrade to the latest version.
warning express-joi-validator > joi > topo > hoek@2.16.3: This version is no longer maintained. Please upgrade to the latest version.
warning rmdir > node.flow > node.extend > object-keys@0.4.0:
[3/5] Fetching packages...
[4/5] Linking dependencies...
warning " > express-joi-validator@2.0.1" has unmet peer dependency "joi@6.x.x".
[5/5] Building fresh packages...
success Saved lockfile.
Done in 15.45s.
cp -r src/webui/build /home/v-shhuan/anaconda3/nni/static
mkdir -p /home/v-shhuan/.bash_completion.d
install -m644 tools/bash-completion /home/v-shhuan/.bash_completion.d/nnictl
# Complete! You may want to add /home/v-shhuan/anaconda3/bin to your PATH environment
# Updating bash configurations
# Complete!