Closed DanielAtKrypton closed 3 years ago
What happens if you set n_jobs=1 in TuneGridSearchCV?
The same behaviour as a result...
Can you try cv=2?
Sure. I still got the same behavior with n_jobs=1 and cv=2:
best_score, best_params = tsp.tune_grid_search(
    selected_tunable_params,
    flights_dataset,
    n_jobs=1,
    cv=2,
    scoring="accuracy",
    verbose=2,
    use_gpu=False)
My pip list within the virtual environment:
Package Version Location
------------------------------ ----------- -------------------------------------------------------
aiohttp 3.7.3
aiohttp-cors 0.7.0
aioredis 1.3.1
alabaster 0.7.12
argon2-cffi 20.1.0
astroid 2.4.2
async-generator 1.10
async-timeout 3.0.1
atomicwrites 1.4.0
attrs 20.3.0
autopep8 1.5.4
Babel 2.9.0
backcall 0.2.0
beautifulsoup4 4.9.3
bleach 3.2.1
blessings 1.7
bump2version 1.0.1
bumpversion 0.6.0
cachetools 4.1.1
certifi 2020.11.8
cffi 1.14.4
chardet 3.0.4
click 7.1.2
colorama 0.4.4
colorful 0.5.4
commonmark 0.9.1
coverage 5.3
cycler 0.10.0
dataclasses 0.6
decorator 4.4.2
defusedxml 0.6.0
docutils 0.16
entrypoints 0.3
filelock 3.0.12
flights-time-series-dataset 1.0.0
future 0.18.2
google 3.0.0
google-api-core 1.23.0
google-auth 1.23.0
googleapis-common-protos 1.52.0
gpustat 0.6.0
grpcio 1.33.2
hiredis 1.1.0
idna 2.10
imagesize 1.2.0
importlib-metadata 3.1.0
iniconfig 1.1.1
ipykernel 5.3.4
ipython 7.19.0
ipython-genutils 0.2.0
isort 5.6.4
jedi 0.17.2
Jinja2 2.11.2
joblib 0.17.0
json5 0.9.5
jsonschema 3.2.0
jupyter-client 6.1.7
jupyter-core 4.7.0
jupyterlab 2.2.9
jupyterlab-pygments 0.1.2
jupyterlab-server 1.2.0
keyring 21.5.0
kiwisolver 1.3.1
lazy-object-proxy 1.4.3
lxml 4.6.2
MarkupSafe 1.1.1
matplotlib 3.3.3
mccabe 0.6.1
mistune 0.8.4
msgpack 1.0.0
multidict 5.0.2
nbclient 0.5.1
nbconvert 6.0.7
nbformat 5.0.8
nbsphinx 0.8.0
nest-asyncio 1.4.3
notebook 6.1.5
numpy 1.19.0
nvidia-ml-py3 7.352.0
opencensus 0.7.11
opencensus-context 0.1.2
oze-dataset 1.0.0
packaging 20.7
pandas 1.1.4
pandocfilters 1.4.3
parameterized 0.7.4
parso 0.7.1
pickleshare 0.7.5
Pillow 8.0.1
pip 20.2.4
pip-tools 5.4.0
pkginfo 1.6.1
pluggy 0.13.1
prometheus-client 0.9.0
prompt-toolkit 3.0.8
protobuf 3.14.0
psutil 5.7.3
py 1.9.0
py-spy 0.3.3
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycodestyle 2.6.0
pycparser 2.20
Pygments 2.7.2
pylint 2.6.0
pyparsing 2.4.7
pyrsistent 0.17.3
pytest 6.1.2
pytest-cov 2.10.1
python-dateutil 2.8.1
python-dotenv 0.15.0
pytz 2020.4
pywin32 300
pywin32-ctypes 0.2.0
pywinpty 0.5.7
PyYAML 5.3.1
pyzmq 20.0.0
ray 1.0.1.post1
readme-renderer 28.0
recommonmark 0.6.0
redis 3.4.1
requests 2.25.0
requests-toolbelt 0.9.1
rfc3986 1.4.0
rsa 4.6
rstcheck 3.3.1
scikit-learn 0.23.2
scipy 1.5.4
seaborn 0.11.0
Send2Trash 1.5.0
setuptools 41.2.0
six 1.15.0
sklearn 0.0
skorch 0.9.0
snowballstemmer 2.0.0
soupsieve 2.0.1
Sphinx 3.3.1
sphinx-autodoc-typehints 1.11.1
sphinx-rtd-theme 0.5.0
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 1.0.3
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.4
sphinxcontrib-svg2pdfconverter 1.1.0
tabulate 0.8.7
tensorboardX 2.1
terminado 0.9.1
testpath 0.4.4
threadpoolctl 2.1.0
time-series-dataset 0.0.2
time-series-models 0.1.1
time-series-predictor 2.2.0 c:\users\daniel\workspaces\python\time_series_predictor
toml 0.10.2
torch 1.7.0+cu110
tornado 6.1
tqdm 4.54.0
traitlets 5.0.5
tune-sklearn 0.1.0
twine 3.2.0
typed-ast 1.4.1
typing-extensions 3.7.4.3
urllib3 1.26.2
wcwidth 0.2.5
webencodings 0.5.1
wheel 0.35.1
wrapt 1.12.1
yarl 1.6.3
zipp 3.4.0
Can you try updating tune-sklearn to the version on GitHub?
pip install -U git+https://github.com/ray-project/tune-sklearn.git
Also, please make sure that your Ray version is up to date.
After I updated with the command above, it went to version 0.0.8. Now the test crashes with the following info:
Windows fatal exception: stack overflow
Thread 0x00003438 (most recent call first):
File "C:\Python37\lib\threading.py", line 300 in wait
File "C:\Python37\lib\threading.py", line 552 in wait
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd.py", line 232 in _on_run
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_daemon_thread.py", line 46 in run
File "C:\Python37\lib\threading.py", line 926 in _bootstrap_inner
File "C:\Python37\lib\threading.py", line 890 in _bootstrap
Thread 0x00004a5c (most recent call first):
File "C:\Python37\lib\threading.py", line 300 in wait
File "C:\Python37\lib\threading.py", line 552 in wait
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\pydevd.py", line 186 in _on_run
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_daemon_thread.py", line 46 in run
File "C:\Python37\lib\threading.py", line 926 in _bootstrap_inner
File "C:\Python37\lib\threading.py", line 890 in _bootstrap
Thread 0x00005e20 (most recent call first):
File "C:\Python37\lib\threading.py", line 296 in wait
File "C:\Python37\lib\threading.py", line 552 in wait
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_timeout.py", line 43 in _on_run
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_daemon_thread.py", line 46 in run
File "C:\Python37\lib\threading.py", line 926 in _bootstrap_inner
File "C:\Python37\lib\threading.py", line 890 in _bootstrap
Thread 0x000067e4 (most recent call first):
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 210 in _read_line
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 228 in _on_run
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_daemon_thread.py", line 46 in run
File "C:\Python37\lib\threading.py", line 926 in _bootstrap_inner
File "C:\Python37\lib\threading.py", line 890 in _bootstrap
Thread 0x0000337c (most recent call first):
File "C:\Python37\lib\threading.py", line 300 in wait
File "C:\Python37\lib\queue.py", line 179 in get
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_comm.py", line 339 in _on_run
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_daemon_thread.py", line 46 in run
File "C:\Python37\lib\threading.py", line 926 in _bootstrap_inner
File "C:\Python37\lib\threading.py", line 890 in _bootstrap
Current thread 0x00005d54 (most recent call first):
File "c:\Users\Daniel\.vscode\extensions\ms-python.python-2020.11.371526539\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_trace_dispatch_regular.py", line 364 in __call__
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 335 in _safe_repr
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 172 in format
File "C:\Python37\lib\pprint.py", line 393 in _repr
File "C:\Python37\lib\pprint.py", line 161 in _format
File "C:\Python37\lib\pprint.py", line 144 in pformat
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\time_series_predictor\sklearn\base.py", line 281 in __repr__
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 437 in _safe_repr
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 172 in format
File "C:\Python37\lib\pprint.py", line 393 in _repr
File "C:\Python37\lib\pprint.py", line 161 in _format
File "C:\Python37\lib\pprint.py", line 144 in pformat
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\time_series_predictor\sklearn\base.py", line 281 in __repr__
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 437 in _safe_repr
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 172 in format
File "C:\Python37\lib\pprint.py", line 393 in _repr
File "C:\Python37\lib\pprint.py", line 161 in _format
File "C:\Python37\lib\pprint.py", line 144 in pformat
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\time_series_predictor\sklearn\base.py", line 281 in __repr__
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 437 in _safe_repr
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 172 in format
File "C:\Python37\lib\pprint.py", line 393 in _repr
File "C:\Python37\lib\pprint.py", line 161 in _format
File "C:\Python37\lib\pprint.py", line 144 in pformat
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\time_series_predictor\sklearn\base.py", line 281 in __repr__
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 437 in _safe_repr
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 172 in format
File "C:\Python37\lib\pprint.py", line 393 in _repr
File "C:\Python37\lib\pprint.py", line 161 in _format
File "C:\Python37\lib\pprint.py", line 144 in pformat
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\time_series_predictor\sklearn\base.py", line 281 in __repr__
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 437 in _safe_repr
File "c:\Users\Daniel\Workspaces\Python\time_series_predictor\.env\lib\site-packages\sklearn\utils\_pprint.py", line 172 in format
I am using ray version 1.0.1.post1.
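The repeating frames in the trace above (__repr__, then pformat, then _safe_repr, then __repr__ again) are consistent with a __repr__ that re-enters itself through pprint. The following is a minimal sketch of that pattern, not the project's actual code; the class name SelfPrinting and the helper repr_recurses are hypothetical and exist only for illustration. In a plain interpreter the loop usually ends in a RecursionError rather than a fatal stack overflow.

```python
import pprint


class SelfPrinting:
    """Hypothetical class whose __repr__ delegates to pprint (illustration only)."""

    def __repr__(self):
        # pformat() has no special handling for this type, so it falls back
        # to repr(self), which lands in this method again.
        return pprint.pformat(self)


def repr_recurses(obj):
    """Return True if repr(obj) exhausts the recursion limit."""
    try:
        repr(obj)
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print(repr_recurses(SelfPrinting()))
    print(repr_recurses([1, 2, 3]))
```

If the real __repr__ follows this shape, returning a plain string for objects the pretty printer does not recognise would break the cycle; whether that applies here is an assumption.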
That's quite odd. @richardliaw, @inventormc any ideas?
You can revert to the previous version by uninstalling tune-sklearn and installing it normally again.
I reinstalled tune-sklearn, which gave me version tune-sklearn-0.1.0. The behaviour is now the same as the one I previously reported here.
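Since the thread switches between several versions of tune-sklearn and Ray, it can help to confirm what the active interpreter actually sees after each reinstall. This is a small sketch using the standard library; the helper name installed_versions is hypothetical, and the fallback import assumes the importlib-metadata backport is available on Python 3.7 (it appears in the pip list above).

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7: use the backport package
    import importlib_metadata as metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(
        ["ray", "tune-sklearn", "scikit-learn"]
    ).items():
        print(f"{name}: {version or 'not installed'}")
```

Running it inside the same virtual environment as the tests avoids reading versions from a different Python installation.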
Hey @DanielAtKrypton, what are the commands to reproduce your stack?
Also, can you try running this outside vscode (i.e., just using a terminal)?
Sure, I just started the test:
I will leave it processing for now...
Can you try instead with pytest -s -v?
Sure. Here is the output:
OK got it; can you now try, in a python terminal:
import ray
ray.init()

@ray.remote
def hello_world():
    print("hi")
    return "hi"

print(ray.get(hello_world.remote()))
There you go:
Awesome, so now we know that the fundamental problem seems to be in Ray core.
Can you try ray stop and run it again?
Still running. I will update as soon as I have other output from the terminal...
OK got it, so ray.init() is just hanging forever?
Yes, unfortunately it is.
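When a call hangs forever, one way to test it without tying up the terminal is to run it in a child process with a timeout. The sketch below uses only the standard library; the helper name run_with_timeout and the 30-second limit are assumptions for illustration. Output is deliberately not captured, so the child cannot block on a pipe held open by background processes.

```python
import subprocess
import sys


def run_with_timeout(snippet, timeout_s):
    """Run a Python snippet in a child process and report whether it finished.

    Returns a dict with 'finished' (bool) and 'returncode' (int or None).
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return {"finished": False, "returncode": None}
    return {"finished": True, "returncode": result.returncode}


if __name__ == "__main__":
    # Harmless placeholder; to probe the hang, the snippet could instead be
    # "import ray; ray.init(); print('up')".
    print(run_with_timeout("print('up')", timeout_s=30))
```

Killing the child does not necessarily stop background processes it started, so running ray stop afterwards may still be needed.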
OK. Can you try:
pip install -U [latest wheel link for windows]
as found here:
https://docs.ray.io/en/master/installation.html#daily-releases-nightlies
and if that doesn't work, try downgrading with pip install ray==1.0.0?
I installed the latest wheel for windows and python 3.7.
Now it is behaving like this:
I tried to open the dashboard in my browser but the browser was unable to connect there.
And ray status reports:
Try ray stop a couple times, then try the hello world again?
I tried a couple times. It starts and hangs forever...
Unfortunately this is a ray issue, and I'll close this and continue discussion on the ray side.
Source of this problem is being considered here.
Problem description
I am setting up a test framework for ray tune, but unfortunately I got stuck when trying to tune the learning rate hyperparameter of a pipelined network. The test code can be found here.
I noticed when debugging the test that the tuning spawns many threads as can be seen at the call stack to the left and below:
Although I have already installed gpustat by running pip install gpustat, there is still a message warning me to install it. These threads stay open for hours and there is no other feedback in the terminal.
Is there anything I am missing to make the learning rate hyperparameter tuning work smoothly here?
Environment information
VS Code
Python dependencies:
requirements lock
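Regarding the gpustat warning in the problem description: a package can show up in pip list and still fail to import in the interpreter that runs the tests. Whether that is what happens here is an assumption. The sketch below only checks whether a module imports and which interpreter did the check; the helper name check_import is hypothetical.

```python
import importlib
import sys


def check_import(module_name):
    """Try to import a module; return (ok, detail).

    On success, detail is the path of the interpreter that ran the import.
    On failure, detail is the exception type and message.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # import can fail with errors other than ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, sys.executable


if __name__ == "__main__":
    print(check_import("gpustat"))
```

A failure message other than ModuleNotFoundError would suggest the package is present but one of its own imports does not work on this platform.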