Closed. Lumik7 closed this issue 6 years ago.
@MoBran encountered the same issue
Update: I tried running it with a single thread, but I get the same error as before. My versions of the required packages are the following:
python 3.5.4
Cython 0.27.3
fastdtw 0.3.2
numpy 1.13.3 (via pip)
numpy 1.13.1 (via conda)
psutil 5.4.1
I'm running Windows 10. I tried installing it with Cython and without it.
Update: I just saw that I have two numpy versions installed in my environment: numpy 1.13.1 was installed via conda and is a dependency for pandas, sklearn, etc., while numpy 1.13.3 was installed via pip when installing fastdtw.
Update 2: I used only numpy 1.13.1 and still got the same error.
By putting some print statements into the numpy.linalg.norm function, I found that the value handed to it is a scalar (0-d) ndarray and therefore has no axis to reduce over, which raises ValueError("Improper number of dimensions to norm.") in numpy.linalg.norm.
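For reference, this behaviour can be reproduced directly. A minimal sketch; the 0-d array below merely stands in for whatever scalar value reaches the distance function, and is not taken from the project's code:

```python
import numpy as np

# A 0-d ("scalar") ndarray, standing in for the value that reaches norm:
x = np.asarray(5.0) - np.asarray(3.0)
print(x.ndim)  # 0

# With the default ord, norm ravels its input and succeeds:
print(np.linalg.norm(x))  # 2.0

# With an explicit ord, the 0-d input has no axis to reduce over,
# so numpy raises ValueError("Improper number of dimensions to norm."):
try:
    np.linalg.norm(x, ord=2)
except ValueError as e:
    print(e)
```

So the error only appears when an explicit ord is passed, which would explain why it depends on how the distance function is invoked.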
Tested against the latest version on master and with all currently available data; I still can't reproduce it. Next step (will do today): compare the versions of the installed modules.
I created a fresh conda environment with python 3.5.4 and installed all requirements again, but it still fails. These are all packages:
alabaster==0.7.10
asn1crypto==0.22.0
awscli==1.14.18
Babel==2.5.1
backports.functools-lru-cache==1.4
botocore==1.8.22
certifi==2017.11.5
cffi==1.11.2
chardet==3.0.4
click==6.7
colorama==0.3.9
coverage==4.4.2
cryptography==2.1.4
cycler==0.10.0
Cython==0.27.3
decorator==4.1.2
docutils==0.14
fastdtw==0.3.2
flake8==3.5.0
future==0.16.0
gmplot==1.2.0
idna==2.6
imagesize==0.7.1
Jinja2==2.10
jmespath==0.9.3
lxml==4.1.1
MarkupSafe==1.0
matplotlib==2.1.1
mccabe==0.6.1
numpy==1.13.1
pandas==0.22.0
patsy==0.4.1
plotly==2.2.3
psutil==5.4.2
pyasn1==0.4.2
pycodestyle==2.3.1
pycparser==2.18
pyflakes==1.6.0
Pygments==2.2.0
pyOpenSSL==17.2.0
pyparsing==2.2.0
PySocks==1.6.7
python-dateutil==2.6.1
python-dotenv==0.7.1
pyts==0.5
pytz==2017.3
PyYAML==3.12
requests==2.18.4
rsa==3.4.2
s3transfer==0.1.12
scikit-learn==0.19.0
scipy==0.19.1
seaborn==0.8.1
six==1.11.0
sklearn==0.0
snowballstemmer==1.2.1
sobol-seq==0.1.2
Sphinx==1.6.5
sphinxcontrib-websupport==1.0.1
statsmodels==0.8.0
tornado==4.5.2
urllib3==1.22
win-inet-pton==1.0.1
wincertstore==0.2
If the bug is caused by the requirements, we may have to use distutils to make sure we have the same setup, or use Docker.
@Lumik7 @MoBran I ran the latest version on master within a virtual environment configured with exactly those dependencies (pip install -r ...). Still couldn't reproduce the bug. Could you try running it with this configuration on your machines?
alabaster==0.7.10
awscli==1.14.2
Babel==2.5.1
bayesian-optimization==0.6.0
bleach==1.5.0
botocore==1.8.6
certifi==2017.11.5
chardet==3.0.4
click==6.7
colorama==0.3.7
colorlover==0.2.1
coranking==0.1.1
coverage==4.4.2
cycler==0.10.0
Cython==0.27.3
decorator==4.1.2
docutils==0.14
entrypoints==0.2.3
enum34==1.1.6
fastdtw==0.3.2
flake8==3.5.0
future==0.16.0
gmplot==1.2.0
hdbscan==0.8.11
html5lib==0.9999999
idna==2.6
imagesize==0.7.1
ipykernel==4.6.1
ipython==6.2.1
ipython-genutils==0.2.0
ipywidgets==7.0.3
jedi==0.11.0
Jinja2==2.9.6
jmespath==0.9.3
jsonschema==2.6.0
jupyter-client==5.1.0
jupyter-console==5.2.0
jupyter-core==4.3.0
lxml==4.1.1
Markdown==2.6.9
MarkupSafe==1.0
matplotlib==2.1.0
mccabe==0.6.1
mistune==0.7.4
nbconvert==5.3.1
nbformat==4.4.0
nose==1.3.7
notebook==5.2.0
numpy==1.13.3
pandas==0.21.0
pandocfilters==1.4.2
parso==0.1.0
pexpect==4.2.1
pickleshare==0.7.4
pkg-resources==0.0.0
plotly==2.2.3
prompt-toolkit==1.0.15
protobuf==3.5.0.post1
psutil==5.4.2
ptyprocess==0.5.2
py4j==0.10.4
pyasn1==0.4.2
pycodestyle==2.3.1
pyflakes==1.6.0
Pygments==2.2.0
pyparsing==2.2.0
pyspark==2.2.0
python-dateutil==2.6.1
python-dotenv==0.7.1
pyts==0.5
pytz==2017.3
PyYAML==3.12
pyzmq==16.0.2
qtconsole==4.3.1
requests==2.18.4
rsa==3.4.2
s3transfer==0.1.12
scikit-learn==0.19.0
scipy==0.19.1
seaborn==0.8.1
simplegeneric==0.8.1
six==1.11.0
sklearn==0.0
snowballstemmer==1.2.1
sobol-seq==0.1.2
Sphinx==1.6.5
sphinxcontrib-websupport==1.0.1
tensorflow==1.4.0
tensorflow-tensorboard==0.4.0rc3
terminado==0.6
testpath==0.3.1
tornado==4.5.2
traitlets==4.3.2
urllib3==1.22
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.12.2
widgetsnbextension==3.0.6
Alternative in case of persistent problems: pip install dtw
@Lumik7 @MoBran Please check the latest version in master and try to reproduce the error. I added a reshape of the vectors before passing them on to DTW that hopefully mitigates the problem.
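The exact change on master isn't shown in this thread. As a minimal sketch of the idea, assuming the series are 1-D sequences of scalars, reshaping them into (n, 1) column vectors means each element handed to the distance function is a 1-D array, which numpy.linalg.norm accepts even with an explicit ord (the helper name `to_dtw_vectors` is hypothetical):

```python
import numpy as np

def to_dtw_vectors(seq):
    """Reshape a 1-D sequence of scalars to shape (n, 1) so that each
    element passed to the distance function is a 1-D vector rather than
    a 0-d scalar array (which norm rejects when an explicit ord is given)."""
    arr = np.asarray(seq, dtype=float)
    return arr.reshape(-1, 1) if arr.ndim == 1 else arr

a = to_dtw_vectors([1.0, 2.0, 3.0])
b = to_dtw_vectors([2.0, 3.0, 4.0])
print(a.shape)                              # (3, 1)
print(np.linalg.norm(a[0] - b[0], ord=2))   # 1.0
```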
I'm running it right now; how long does it take for all tokens?
On my laptop around 20 to 30 minutes (with multithreading), if I recall correctly. I'll check now.
OK, it just finished, but it went way too fast, as the preprocessing steps alone normally take 10 minutes. I got a very small distance matrix as a result:
I noticed I forgot to change
tokens = [os.environ.get(alias) for alias in ["KEY_RAPHAEL"]] #, "KEY_MORITZ", "KEY_LUKAS"]]
back to
tokens = [os.environ.get(alias) for alias in ["KEY_RAPHAEL", "KEY_MORITZ", "KEY_LUKAS"]]
after testing before I pushed to master. Maybe that's still the case in the version you're running?
EDIT: Sorry, found it. Forgot to remove a code snippet for testing that only kept the first trip. I'll remove that and push again.
Version calculating all trips is on master now. Please check again.
Yeah I changed that before running it.
I also noticed that you put [:1] here:
# 1. Get travel data per token, remove dataframes without annotations.
dfs = Preprocessor.replace_none_values_with_empty_dataframes(
# Drop dataframes w/o annotations.
Preprocessor._remove_dataframes_without_annotation(
# Get travel data per token.
Preprocessor.get_data_per_token(token)
)
)[:1]
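For context, the trailing [:1] slices the list of preprocessed dataframes down to its first element, which is why only a single trip was processed. Illustrated with hypothetical stand-in values:

```python
# Hypothetical stand-in for the list of per-trip dataframes:
dfs = ["trip_a", "trip_b", "trip_c"]

# [:1] keeps only the first element, so downstream steps see one trip:
print(dfs[:1])  # ['trip_a']
```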
I'm now running it with the previous version.
Yeah, the [:1] and the definition of tokens are removed. It should process the entire dataset now. Sorry for not thinking of that.
Alright, but it ran through with the small set, so I guess that's a good sign :)
I guess so. It's weird that my versions of the installed modules accepted the "malformed" vectors and yours didn't. But whatever, it seems to be solved now. I'll close this issue; please reopen it if anything related comes up. I'll work a bit on the documentation next.
@rmitsch I get an exception when running dynamic time warping implemented in #21.
When running
python make_dataset.py --download False --preprocessing True
and setting the code in make_dataset.py accordingly, I get the following error message for threads 1 to 6:
ValueError: Improper number of dimensions to norm.