slaypni / fastdtw

A Python implementation of FastDTW
MIT License
785 stars 122 forks

Very slow execution in Docker #64

Open allComputableThings opened 3 months ago

allComputableThings commented 3 months ago

The following script runs about 28x slower in Docker (~28s) than on its host (~1s). I'm pretty sure something is not being compiled correctly. Both the host and the container use the same versions of Python (3.10), numpy, and fastdtw. However, the Docker version somehow makes many more calls to Python's builtin min function. I assume this is a mistake in how fastdtw is being compiled?

#fdtw.py
# Runs in 1s on host
# Runs in 28s in Docker
import numpy as np
from fastdtw import fastdtw
import time

a = np.sin(np.arange(1000))
b = np.cos(np.arange(3000))

t = time.time()
for i in range(100):
    fastdtw(a,b)
print(time.time()-t)

More details and perf logs:

https://stackoverflow.com/questions/78599924/how-to-diagnose-an-28x-slowdown-in-containerized-vs-host-pythonnumpy-execution
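One way to compare the number of calls to the builtin min between the two environments is to count them directly from a cProfile run. This is a minimal sketch, not part of fastdtw: the helper name count_builtin_calls and the toy workload are hypothetical, and it assumes cProfile labels builtins as "<built-in method builtins.NAME>", as CPython 3 does.

```python
import cProfile
import pstats


def count_builtin_calls(func, builtin_name="min"):
    """Run func() under cProfile and return how often a builtin was called.

    Hypothetical helper for illustration; it is not part of fastdtw.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()

    stats = pstats.Stats(profiler)
    target = f"<built-in method builtins.{builtin_name}>"
    total = 0
    # stats.stats maps (filename, lineno, name) -> (cc, nc, tt, ct, callers),
    # where nc is the total number of calls.
    for (_filename, _lineno, name), (_cc, nc, _tt, _ct, _callers) in stats.stats.items():
        if name == target:
            total += nc
    return total


def workload():
    # Toy stand-in for the real benchmark: calls min() exactly 50 times.
    for i in range(50):
        min(i, 1)


print("calls to min:", count_builtin_calls(workload))
```

Replacing workload with a function that calls fastdtw(a, b) and running it on the host and in the container should show whether the pure-Python code path is the one being exercised.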

Dockerfile:

FROM python:3.10-slim
RUN apt update
RUN apt-get -y install nano build-essential software-properties-common libpng-dev
RUN apt-get -y install libopenblas-dev libopenblas64-0-openmp

RUN apt-get -y install gfortran liblapack3 liblapack-dev

#    libatlas-base-dev libatlas-base-dev
#     libblas3  libblas-dev

RUN pip3 install numpy==1.22.4 fastdtw

COPY /server /

RUN python -c 'import numpy ; print(numpy.__version__)'
RUN python -c 'import numpy ; numpy.show_config()'

RUN python -m cProfile  -s cumtime /server/fdtw.py > log.txt
RUN cat log.txt | head -500

RUN exit 1

Any idea why the big slowdown?

allComputableThings commented 3 months ago

It seems that on the host, pip install fastdtw installs:

_fastdtw.cpython-310-x86_64-linux-gnu.so fastdtw.py __init__.py __pycache__

but in the Docker version, just:

fastdtw.py __init__.py __pycache__

and __init__.py silently fails to import _fastdtw.

Why would the container not have _fastdtw.cpython-310-x86_64-linux-gnu.so?
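Because the import failure is silent, a quick check run inside each environment can confirm whether the compiled module is present. This is a minimal sketch: the helper name has_submodule is hypothetical, and it only assumes that the compiled extension is exposed as fastdtw._fastdtw, as the file listing above suggests.

```python
import importlib.util


def has_submodule(package: str, submodule: str) -> bool:
    """Return True if package.submodule can be located.

    Hypothetical helper for illustration. find_spec imports the parent
    package but does not import the submodule itself.
    """
    try:
        return importlib.util.find_spec(f"{package}.{submodule}") is not None
    except ModuleNotFoundError:
        # The parent package itself is not installed.
        return False


# False means either fastdtw is not installed or only the pure-Python
# implementation is available.
print("compiled _fastdtw present:", has_submodule("fastdtw", "_fastdtw"))
```

Running this on the host and in the container (for example as an extra step in the Dockerfile) makes the difference visible without needing to profile.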

allComputableThings commented 3 months ago

Running pip3 install cython before installing fastdtw solves this.

Could cython be added as a project dependency?