Hydrospheredata / hydro-serving

MLOps Platform
http://docs.hydrosphere.io
Apache License 2.0

Documentation updates #216

Closed casualdan closed 5 years ago

casualdan commented 5 years ago

I followed your getting started guide and ran into several problems that I think should be addressed in the documentation:

ERROR:root:Exception calling application: No module named 'numpy'
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/grpc/_server.py", line 376, in _call_behavior
    return behavior(argument, context), True
  File "/app/src/PythonRuntimeService.py", line 36, in Predict
    module = importlib.import_module("func_main")
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/model/files/src/func_main.py", line 2, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'


That is very strange, because the "requirements.txt" listed in "serving.yaml" includes `numpy==1.13.3`.

tidylobster commented 5 years ago

Hi, I've already prepared documentation updates, I just haven't released them yet. I will polish a few details and release the updates soon. If you want, you can check the pull request yourself: https://github.com/Hydrospheredata/hydro-serving/pull/212

casualdan commented 5 years ago

I've just checked this pull request, and it resolves the first two problems. But the third one is still there, with the same logs as above.

tidylobster commented 5 years ago

@kuzmind Can you provide your serving.yaml?

casualdan commented 5 years ago

Sure, last time I used the one from the updated "getting-started.md" file:

kind: Model
name: linear_regression
model-type: python:3.6
payload:
  - "src/"
  - "requirements.txt"
  - "model.h5"
contract:
  infer:
    inputs:
      x:
        shape: [-1, 2]
        type: double
        profile: numerical
    outputs:
      y:
        shape: [-1]
        type: double
        profile: numerical

tidylobster commented 5 years ago

@kuzmind Everything seems fine. I've just reproduced all the steps myself and it worked. I've also updated the docs. Can you confirm that you've followed every step correctly?

casualdan commented 5 years ago

I am running hydro-serving behind a proxy server, and I think the problem is that hydrosphere/serving-runtime-python:3.6-latest can't install the modules from "requirements.txt" using pip. I had similar problems with docker before, but now such commands work fine. Do you have any ideas how to fix it? I think writing my own runtime with additional ENV instructions in the Dockerfile should work, but it isn't very convenient.

Here are some logs produced during the hs upload command:

manager      | [2018-11-01 16:45:08.672][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12: Step 6/7 : RUN ls /model/files/requirements.txt && pip install -r /model/files/requirements.txt --target /model/lib || echo "no requirements"
manager      | [2018-11-01 16:45:08.672][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12: 
manager      | 
manager      | [2018-11-01 16:45:08.792][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12:  ---> Running in ecedd7040021
manager      | 
manager      | [2018-11-01 16:45:09.164][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12: /model/files/requirements.txt
manager      | 
manager      | [2018-11-01 16:45:10.088][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12: Collecting keras==2.2.0 (from -r /model/files/requirements.txt (line 1))
manager      | 
manager      | [2018-11-01 16:45:10.444][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12:   Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/keras/
manager      | 
manager      | [2018-11-01 16:45:11.011][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12:   Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/keras/
manager      | 
manager      | [2018-11-01 16:45:12.080][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12:   Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/keras/
manager      | 
manager      | [2018-11-01 16:45:14.148][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12:   Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/keras/
manager      | 
manager      | [2018-11-01 16:45:18.213][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12:   Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/keras/
manager      | 
manager      | [2018-11-01 16:45:18.280][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12:   Could not fetch URL https://pypi.org/simple/keras/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/keras/ (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)) - skipping
manager      | 
manager      | [2018-11-01 16:45:18.282][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12:   Could not find a version that satisfies the requirement keras==2.2.0 (from -r /model/files/requirements.txt (line 1)) (from versions: )
manager      | 
manager      | [2018-11-01 16:45:18.284][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12: No matching distribution found for keras==2.2.0 (from -r /model/files/requirements.txt (line 1))
manager      | 
manager      | [2018-11-01 16:45:18.348][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12: Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)) - skipping
manager      | 
manager      | [2018-11-01 16:45:18.383][INFO][pool-24-thread-1] i.h.s.m.s.m.ModelBuildUpdater.log.16 Model build 12: no requirements

KineticCookie commented 5 years ago

Hello @kuzmind. It seems like your proxy setup breaks SSL connections to PyPI. Can you share additional details about your deployment? In addition, please check whether the proxy allows requests to pypi.org and files.pythonhosted.org.

casualdan commented 5 years ago

Hello @KineticCookie. I am now following this tutorial exactly. I get HTTP/1.1 200 OK from curl -I for both https://pypi.org and https://files.pythonhosted.org. As I understand it, the "runtime" is a docker-in-docker container and it can't resolve the proxy correctly, whereas a standard docker build looks up additional configs in the ~/.docker/config.json file.

KineticCookie commented 5 years ago

So, let me clarify what's happening. You uploaded a model and the manager tries to build it. Model build is basically a docker run command which is called within the manager container. There are no runtimes involved at this point.

Going deeper, the Dockerfile for the model build contains the following line:

RUN ls /model/files/requirements.txt && pip install -r /model/files/requirements.txt --target /model/lib || echo "no requirements"

It runs pip install if there is a requirements.txt file.
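One consequence of the `A && B || C` shape of that RUN line is that the fallback also fires when pip itself fails, not only when the file is missing. That would match the build log above, which ends in "no requirements" after the SSL errors, and the later `No module named 'numpy'`. Here is a minimal Python sketch of that shell behavior. It assumes a POSIX `sh` is available and uses `true` and `false` as hypothetical stand-ins for a successful `ls` and a failing `pip install`; the `if` variant at the end is only an illustration, not something taken from the hydro-serving Dockerfile.

```python
import subprocess


def run_shell(command: str) -> subprocess.CompletedProcess:
    """Run a command line through sh and capture its output."""
    return subprocess.run(["sh", "-c", command], capture_output=True, text=True)


# `true` stands in for a successful `ls requirements.txt`,
# `false` stands in for a `pip install` that fails (e.g. behind a proxy).
masked = run_shell('true && false || echo "no requirements"')

# The failing middle command triggers the fallback, and the overall
# exit status is that of `echo`, so the build step looks successful.
print(masked.returncode, masked.stdout.strip())

# Hypothetical stricter form: fall back only when the file check fails,
# and let a failing install propagate its non-zero exit status.
strict = run_shell('if true; then false; else echo "no requirements"; fi')
print(strict.returncode, repr(strict.stdout))
```

With the stricter form, the build step would stop on a failed install instead of producing an image without its dependencies.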

So, the issue is somewhere in the (pypi ----- proxy ----- manager) chain.

I need more information to identify the problem.

Could you please:

  1. Send the additional configs from ~/.docker/config.json that you mentioned
  2. Clarify the similar problems you had with docker before
  3. Clarify what proxy you have and its configs
  4. Share your OS and docker versions

Thanks

casualdan commented 5 years ago

Model build is basically a docker run command which is called within the manager container. There are no runtimes involved at this point.

I meant that the error occurs at the moment the runtime container is built. Maybe the problem is that the manager container doesn't provide proxy details for this build.

  1. The additional configs look like this (I substituted the real names):
    "proxies": {
        "default": {
            "httpProxy": "http://my.proxy.server.ru:port",
            "httpsProxy": "http://my.proxy.server.ru:port"
        }
    }

    This solution is described on the official Docker website.

  2. For example, simple Dockerfile:
    FROM python:3.6-slim
    RUN pip install flask

    didn't work before adding "proxies", failing with an error similar to the one I now get during hs upload:

    Step 2/2 : RUN pip install flask
    ---> Running in 3ada18e6d16b
    Collecting flask
    Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/flask/
    Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/flask/
    Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/flask/
    Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/flask/
    Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)': /simple/flask/
    Could not fetch URL https://pypi.org/simple/flask/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/flask/ (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)) - skipping
    Could not find a version that satisfies the requirement flask (from versions: )
    No matching distribution found for flask
    Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:847)'),)) - skipping
    The command '/bin/sh -c pip install flask' returned a non-zero code: 1
  3. What do you mean?
  4. Docker and OS versions:
    • Docker version 18.06.1-ce, build e68fc7a
    • Ubuntu 16.04.5 LTS

KineticCookie commented 5 years ago

Apparently our docker-client can't set up proxy settings in build containers yet. I'll create a new issue about this problem and move your info there.
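
For reference, the "proxies" section quoted earlier in the thread could be translated into proxy build arguments roughly as follows. This is only a sketch under assumptions, not the manager's actual code: `proxy_build_args` and `as_cli_flags` are hypothetical helpers, and the proxy URL in the example is a made-up placeholder. The key-to-variable mapping follows the Docker client documentation, and `--build-arg` is the standard `docker build` flag.

```python
import json

# Docker client config keys and the proxy variables they correspond to.
PROXY_KEYS = {
    "httpProxy": "HTTP_PROXY",
    "httpsProxy": "HTTPS_PROXY",
    "ftpProxy": "FTP_PROXY",
    "noProxy": "NO_PROXY",
}


def proxy_build_args(config_text: str, profile: str = "default") -> dict:
    """Extract proxy settings from the text of a ~/.docker/config.json file."""
    config = json.loads(config_text)
    proxies = config.get("proxies", {}).get(profile, {})
    args = {}
    for key, env_name in PROXY_KEYS.items():
        value = proxies.get(key)
        if value:
            # Set both spellings, since tools differ in which one they read.
            args[env_name] = value
            args[env_name.lower()] = value
    return args


def as_cli_flags(args: dict) -> list:
    """Render the build args as `docker build --build-arg` flags."""
    flags = []
    for name, value in sorted(args.items()):
        flags += ["--build-arg", f"{name}={value}"]
    return flags


# Hypothetical config; the host and port are placeholders.
example_config = """
{
  "proxies": {
    "default": {
      "httpProxy": "http://proxy.example:3128",
      "httpsProxy": "http://proxy.example:3128"
    }
  }
}
"""

build_args = proxy_build_args(example_config)
print(as_cli_flags(build_args))
```

Whatever client starts the model build would then need to forward these values to the build, which is the part that appears to be missing here.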