torchlambda lacks non-source files in distributed package #2

Closed streicherlouw closed 4 years ago

streicherlouw commented 4 years ago


I have been trying hard to follow the torchlambda example, but have simply come up against a hard stop that I cannot bridge. If you can offer any help, it would be much appreciated.

My setup is as follows: I am working in a conda environment called fastai2, with torch 1.5.0 (full pip list below) and docker version 19.03.8.

I installed torchlambda with "pip3 install torchlambda". (I have also tried "pip install --user torchlambda", which fails in exactly the same way later on.)

(fastai2) streicher@MLPC:~$ pip3 install torchlambda
Collecting torchlambda
  Using cached torchlambda-1590904228-py3-none-any.whl (30 kB)
Requirement already satisfied: Cerberus>=1.3.2 in ./.local/lib/python3.7/site-packages (from torchlambda) (1.3.2)
Requirement already satisfied: PyYAML>=5.3 in /home/linuxbrew/.linuxbrew/lib/python3.7/site-packages (from torchlambda) (5.3.1)
Requirement already satisfied: setuptools in /home/linuxbrew/.linuxbrew/lib/python3.7/site-packages (from Cerberus>=1.3.2->torchlambda) (46.0.0)
Installing collected packages: torchlambda
Successfully installed torchlambda-1590904228
(fastai2) streicher@MLPC:~$ 

I then created the model as described on github:

(fastai2) streicher@MLPC:~$ python
Python 3.7.7 (default, Mar 26 2020, 15:48:22) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import torchvision
>>> model = torchvision.models.resnet18()
>>> torch.jit.script(model).save("model.ptc")
>>> quit()
(fastai2) streicher@MLPC:~$ ls *.ptc

This is where the problems start:

(fastai2) streicher@MLPC:~$ torchlambda settings
torchlambda:: Started creating YAML settings at /home/streicher/torchlambda.yaml.
torchlambda:: Started copying YAML source code
cp: cannot stat './templates/settings/torchlambda.yaml': No such file or directory
torchlambda:: Error: Failed during copying YAML source code
(fastai2) streicher@MLPC:~$ stat ./templates/settings/torchlambda.yaml
stat: cannot stat './templates/settings/torchlambda.yaml': No such file or directory
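
The error above suggests that a data file is missing from the installed package. A minimal sketch of how one might check this, assuming the file is expected at templates/settings/torchlambda.yaml relative to the package directory (that location is an assumption based on the path in the error; both helper functions are hypothetical and not part of torchlambda):

```python
import importlib.util
from pathlib import Path


def installed_package_dir(name):
    """Return the directory of an installed package, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def missing_package_files(package_dir, expected):
    """Return the expected relative paths that are not regular files under package_dir."""
    root = Path(package_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    pkg = installed_package_dir("torchlambda")
    if pkg is None:
        print("torchlambda is not installed in this environment")
    else:
        # Relative path taken from the error message; its place in the package is an assumption.
        print(missing_package_files(pkg, ["templates/settings/torchlambda.yaml"]))
```

A non-empty list would indicate that the wheel was built without its non-source files.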

So I created the file manually using the example:

(fastai2) streicher@MLPC:~$ vi torchlambda.yaml
(fastai2) streicher@MLPC:~$ cat torchlambda.yaml 
  shape: [1, 3, width, height]
  type: byte
  cast: float
  divide: 255
  means: [0.485, 0.456, 0.406]
  stddevs: [0.229, 0.224, 0.225]
    operations: argmax
    type: int
    name: label
    item: true
(fastai2) streicher@MLPC:~$ 

Here comes the hard problem: I then tried the template step, but it fails in a way that I cannot get around.

(fastai2) streicher@MLPC:~$ torchlambda template --yaml torchlambda.yaml
torchlambda:: Started creating C++ scheme at ./torchlambda.
torchlambda:: Started reading YAML settings.
torchlambda:: Finished reading YAML settings.
torchlambda:: Started validating YAML settings.
torchlambda:: Error during YAML validation:
- means:
  - 'means field''s shape is not broadcastable to provided input shape: [1, 3, ''width'',
  - 'stddevs field''s shape is not broadcastable to provided input shape: [1, 3, ''width'',
(fastai2) streicher@MLPC:~$ 
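
For illustration, here is a rough sketch of a broadcast check that tolerates named dimensions such as width and height. It is not torchlambda's actual validator. It assumes that per-channel values like means are reshaped to [1, C, 1, 1] before comparison and that string entries stand for sizes known only at run time. The function names are hypothetical.

```python
def per_channel_shape(count, ndim, channel_axis=1):
    """Shape of a per-channel vector expanded to ndim dimensions, e.g. [1, 3, 1, 1]."""
    shape = [1] * ndim
    shape[channel_axis] = count
    return shape


def broadcastable(shape_a, shape_b):
    """NumPy-style broadcast check, aligned from the last dimension.

    String entries (named dimensions) are treated as dynamic and always accepted.
    """
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if isinstance(a, str) or isinstance(b, str):
            continue  # size unknown until run time, cannot be rejected statically
        if a != b and a != 1 and b != 1:
            return False
    return True


input_shape = [1, 3, "width", "height"]
means = [0.485, 0.456, 0.406]
print(broadcastable(per_channel_shape(len(means), len(input_shape)), input_shape))
```

Under these assumptions the three-element means and stddevs from the YAML above are compatible with the input shape, so a rejection like the one shown would point to the validation logic rather than the settings.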

Do you have any idea what I am doing wrong? I would really love to give torchlambda a try. I currently use the pytorch 1.1 ARN layer in lambda, and it takes a full 18 seconds to start up from cold...


Additional information:

My docker seems to be ok:

(fastai2) streicher@MLPC:~$ docker pull szymonmaszke/torchlambda:v1.5.0
v1.5.0: Pulling from szymonmaszke/torchlambda
a3f8e652bdc4: Pull complete 
8f876fde9c06: Pull complete 
cd0ee434189a: Pull complete 
Digest: sha256:c657e856e0b3f01cd7a8c3b3f603f94b0f45541ad8851afa056c988aaf59efa5
Status: Downloaded newer image for szymonmaszke/torchlambda:v1.5.0

(fastai2) streicher@MLPC:~/torchlambda$ docker image list
REPOSITORY                 TAG                                   IMAGE ID            CREATED             SIZE
szymonmaszke/torchlambda   v1.5.0                                d2496183b359        20 hours ago        711MB
samcli/lambda              python3.6-355e75366d87dabec462ddf97   ac2f11a96ba6        3 weeks ago         991MB
lambci/lambda              python3.6                             46852491e8e1        3 weeks ago         882MB
ubuntu                     latest                                1d622ef86b13        5 weeks ago         73.9MB
hello-world                latest                                bf756fb1ae65        5 months ago        13.3kB

(fastai2) streicher@MLPC:~$ docker run szymonmaszke/torchlambda
torchlambda:: Building AWS Lambda .zip package.
torchlambda:: Compilation flags: 
torchlambda:: Final build arguments: -DBUILD_SHARED_LIBS=OFF -DAWS_COMPONENTS=
-- The CXX compiler identification is GNU 7.3.1
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Torch: /usr/local/lib/libtorch.a  
-- Found CURL: /usr/lib64/ (found version "7.61.1") 
-- Found AWS SDK for C++, Version: 1.7.344, Install Root:/usr/local, Platform Prefix:, Platform Dependent Libraries: pthread;crypto;ssl;z;curl
-- Configuring done
-- Build files have been written to: /usr/local/build
CMake Error at CMakeLists.txt:34 (add_executable):
  No SOURCES given to target: torchlambda
torchlambda:: App size:
du: cannot access '/usr/local/build/torchlambda': No such file or directory
torchlambda:: Zipped app size:
du: cannot access '/usr/local/build/': No such file or directory
torchlambda:: Deployment finished successfully.(fastai2) streicher@MLPC:~$ 

And the pip installation seems to be ok:

szymonmaszke commented 4 years ago

Hi, thanks for the report, I've confirmed the bugs you are talking about. Will hit you up as soon as the problem is fixed.

szymonmaszke commented 4 years ago

@streicherlouw Fixed. Please reinstall torchlambda; both the settings and template subcommands should work now. Please report whether it works on your side as well.

streicherlouw commented 4 years ago

Thank you for your quick response, I can confirm that the scripts work perfectly now :-)