tableau / TabPy

Execute Python code on the fly and display results in Tableau visualizations:
https://tableau.github.io/TabPy/
MIT License

Heroku model deployments: os.path.isfile(path_to_pickle) returns true, but client.deploy(...) returns FileNotFoundError #603

Open Tor-Saxberg opened 1 year ago

Tor-Saxberg commented 1 year ago

Environment information:

Describe the issue: I have TabPy running on a Heroku server, but no models are deployed on launch. I tried to deploy a model like this:

heroku run python3
>>> from tabpy.tabpy_tools.client import Client
>>> client = Client('https://xxx.herokuapp.com/')
>>> client.set_credentials('xxx','xxx')
>>> def anova(_arg1, _arg2, *_argN):
...
>>> client.deploy('anova', anova, 'computes anova  p-value between several groups')

But this returns the following error message:

Overwriting existing file "/app/tabpy/tabpy_server/staging/endpoints/anova/1" when saving query object
Error with server response. code=500; text={"message": "error adding endpoint", "info": "FileNotFoundError : [Errno 2] No such file or directory: '/app/tabpy/tabpy_server/staging/endpoints/anova/1'"}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/tabpy/tabpy_tools/client.py", line 244, in deploy
    self._service.add_endpoint(Endpoint(**obj))
  File "/app/tabpy/tabpy_tools/rest_client.py", line 203, in add_endpoint
    return self.service_client.POST("endpoints", endpoint.to_json())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/tabpy/tabpy_tools/rest.py", line 199, in POST
    return self.network_wrapper.POST(self.endpoint + url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/tabpy/tabpy_tools/rest.py", line 110, in POST
    self.raise_error(response)
  File "/app/tabpy/tabpy_tools/rest.py", line 62, in raise_error
    raise ResponseError(response)
tabpy.tabpy_tools.rest.ResponseError: (500) error adding endpoint FileNotFoundError : [Errno 2] No such file or directory: '/app/tabpy/tabpy_server/staging/endpoints/anova/1'

Steps to reproduce: I clicked the TabPy deploy button, cloned a TabPy repository into the (empty) https://git.heroku.com/xxx.git, and ran the above code.

Expected behavior: The model would be deployed, show as deployed at https://xxx.herokuapp.com/, and function properly in Tableau Cloud.

Additional context: I also ran this check:

>>> import os
>>> os.path.isfile('/app/tabpy/tabpy_server/staging/endpoints/anova/1/pickle_archive') # returns True
>>> quit()
heroku run ls tabpy/tabpy_server/staging/endpoints/anova/1 -a xxx # returns: No such file or directory
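
The mismatch above (`isfile` returns True inside the Python session, while a separate `heroku run ls` finds nothing) would be consistent with each `heroku run` command starting its own one-off dyno with a separate, ephemeral filesystem. That is an assumption about the cause, not something confirmed in this thread. A minimal sketch for checking, from inside a single process, which part of a path actually exists; the helper name `first_missing_component` is hypothetical and not part of TabPy:

```python
from pathlib import Path


def first_missing_component(path):
    """Return the shortest ancestor of `path` (or `path` itself) that does
    not exist in the filesystem visible to this process, or None if the
    whole path exists."""
    target = Path(path)
    # `parents` is ordered deepest first, so reverse it to walk from the
    # root down, then check the target itself last.
    for candidate in [*reversed(target.parents), target]:
        if not candidate.exists():
            return str(candidate)
    return None
```

Running this in the same session as the deploy call, and again in a fresh `heroku run` session, would show whether the two sessions see the same files.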

Then I tried this, following the code in tabpy/models/deploy_models.py:

>>> import os
>>> from pathlib import Path
>>> import platform
>>> import subprocess
>>> import sys
>>> from tabpy.models.utils import setup_utils
>>> py = "python3"
>>> file_path = setup_utils.get_default_config_file_path() # Using config file at /app/tabpy/tabpy_server/common/default.conf
>>> port, auth_on, prefix = setup_utils.parse_config(file_path) # 9004, True, 'http'
>>> auth_args = setup_utils.get_creds()
Username:
Password:
>>> directory = 'tabpy/models/scripts'
>>> for filename in os.listdir(directory):
...     subprocess.run([py, f"{directory}/{filename}", file_path] + auth_args)

This returned the following error for each model:

Traceback (most recent call last):
  File "/app/.heroku/python/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.heroku/python/lib/python3.11/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/app/.heroku/python/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
...
During handling of the above exception, another exception occurred:
...
File "/app/.heroku/python/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=9004): Max retries exceeded with url: /endpoints (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3033cc4250>: Failed to establish a new connection: [Errno 111] Connection refused'))
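
The `Connection refused` error above means nothing was accepting connections on `localhost:9004` from where the scripts ran. A small sketch for probing a port before running the deploy scripts; the helper name `port_is_open` is hypothetical, and it only shows whether a TCP listener exists, not whether that listener is TabPy:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("localhost", 9004)` checks the port that `parse_config` reported in the session above.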

On my local machine (not Heroku), tabpy-deploy-model successfully creates the directory /opt/homebrew/lib/python3.10/site-packages/tabpy/tabpy_server/staging/endpoints... So I tried deploying to the Heroku instance without entering the Heroku shell:

# in the GitHub clone directory on the local machine
>>> from tabpy.tabpy_tools.client import Client
>>> client = Client('https://xxx.herokuapp.com/')
>>> client.set_credentials('xxx','xxx')
>>> def anova(_arg1, _arg2, *_argN):
...
>>> client.deploy('anova', anova, 'computes anova  p-value between several groups')

But this returns an error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/.../tabpy/tabpy_tools/client.py", line 241, in deploy
    self._upload_endpoint(obj)
  File "/Users/.../tabpy/tabpy_tools/client.py", line 334, in _upload_endpoint
    endpoint_obj.save(obj["src_path"])
  File "/Users/.../tabpy/tabpy_tools/query_object.py", line 55, in save
    self._save_local(path)
  File "/Users/.../tabpy/tabpy_tools/query_object.py", line 61, in _save_local
    os.makedirs(path)
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  [Previous line repeated 3 more times]
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/app'

Lastly, I noticed in deploy-to-heroku.md that the default port should be set to 443, but port, auth_on, prefix = setup_utils.parse_config(file_path) shows that the port is actually 9004. Furthermore, when the app is launched on Heroku, it fills out the Procfile as: export TABPY_PORT=#random number# && export TABPY_PWD_FILE=./file.txt && tabpy-user add -u xxx -p xxx -f ./file.txt && tabpy

So, is this a port-mismatch error?
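
To make the port question concrete: the Procfile exports `TABPY_PORT`, while `parse_config` reads default.conf and reports 9004. A sketch of how the two values could be compared, assuming (not confirmed here) that the running server prefers the environment variable over the config file; `effective_port` is a hypothetical helper, not a TabPy function:

```python
import os


def effective_port(config_port, environ=None):
    """Return the port from TABPY_PORT if it is set, else the config value.

    `environ` defaults to os.environ; pass a dict to test other values.
    """
    env = os.environ if environ is None else environ
    raw = env.get("TABPY_PORT")
    return int(raw) if raw else int(config_port)
```

If the result differs from the port a client targets, requests to that port would be refused, which is one possible reading of the `localhost:9004` failure.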

Tor-Saxberg commented 1 year ago

I found this comment on Stack Overflow: "You can only deploy functions directly from Tabpy server. remote deployment is not possible." But I also found that Feature Request: Ability to remotely publish code #64 was closed as "completed."

I'm not sure how I would use TabPy + Heroku if there were no way to use even the predeployed functions, so is there another way to deploy models?

Tor-Saxberg commented 1 year ago

I tried to just copy the local tabpy/tabpy_server/staging/endpoints directory into the repository like this: cp -r /opt/homebrew/lib/python3.10/site-packages/tabpy/tabpy_server/staging/endpoints tabpy/tabpy_server/staging/endpoints

After pushing, deploying again gives the following error:

>>> from tabpy.tabpy_tools.client import Client
>>> client = Client('https://xxx.herokuapp.com/')
>>> client.set_credentials('xxx','xxx')
>>> def anova(_arg1, _arg2, *_argN):
...
>>> client.deploy('anova', anova, 'computes anova  p-value between several groups', override=True)
Overwriting existing file "/app/tabpy/tabpy_server/staging/endpoints/anova/1" when saving query object
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/tabpy/tabpy_tools/client.py", line 248, in deploy
    self._wait_for_endpoint_deployment(obj["name"], obj["version"])
  File "/app/tabpy/tabpy_tools/client.py", line 361, in _wait_for_endpoint_deployment
    raise RuntimeError(f'LoadFailed: {ep["last_error"]}')
RuntimeError: LoadFailed: Load failed: code() argument 13 must be str, not int
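
The `code() argument 13 must be str, not int` message is the kind of error that can appear when a pickled function is loaded by a different Python minor version than the one that created it; the tracebacks in this thread show Python 3.10 on the local machine and Python 3.11 on Heroku. Whether that is the cause here is an assumption. A minimal sketch for stamping an artifact directory with the interpreter version and checking it before loading; the file name `python_version.txt` and the helper names are hypothetical, not part of TabPy:

```python
import sys
from pathlib import Path

STAMP_NAME = "python_version.txt"  # hypothetical file name


def current_minor():
    """Return the running interpreter version as 'major.minor'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


def write_version_stamp(directory):
    """Record the interpreter version next to the pickled artifacts."""
    Path(directory, STAMP_NAME).write_text(current_minor())


def stamp_matches(directory):
    """Return True only if a stamp exists and matches this interpreter."""
    stamp = Path(directory, STAMP_NAME)
    return stamp.is_file() and stamp.read_text().strip() == current_minor()
```

A stamp written where the pickle is created and checked where it is loaded would flag a 3.10/3.11 mismatch before the load is attempted.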

Also, I noticed that endpoints/anova/1/ was not the latest version among the endpoints/anova/ directories, which suggests that the get_endpoints method of the Client class in tabpy/tabpy_tools/client.py may not be reading the version correctly.
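
For checking which version directory is newest, a sketch that lists the numeric subdirectories of an endpoint folder. It assumes versions are stored as integer-named directories such as anova/1, as the paths above suggest, and `latest_version` is a hypothetical helper rather than a TabPy function:

```python
from pathlib import Path


def latest_version(endpoint_dir):
    """Return the largest integer-named subdirectory of endpoint_dir,
    or None if there is none."""
    versions = [
        int(child.name)
        for child in Path(endpoint_dir).iterdir()
        # Skip plain files and non-numeric names.
        if child.is_dir() and child.name.isdecimal()
    ]
    return max(versions) if versions else None
```

Comparing this result with the version the client reports for the same endpoint would show whether the two disagree.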