holoviz / pyviz_comms

Bidirectional communication for the HoloViz ecosystem
BSD 3-Clause "New" or "Revised" License
32 stars 15 forks source link

Added try/except for retrieving library version #48

Closed julioasotodv closed 4 years ago

julioasotodv commented 4 years ago

In some weird environments, param's way of retrieving version information has many bugs (since it opens a subprocess to call git, for instance, which gives problems when Python processes are created with systems such as Supervisor or Gunicorn). This commit adds the option for the library to keep on going, even if the version cannot be correctly retreived.

This is just the way it is also done in https://github.com/holoviz/param/blob/7d2d2a848dc79d9741709702f8b47f87497701b0/param/__init__.py#L46

jbednar commented 4 years ago

Should we not just be providing a version (no pun intended) of param.version.Version that already swallows such exceptions and returns the indicated version number? It doesn't seem like something that has to propagate to every project individually.

That said, I'd be interested to find out the cases under which it currently fails. At the least we ought to be protecting the call to git better.

julioasotodv commented 4 years ago

I believe that the version checking in param was just inherited from another project. But sure, let me come up with a minimal example with the error

jbednar commented 4 years ago

It wasn't inherited; it started in Param and then graduated to https://github.com/pyviz-dev/autover so that it would be available even to projects that don't use Param. So any change that's made to Param's Version should also be made to https://github.com/pyviz-dev/autover . But either way, the code was meant to be robust against failure to launch git, and clearly it isn't yet, which we should fix rather than work around...

julioasotodv commented 4 years ago

Here are my steps to reproduce the failure: I was basically trying to use panel with the Bokeh Server, but having the Server embedded in a ASGI app (with Starlette, but that's irrelevant here), managed by Gunicorn with Uvicorn workers.

In a nutshell, imagine I create a file called server.py with the following code:

import panel as pn

And now, I want to run my "server application" with Gunicorn + Uvicorn workers (I have tried a couple of recent versions of both, so you can just conda install gunicorn uvicorn):

gunicorn -k uvicorn.workers.UvicornWorker server

The traceback for the error is:

[2020-03-05 21:47:43 +0100] [3156] [INFO] Starting gunicorn 20.0.4
[2020-03-05 21:47:43 +0100] [3156] [INFO] Listening at: http://127.0.0.1:8000 (3156)
[2020-03-05 21:47:43 +0100] [3156] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2020-03-05 21:47:43 +0100] [3159] [INFO] Booting worker with pid: 3159

[2020-03-05 21:47:45 +0100] [3159] [ERROR] Exception in worker process
Traceback (most recent call last):
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
    worker.init_process()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/uvicorn/workers.py", line 57, in init_process
    super(UvicornWorker, self).init_process()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/workers/base.py", line 119, in init_process
    self.load_wsgi()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/workers/base.py", line 144, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 49, in load
    return self.load_wsgiapp()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 39, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/gunicorn/util.py", line 358, in import_app
    mod = importlib.import_module(module)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/julio/Documentos/Curso Visualización Bokeh y Plotly/Ejemplos/tornado_dashboard/server.py", line 1, in <module>
    import panel as pn
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/panel/__init__.py", line 22, in <module>
    fpath=__file__, archive_commit="$Format:%h$", reponame="panel"))
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 280, in __str__
    known_stale = self._known_stale()
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 223, in _known_stale
    commit = self.commit
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 133, in commit
    return self.fetch()._commit
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 162, in fetch
    self.git_fetch(cmd)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 212, in git_fetch
    self._update_from_vcs(output)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 260, in _update_from_vcs
    self._release = tuple(int(el) for el in dot_split)
  File "/home/julio/anaconda3/envs/bokeh_starlette_server/lib/python3.7/site-packages/param/version.py", line 260, in <genexpr>
    self._release = tuple(int(el) for el in dot_split)
ValueError: invalid literal for int() with base 10: ''
[2020-03-05 21:47:45 +0100] [3159] [INFO] Worker exiting (pid: 3159)
[2020-03-05 21:47:45 +0100] [3156] [INFO] Shutting down: Master
[2020-03-05 21:47:45 +0100] [3156] [INFO] Reason: Worker failed to boot.

The error happens in https://github.com/holoviz/param/blob/7d2d2a848dc79d9741709702f8b47f87497701b0/param/version.py#L247. It turns out that because of the way either Gunicorn or Uvicorn creates new python processes, for some reason the output argument in that method is an empty string '', which makes the function fail. I believe this happens since the function git_fetch in https://github.com/holoviz/param/blob/7d2d2a848dc79d9741709702f8b47f87497701b0/param/version.py#L169 has a value of its internal output variable of '', as the result of the run_cmd function (which uses subprocess.Popen()).

Indeed, I believe it would be a better idea to fix it at the module level.

julioasotodv commented 4 years ago

I am doing some experiments, and it looks like the problem has to do with Gunicorn/Uvicorn returning zero for a subprocess.Popen() call that should return nonzero exit code in https://github.com/holoviz/param/blob/master/param/version.py#L23, so chances are that param.version is not technicallly wrong...

I will try to update this issue as soon as I can.

ceball commented 4 years ago

But either way, the code was meant to be robust against failure to launch git, and clearly it isn't yet, which we should fix rather than work around...

I think packages* should not even be trying to launch git when imported. If I install a python package (e.g. via conda or pip) and I import it, and as a side effect, "git" is run...that's very surprising to me.

Not only is that surprising, it can cause various "fatal" problems (e.g. as reported in this issue, or variants of https://github.com/pyviz-dev/autover/issues/58 like security prompts or virus checker complaints, etc etc) or various "confusing" problems (e.g. incorrect version info being reported; https://github.com/pyviz-dev/autover/issues/54).

(* edit: "package" is an overloaded term - I mean things installed by conda, pip, etc. I'm not concerned that git is run when I have a git checkout of param and import that.)

The team has had some internal "fundamental disagreements" about autover over the years :) However, it is really great in allowing up-to-date version info to appear all over the place automatically, and enables the wonderful ability to git tag and have packages magically appear. I hugely miss all that when working outside of the holoviz ecosystem in many places/ecosystems that don't have that kind of thing. So, if we could figure out how to fix this issue for packages, it would be really great.

ceball commented 4 years ago

Although I can't take part in how autover actually works (sorry), if anyone else is able to work on it, I'll very gladly contribute testing of changes after they are made. Including making any necessary adjustments to autover's testing, CI, etc. Please just ping me.

jbednar commented 4 years ago

I think packages* should not even be trying to launch git when imported. If I install a python package (e.g. via conda or pip) and I import it, and as a side effect, "git" is run...that's very surprising to me.

That's a good point; why is it even trying to run git at all? Shouldn't there be a .version file that is found, short-circuiting all the version-calculation machinery entirely? Something seems broken, unless the OP is testing something built directly from git and not via a package.

julioasotodv commented 4 years ago

Honestly, I have no idea what is the purpose of param.version...

ceball commented 4 years ago

However, it is really great in allowing up-to-date version info to appear all over the place automatically

As an aside, if someone were going to consider working on autover, I suppose it might also be worth considering whether we (holoviz) also need to address versioning beyond just python, or can happily ignore it/leave it as it is.

E.g. this repo (pyviz_comms) has a package.json with a version in it, for jupyterlab_pyviz. The versions of pyviz_comms and jupyterlab_pyviz are not the same (they don't need to be, and in general are unlikely to be the same), but there's some link between the versions (a range of one for the other). jupyterlab_pyviz is part of a different ecosystem (different tools, requires building, etc), so it's not clear that autover would ever have anything to do with it. But I thought it might be worth mentioning this topic since I have seen various problems from missing/wrongly reported/mismatched versions of the two sides (and the same for various similar packages - it's a general problem).

Similarly, pyctdev (for example) has never tried to help out with non-python stuff, but it could. If there were a Makefile for this project, it seems likely it would have targets for building the ts, uploading to npm, etc etc. Other projects presumably have these same problems, so it might be worth looking around to see what the landscape is like now in terms of tools available.

ceball commented 4 years ago

(Sorry for diverting the original issue. I should probably move to discourse. Maybe there should even be a pyviz-dev section, to help avoid cluttering/distracting all the holoviz projects with various "developer" issues common to all/many of the projects.)

philippjfr commented 4 years ago

I'm going to wade into controversial territory once again. There has been at least one other very annoying bug in autover/param.Version since it's inception. The promises of autover are great and a cleaner implementation than the alternatives is nice but if crucial things don't end up getting fixed then I'd honestly rather switch to a solution which is used by other packages with millions of users which doesn't add maintenance burden on us...

julioasotodv commented 4 years ago

Ok, so back to the original issue/PR I created in this thread. I have been investigating the error during the weekend and the gist of it is basically:

It turns out that param.version is running (or at least trying to run) some git commands in order to retrieve the version. The way the git commands are launched is through subprocess.Popen() as you can see here: https://github.com/holoviz/param/blob/7d2d2a848dc79d9741709702f8b47f87497701b0/param/version.py#L23

And well, it turns out that subprocess.Popen() is nor particularly async-friendly when spawned from an os fork instruction (just like Gunicorn does). Apparently, one of the problems is that it misreports the return code for the launched command.

The only solution I can think of would be to slightly modify the run_cmd function, to make it read the return code as either 0 if everything runs correctly; or nonzero if either the actual returncode read by subprocess.Popen() is nonzero or if the stderr is something other than an empty byte array.

I can create a PR in param/autover to test this. It would solve my error, but I am unsure if it is a desirable solution at all for the rest of the world... Perhaps we can see if it Travis likes it?

julioasotodv commented 4 years ago

Yes let's close this PR. I will open one at param, with a possible fix for the cmd issue.