simonw opened this issue 1 year ago
Hi @simonw,
the `app` argument should be the import location of your application, not the actual object. The actual load of the app will be handled by Granian itself.

You should be able to serve your application just by writing a `main.py` file with these contents:

```python
from datasette.app import Datasette

ds = Datasette(memory=True)
app = ds.app()
```

and running Granian from the CLI with: `granian --interface asgi main:app`
Thanks - that recipe worked for me right now, and didn't return any errors.

However... I really want the ability to start Granian running from my own Python scripts, in the manner shown above.

The reason is that Datasette is configured by the command line. The usual way of starting it looks like this:

```
datasette serve mydb.db -p 8003 --setting sql_time_limit_ms 10000 --crossdb
```

There are a whole ton of options like those: https://docs.datasette.io/en/stable/cli-reference.html#datasette-serve

`datasette serve` starts a server running using Uvicorn - but I also have a plugin for using `gunicorn` instead, which works like this:

```
# Install the plugin
datasette install datasette-gunicorn

datasette gunicorn mydb.db -p 8003 --setting sql_time_limit_ms 10000 --crossdb
```
I ran into the "can't pickle local object" error while trying to build a new plugin, `datasette-granian`, which worked in a similar way to `datasette-gunicorn`.

The implementation of `datasette-gunicorn` is here: https://github.com/simonw/datasette-gunicorn/blob/0.1.1/datasette_gunicorn/__init__.py

So I guess this is a feature request: I would like a documented way to programmatically start a Granian server without having to use the `granian --interface asgi` CLI tool.
@simonw using the Granian interface directly is absolutely possible, but as I said, the `app` argument shouldn't be your application instance, but a string with the importable location of your application object. It depends on how you named your module/package, but generally speaking it should be something like `yourmodule:app`.

The benchmark apps in the Granian repo show an example: https://github.com/emmett-framework/granian/blob/e5139e218606a1b7bac817d2cfa1975a0d23a40c/benchmarks/app/asgi.py#L76-L82
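For illustration, a `module:attr` import-location string can be resolved with nothing but the standard library. This is a rough sketch of the idea, not Granian's actual loader code:

```python
import importlib

def resolve_target(target):
    # Split "module:attr" and import the module, then fetch the attribute.
    # This is roughly what servers do with an import-location string.
    module_name, attr = target.split(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)

# Using a stdlib module as a stand-in target:
dumps = resolve_target("json:dumps")
print(dumps({"a": 1}))  # {"a": 1}
```

Because the worker only receives a string and performs the import itself, nothing about the application object ever needs to cross a process boundary.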
The challenge with that is that my application object needs to be instantiated with additional arguments that have been provided by the user.

One thing that might work is that I could code-generate a Python script file that instantiates an `app` object with all of the user's command-line options, save that script to `/tmp` and then pass it to `Granian(...)` to start running - but that feels pretty messy! I'd much rather be able to serve an ASGI app directly, like I can with Uvicorn and Gunicorn.
@simonw considering how Granian is designed, it is not possible to directly pass an application instance to the server, as that would require every loaded object to be picklable due to `multiprocessing` module usage.

Probably we can solve this with a `--factory` option in Granian, so instead of pointing to the application, you can point to a function that does everything you need and returns the application instance. WDYT?
That could work. I'm not sure how I'd pass in the additional arguments though.

The thing I want to build is effectively a CLI script that looks something like this:

```
python datasette_granian.py -h 127.0.0.1 -p 8005 --setting page_size 5
```

(Plus a whole bunch more options.)

Running this script would instantiate my existing `Datasette()` class with various options, then use `ds.app()` to get an ASGI app and start serving that using Granian.
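A sketch of how such a script could turn its command line into a kwargs dict. The helper name and flags below are hypothetical (note the real Datasette CLI uses click and accepts `-h` for host, which plain `argparse` reserves for help):

```python
import argparse

def cli_params_to_kwargs(argv):
    # Hypothetical helper: parse Datasette-style flags into a plain dict
    # that can later be handed to Datasette(**kwargs) or serialised.
    parser = argparse.ArgumentParser()
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("-p", "--port", type=int, default=8001)
    parser.add_argument("--setting", nargs=2, action="append", default=[])
    args = parser.parse_args(argv)
    return {"host": args.host, "port": args.port,
            "settings": {k: v for k, v in args.setting}}

kwargs = cli_params_to_kwargs(["-p", "8005", "--setting", "page_size", "5"])
print(kwargs)
```

The resulting dict is plain data (strings, ints, nested dicts), which matters later: unlike an application object, it serialises cleanly across process boundaries.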
@simonw wait, I got an idea to get this working with the current implementation.

The `serve` method of the `Granian` class actually accepts a `target_loader` parameter: https://github.com/emmett-framework/granian/blob/9265a03416a536acb37ca6073292a0c88b23ab47/granian/server.py#L303

You can use this to override the default behaviour of loading the application, and thus you can write something like this in a `run.py` file:

```python
from datasette.app import Datasette
from granian import Granian

ds_kwargs = {}

def load_app(target):
    if target != "dyn":
        raise RuntimeError("Should never get there")
    ds = Datasette(**ds_kwargs)
    return ds.app()

def main():
    global ds_kwargs
    ds_kwargs.update(some_util_to_convert_cli_param_to_dict())
    srv = Granian("dyn", address="127.0.0.1", port=8002, interface="asgi")
    srv.serve(target_loader=load_app)

if __name__ == "__main__":
    main()
```

then running `python run.py --whatever param --you-need-for ds` should work :)
That didn't quite work - the subprocess couldn't see the updated `ds_kwargs` dictionary - but thanks to this I did figure out a pattern that works: https://github.com/simonw/datasette-granian/blob/0.1a0/datasette_granian/__init__.py

I passed the arguments as serialized JSON:

```python
srv = Granian(
    # Pass kwargs as serialized JSON to the subprocess
    json.dumps(kwargs),
    address=host,
```

Then in `load_app`:

```python
def load_app(target):
    from datasette import cli

    ds_kwargs = json.loads(target)
    ds = cli.serve.callback(**ds_kwargs)
    return ds.app()
```

You can try the above out like this:

```
pip install datasette datasette-granian
wget https://datasette.io/content.db
datasette granian content.db -p 8000
```

This will start a server on port 8000 serving the Datasette interface.
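The trick that makes this work is that the target is just an opaque string to the server, so it survives the trip to the spawned worker. A minimal stdlib sketch of the round-trip, with the real `cli.serve.callback(**ds_kwargs)` call replaced by a stand-in that returns the decoded options:

```python
import json

def load_app(target):
    # In the real plugin this would call cli.serve.callback(**ds_kwargs);
    # here we just return the decoded options to show the round-trip.
    ds_kwargs = json.loads(target)
    return ds_kwargs

kwargs = {"host": "127.0.0.1", "port": 8005, "settings": {"page_size": 5}}
target = json.dumps(kwargs)  # what gets passed as Granian's first argument
print(load_app(target))
```

Since strings pickle trivially, this sidesteps the whole "can't pickle local object" problem: only JSON text crosses the process boundary, and each worker rebuilds the application from it.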
@simonw probably wrapping works and is a cleaner solution. Eg:

```python
from datasette.app import Datasette
from granian import Granian

def app_loader(kwargs):
    def load_app(target):
        if target != "dyn":
            raise RuntimeError("Should never get there")
        ds = Datasette(**kwargs)
        return ds.app()
    return load_app

def main():
    ds_kwargs = some_util_to_convert_cli_param_to_dict()
    srv = Granian("dyn", address="127.0.0.1", port=8002, interface="asgi")
    srv.serve(target_loader=app_loader(ds_kwargs))

if __name__ == "__main__":
    main()
```
Tried that just now but it didn't work - I got this error:

```
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'app_loader.<locals>.load_app'
```
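This failure can be reproduced with the standard library alone: `pickle` serialises functions by their qualified name, so a function defined inside another function cannot be serialised, while a module-level one can. A minimal demonstration:

```python
import pickle

def make_loader():
    # Nested function: its qualified name is
    # "make_loader.<locals>.load_app", which pickle cannot look up
    # at import time, so spawn-based multiprocessing rejects it.
    def load_app(target):
        return target
    return load_app

def module_level_loader(target):
    return target

try:
    pickle.dumps(make_loader())
    closure_picklable = True
except Exception:
    closure_picklable = False

# Module-level functions round-trip fine (pickled by reference to their name).
module_ok = pickle.loads(pickle.dumps(module_level_loader)) is module_level_loader
print(closure_picklable, module_ok)
```

This is exactly why the loader returned by `app_loader(...)` above trips the spawn machinery: the worker process would need to re-import `load_app` by name, and a `<locals>` name isn't importable.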
How about if Granian had some kind of mechanism where you could specify a pickle-able object which should be passed to each of the workers, specifically designed for this kind of use-case?
> How about if Granian had some kind of mechanism where you could specify a pickle-able object which should be passed to each of the workers, specifically designed for this kind of use-case?

Gonna think about it. Probably the theme here is making the `target_loader` argument more customisable.
@gi0baro Is there a solution to the problem? I want to run the application without global variables.

```python
def app_loader(settings: AppSettings):
    def load_app(_) -> FastAPI:
        return register_app(settings=settings)
    return load_app

def run_application() -> None:
    settings = get_app_settings(
        app_env=AppEnvTypes.prod if os.getenv("IS_PRODUCTION") else AppEnvTypes.dev
    )
    server = Granian(
        "app",
        address=settings.server_host,
        port=settings.server_port,
        interface=Interfaces.ASGI,
    )
    server.serve(target_loader=app_loader(settings=settings))

if __name__ == "__main__":
    run_application()
```

This does not work.
@novichikhin what's the error? Do you have a stack trace?
> @novichikhin what's the error? Do you have a stack trace?

```
[INFO] Starting granian
[INFO] Listening at: 0.0.0.0:8002
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\dev\python\backend\users-service\users_service\api\__main__.py", line 37, in <module>
    run_application()
  File "C:\dev\python\backend\users-service\users_service\api\__main__.py", line 33, in run_application
    server.serve(target_loader=app_loader(settings=settings))
  File "C:\Users\novichikhin\AppData\Local\pypoetry\Cache\virtualenvs\users-service-auTBzo84-py3.11\Lib\site-packages\granian\server.py", line 390, in serve
    serve_method(spawn_target, target_loader)
  File "C:\Users\novichikhin\AppData\Local\pypoetry\Cache\virtualenvs\users-service-auTBzo84-py3.11\Lib\site-packages\granian\server.py", line 342, in _serve
    self.startup(spawn_target, target_loader)
  File "C:\Users\novichikhin\AppData\Local\pypoetry\Cache\virtualenvs\users-service-auTBzo84-py3.11\Lib\site-packages\granian\server.py", line 334, in startup
    self._spawn_workers(sock, spawn_target, target_loader)
  File "C:\Users\novichikhin\AppData\Local\pypoetry\Cache\virtualenvs\users-service-auTBzo84-py3.11\Lib\site-packages\granian\server.py", line 313, in _spawn_workers
    proc.start()
  File "C:\Users\novichikhin\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "C:\Users\novichikhin\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\novichikhin\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\popen_spawn_win32.py", line 94, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\novichikhin\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'app_loader.<locals>.load_app'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\novichikhin\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\novichikhin\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 132, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
EOFError: Ran out of input
```
@novichikhin can you try with `wrap_loader=False` in the `serve` method and defining/importing everything inside a single function? Multiprocessing requires objects to be picklable. Something like:

```python
def app_loader():
    from whatever import get_app_settings, register_app

    settings = ...
    return register_app(...)
```
@gi0baro

```python
def load_app() -> FastAPI:
    from users_service.api.setup import register_app
    from users_service.settings.main import get_app_settings

    settings = get_app_settings(
        app_env=AppEnvTypes.prod if os.getenv("IS_PRODUCTION") else AppEnvTypes.dev
    )
    return register_app(settings=settings)

def run_application() -> None:
    server = Granian(
        "app",
        address="127.0.0.1",
        port=8002,
        interface=Interfaces.ASGI,
    )
    server.serve(target_loader=load_app, wrap_loader=False)

if __name__ == "__main__":
    run_application()
```

```
[INFO] Starting granian
[INFO] Listening at: 127.0.0.1:8002
[INFO] Spawning worker-1 with pid: 12772
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\novichikhin\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\novichikhin\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 132, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: Can't get attribute 'load_app' on <module '__main__' (built-in)>
```
@novichikhin I don't really know how to solve the pickling issues here. Maybe adding support for factories would solve your need?
Subscribing, this issue is unfortunately a blocker for me. I currently use `hypercorn`, and a thing I rely on in testing is that I can launch several instances of it as `trio` tasks (to simulate several nodes in a network), and they even support reporting back when startup is completed via a `trio` event (`hypercorn` has a `trio` compatibility submodule). I don't mind if it's not as efficient as launching from the CLI, but it's incredibly convenient for tests.
@novichikhin pickle cannot serialise closures - functions are pickled by reference to their qualified name, and a function defined inside another function cannot be looked up that way. If you want to use settings as an argument you can either:

- use `functools.partial` on a module-level loader with the signature `(target: str, *, settings: AppSettings) -> App`, or
- define a class with a `__call__` method to return your app from the factory.

Both variants are applicable only at global module scope, i.e. no closures :)
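The `__call__` variant can be checked with a stdlib-only sketch: a module-level class is found by its qualified name, and the instance's attributes carry the settings, so the whole thing pickles cleanly (the app-building body here is a stand-in):

```python
import pickle

class AppFactory:
    # Module-level class: pickle stores the qualified name plus the
    # instance attributes, so spawn-based workers can reconstruct it.
    def __init__(self, settings):
        self.settings = settings

    def __call__(self, target):
        # The real version would build and return the ASGI app here.
        return {"target": target, "settings": self.settings}

restored = pickle.loads(pickle.dumps(AppFactory({"port": 8002})))
print(restored("dyn"))
```

Unlike a closure, the per-instance state lives in `self.settings` rather than in a captured cell, which is exactly what makes it serialisable.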
Also, if you are using a `__main__.py` file to start your server, you should consider moving the loader somewhere else, because `__main__` is not a valid target for pickle unless you're willing to do some sketchy path hacking:

```python
# runner.py
def app_loader(_: str, *, settings: AppSettings):
    return register_app(settings=settings)
```

```python
# __main__.py
import functools

from .runner import app_loader

settings = ...
loader = functools.partial(app_loader, settings=settings)
Granian(...).serve(target_loader=loader)
```
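Unlike a closure, a `functools.partial` over a module-level function pickles cleanly, because `partial` stores a reference to the function plus the bound arguments. A quick stdlib check of the pattern (the loader body is a stand-in for the real `register_app` call):

```python
import functools
import pickle

def app_loader(target, *, settings):
    # Module-level, so pickle can reference it by name; the real
    # version would call register_app(settings=settings) instead.
    return {"target": target, "settings": settings}

loader = functools.partial(app_loader, settings={"port": 8002})
restored = pickle.loads(pickle.dumps(loader))  # survives the round-trip
print(restored("dyn"))
```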
Maybe it's a little late, but to avoid the pickle problem you can use the "fork" process creation strategy instead of "spawn". Put this somewhere close to the beginning of your code:

```python
multiprocessing.set_start_method('fork')
```

Also, for some PoC I used this code with an app object (not a str):

```python
from granian.asgi import _callback_wrapper
from granian._futures import future_watcher_wrapper
from granian._granian import ASGIWorker
from granian._loops import WorkerSignal

import contextvars
import asyncio

from fastapi import FastAPI

sock = bind_socket('0.0.0.0', 9400)
loop = asyncio.new_event_loop()
fastapi_app = FastAPI()

asgi_worker = ASGIWorker(
    worker_id=worker_id,
    socket_fd=sock.fileno(),
    threads=1,
    blocking_threads=1,
    backpressure=1024,
    http_mode='1',
    http1_opts=None,
    http2_opts=None,
    websockets_enabled=False,
    opt_enabled=False,
    ssl_enabled=False,
    ssl_cert=None,
    ssl_key=None,
)

wcallback = _callback_wrapper(fastapi_app, {}, {}, None)
wcallback = future_watcher_wrapper(wcallback)

sock.listen()
asgi_worker.serve_wth(wcallback, loop, contextvars.copy_context(), WorkerSignal())  # or serve_rth
```

All things are stolen from: https://github.com/emmett-framework/granian/blob/80b80cbe9585789e2cf0fea977285843a14b6e5d/granian/server.py#L212
> Maybe it's a little late. To avoid the pickle problem, you can use the "fork" process creation strategy instead of the "spawn". Put it somewhere closer to the beginning of code: `multiprocessing.set_start_method('fork')`

While this is true for Linux, it won't work on other platforms, which is something to be aware of.
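The strategies a platform actually offers can be checked at runtime; on Windows only "spawn" exists, which is why the picklability constraint can't be avoided there:

```python
import multiprocessing

# "fork" appears only on POSIX systems; "spawn" is available everywhere
# and is the default on Windows (and on macOS since Python 3.8).
methods = multiprocessing.get_all_start_methods()
print(methods)
```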
> Also, for some PoC I used this code with app object (not str)

It's not clear to me what the advantage of the proposed solution is, given that you're limited to 1 worker and you're sacrificing the entirety of Granian's process management (SIGHUP handling, workers reload, worker respawn in case of crashes, just to name a few).

Also, as I wrote in #330, I get the whole reduce-memory-usage theme, but at the same time I cannot guarantee - even considering Granian uses its own internal mutability strategy that doesn't rely on Python and the GIL - that things will work as expected when resources get shared between workers: there's a reason behind the design of expecting the target to be an importable resource and not a Python object, and it's not just because we were lazy :) The whole shared-memory thing will, in my opinion, be revisited once Python 3.13 lands, given that we can then actually avoid the GIL everywhere (which also means the entirety of the application itself and all its dependencies must be no-gil compatible), but for now it just doesn't feel safe enough to me.

So let's just say, in general, I won't support issues caused by this kind of usage :)
In my specific case, we have a complex legacy project with its own supervisor process. In addition to managing the workers, the master process also receives some data and distributes it to the workers through pipes, so I need to create the worker processes myself and pass the pipes as arguments.
@gi0baro Is this the correct ticket to monitor for eventually adding a CLI option allowing Granian to accept an app factory function name? I'm currently using your suggested workaround (and it works great), but I want to keep an eye out for a more official method. Thanks!
> @gi0baro Is this the correct ticket to monitor for eventually adding a CLI option allowing Granian to accept an app factory function name?
Not really, please open up a separate issue for that.
This is a very exciting project - thanks for releasing this!

I tried to get it working with my https://datasette.io/ ASGI app and ran into this error:

Here's the script I wrote to replicate the problem, saved as `serve_datasette_with_granian.py`:

Run it like this to see the error (run `pip install datasette` first):

Are there changes I can make to Datasette to get this to work, or is this something that illustrates a bug in Granian?

Relevant Datasette code is here: https://github.com/simonw/datasette/blob/6a352e99ab988dbf8fd22a100049caa6ad33f1ec/datasette/app.py#L1429-L1454

It's applying my `asgi-csrf` ASGI middleware from https://github.com/simonw/asgi-csrf