omni-us / jsonargparse

Implement minimal boilerplate CLIs derived from type hints and parse from command line, config files and environment variables
https://jsonargparse.readthedocs.io
MIT License

Add support for async objects in `CLI` #517

Closed rusmux closed 3 weeks ago

rusmux commented 1 month ago

🚀 Feature request

Add support for objects that require asynchronous context for their creation.

Motivation

Currently I cannot instantiate objects like aiokafka.AIOKafkaProducer. For example:

In main.py:

import aiokafka
from jsonargparse import CLI

async def main(producer: aiokafka.AIOKafkaProducer) -> None:
    print(producer)

if __name__ == "__main__":
    CLI(main)

In config.yaml:

producer:
  class_path: aiokafka.AIOKafkaProducer
  init_args:
    bootstrap_servers: ["localhost:9092"]

In terminal:

python -m main --config config.yaml

Gives error:

RuntimeError: The object should be created within an async function or provide loop directly.

Pitch

The above code should run without errors.

Alternatives

mauvilsa commented 1 month ago

I am not sure that this can be fixed. Classes are instantiated before the function is called. Having to be created inside an async function is very specific to aiokafka.AIOKafkaProducer, and there is no way to detect that requirement by introspection.

Not tested, but how about:

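import asyncio

async def cli():
    CLI(main)

if __name__ == "__main__":
    # `await` is not valid at module level in a script, so run the coroutine with asyncio.run
    asyncio.run(cli())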

Or maybe instead:

async def main(producer: Callable[[], aiokafka.AIOKafkaProducer]) -> None:
    producer_instance = producer()
    ...

Side note. I merged a pull request fixing --print_config since it wasn't working for this case.

rusmux commented 1 month ago

If I use the first option, then I get RuntimeWarning: coroutine 'main' was never awaited. Maybe CLI would determine if an asynchronous function has been passed to it?

Also, I found that I was running the code using jsonargparse 4.27.0. On 4.29.0, I get an error if I try to run my example code:

usage: main.py [-h] [--config CONFIG] [--print_config[=flags]] [--producer.help CLASS_PATH_OR_NAME] producer
error: Parser key "producer":
  asdict() should be called on dataclass instances

mauvilsa commented 1 month ago

> Maybe CLI would determine if an asynchronous function has been passed to it?

I think it is possible with inspect.iscoroutinefunction.
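For illustration, here is a minimal sketch of how such a check could look. It is not jsonargparse's actual implementation: `run_component`, `sync_main` and `async_main` are hypothetical names, and the only assumption is that an async function can be driven with `asyncio.run` once it has been detected.

```python
import asyncio
import inspect


def run_component(func, **kwargs):
    """Call func with kwargs; if func is a coroutine function, run it to completion."""
    if inspect.iscoroutinefunction(func):
        # Calling an async function only creates a coroutine object;
        # asyncio.run drives it to completion on a fresh event loop.
        return asyncio.run(func(**kwargs))
    return func(**kwargs)


def sync_main(x: int) -> int:
    # Hypothetical synchronous entry point.
    return x + 1


async def async_main(x: int) -> int:
    # Hypothetical asynchronous entry point.
    await asyncio.sleep(0)
    return x * 2


if __name__ == "__main__":
    print(run_component(sync_main, x=1))
    print(run_component(async_main, x=2))
```

Without the `iscoroutinefunction` branch, calling `async_main` would just return an unawaited coroutine, which is what produces the "coroutine 'main' was never awaited" warning reported above.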

mauvilsa commented 1 month ago

I implemented support for async functions in #531. But note what I mentioned before: requiring that aiokafka.AIOKafkaProducer be instantiated inside an async function is rather particular, and it is not supported the way you had it. The flow is that classes are instantiated first, and the instances are then passed as parameters to whatever function is being called. It would become extremely complex to figure out whether the function is async and then fake an async function within which to instantiate the classes. And I doubt that classes which must be instantiated inside an async function are common enough to justify this complexity. Furthermore, there is a workaround. A working version, using #531, of what is proposed in the description is:

import asyncio
from typing import Callable
from jsonargparse import CLI
from aiokafka import AIOKafkaProducer

async def main(producer: Callable[[], AIOKafkaProducer]):
    producer_instance = producer()
    print(producer_instance)
    await asyncio.sleep(0)

if __name__ == "__main__":
    CLI(main)
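To see why deferring instantiation through a `Callable` helps, here is a small self-contained sketch that does not need aiokafka or jsonargparse. `NeedsLoop` is a hypothetical stand-in for a class that must be created while an event loop is running; the assumption is that such a class fails in the same way as the `RuntimeError` in the description when it is built too early.

```python
import asyncio
from typing import Callable


class NeedsLoop:
    """Hypothetical stand-in for a class that must be created inside a running event loop."""

    def __init__(self) -> None:
        # Raises RuntimeError when no event loop is running in this thread.
        self.loop = asyncio.get_running_loop()


async def main(factory: Callable[[], NeedsLoop]) -> NeedsLoop:
    # The factory is only called here, while the event loop is running.
    return factory()


def eager_fails() -> bool:
    """Return True if instantiating outside an event loop raises RuntimeError."""
    try:
        NeedsLoop()
    except RuntimeError:
        return True
    return False


if __name__ == "__main__":
    print(eager_fails())
    instance = asyncio.run(main(NeedsLoop))
    print(type(instance).__name__)
```

Eager instantiation happens before any loop exists, so it fails. Passing a factory moves the constructor call inside the coroutine, which mirrors the `Callable[[], AIOKafkaProducer]` workaround.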