sublimehq / sublime_text

Issue tracker for Sublime Text
https://www.sublimetext.com
809 stars 39 forks source link

asyncio: Provide a global event loop for python 3.8 plugins #3129

Open rwols opened 4 years ago

rwols commented 4 years ago

Problem description

Two plugins wanting to use the asyncio module will have to interact with an event loop. The "main" loop can be obtained via asyncio.get_event_loop(). With build 4057 this loop is not yet running. This poses a "race condition" if two or more plugins want to use this global loop.

All plugins could first check for a running loop, and start it if it hasn't started yet. Some plugin authors might forget, or might just not care and start it anyway. Some plugin authors might create their own separate loop with my_loop = asyncio.new_event_loop() and starting it with threading.Thread(target=my_loop.run_forever). This last possibility is especially annoying, because the global loop from asyncio.get_event_loop() gives ST4 an opportunity to reduce its Python thread count. That is, using such a global loop is usually a superior alternative to firing up threads.

Preferred solution

SublimeText should start the global event loop from asyncio before loading plugins.

This guarantees all plugins a global running loop.

Additional Information (optional)

SublimeText build 4057.

rwols commented 4 years ago

This issue has raised questions in the Discord channel about the practically of an event loop, especially in light of the ST API. Let me preface that I do not ask to alter ST API functions. They can remain what they are.

The onus is on me to prove usefulness, so I'll try that with a tutorial of sorts.

Please create a file $packages/User/async_io.py and follow along :)

Starting and stopping an event loop

For now we want access to a loop. Let's create one ourselves in plugin_loaded and stop it in plugin_unloaded.

from threading import Thread
from typing import Optional
import asyncio

__loop: Optional[asyncio.AbstractEventLoop] = None
__thread: Optional[Thread] = None

def plugin_loaded() -> None:
    print("loop: starting")
    global __loop
    global __thread
    __loop = asyncio.new_event_loop()
    __thread = Thread(target=__loop.run_forever)
    __thread.start()
    print("loop: started")

def __shutdown() -> None:
    for task in asyncio.Task.all_tasks():
        task.cancel()
    asyncio.get_event_loop().stop()

def plugin_unloaded() -> None:
    print("loop: stopping")
    global __loop
    global __thread
    if __loop and __thread:
        __loop.call_soon_threadsafe(__shutdown)
        __thread.join()
        __loop.run_until_complete(__loop.shutdown_asyncgens())
        __loop.close()
    __loop = None
    __thread = None
    print("loop: stopped")

This will start and stop a new event loop whenever we save $packages/User/async_io.py. It shuts down in a clean manner by requesting all running tasks to cancel, then stopping the loop. The thread is joined on the UI thread of SublimeText (which should be fast, since the loop is stopping). Finally, we run the loop once more to allow coroutines to handle their CancelledError exception.

The loop is accessible through the global __loop variable.

Running blocking functions on the loop

Let's try to print to the console from a blocking function running on the loop. To do that we'll utilize sublime_plugin.EventListener and the call_soon_threadsafe method of AbstractEventLoop.

from sublime import View
from sublime_plugin import EventListener

def blocking_greet() -> None:
    print("hello from a blocking function")

class MyEventListener(EventListener):

    def on_selection_modified(self, view: View) -> None:
        if view.settings().get("is_widget", False):
            return
        global __loop
        assert __loop
        __loop.call_soon_threadsafe(blocking_greet)

Because the loop is running in a Python thread, we have to use call_soon_threadsafe, and not call_soon.

So far so good, but we could have accomplished the same with sublime.set_timeout_async(blocking_greet, 0). Let us continue on to coroutines...

Running asynchronous functions on the loop

Let's write a coroutine that prints to the console twice. The first time immediately, and the second time after one second.

async def greet_twice() -> None:
    print("hello one!")
    await asyncio.sleep(1)
    print("hello two!")

How do we run this coroutine? That is a two-step process. First, we have to get on the loop via call_soon_threadsafe. After that, we can schedule the coroutine to run with ensure_future. Let's abstract that away into a function:

from typing import Awaitable

def schedule(coro: Awaitable) -> None:
    global __loop
    if __loop:
        __loop.call_soon_threadsafe(asyncio.ensure_future, coro)

Now let's modify the MyEventListener class:

class MyEventListener(EventListener):

    def on_selection_modified(self, view: View) -> None:
        if view.settings().get("is_widget", False):
            return
        schedule(greet_twice())

Check out the console and verify that you're running a coroutine from within SublimeText!

Cancellation

Asynchronous functions may receive an asyncio.CancelledError exception when awaiting an asynchronous result. To handle such an exception, simply catch it:

async def greet_twice_cancellation() -> None:
    print("hello world!")
    try:
        await asyncio.sleep(1)
        print("hello yet again world!")
    except asyncio.CancelledError:
        print("oh no! I was cancelled! Goodbye world!")

Modify MyEventListener as follows:

    def on_selection_modified(self, view: View) -> None:
        if view.settings().get("is_widget", False):
            return
        schedule(greet_twice_cancellation())

Now modify the selection, quickly save the file and verify that the coroutine is cancelled.

The Threadpool: No Need For threading.Thread

Along with scheduling coroutines, the AbstractEventLoop can also run intensive blocking functions concurrently using its run_in_executor method. Let's try that out. First let us write a blocking function that divides a number by two:

def blocking_compute(n: int) -> int:
    print("compute: start", n)
    import time
    time.sleep(2)  # zzz... working very hard
    print("compute: done", n)
    return n // 2

Now we want to run a couple of invocations on the default threadpool and want to await the combined result.

async def compute_in_threadpool(view: View, point: int, *args: int) -> None:
    loop = asyncio.get_event_loop()
    coros = (loop.run_in_executor(None, blocking_compute, arg) for arg in args)
    results = await asyncio.gather(*coros, loop=loop)
    if view and view.is_valid():
        view.show_popup(", ".join(map(str, results)), 0, point)

Let's present it in a hover popup. Add an extra method to our MyEventListener:

    def on_hover(self, view: View, point: int, hover_zone: int) -> None:
        if view.settings().get("is_widget", False):
            return
        coro = compute_in_threadpool(
            view, point, 2, 4, 8, 16, 32, 64, 128, 256)
        schedule(coro)

Case Study: Activity Monitor

Let's write an asynchronous activity monitor in less than twenty lines of Python code, in a single function:

GLYPHS = ("-", "\\", "|", "/", "-", "\\", "|", "/")

async def activity_monitor(view: View, key: str) -> None:
    index = 0
    try:
        while True:
            view.set_status(key, f"[{GLYPHS[index]}]")
            index += 1
            index %= len(GLYPHS)
            await asyncio.sleep(0.1)
    except Exception:
        pass
    finally:
        try:
            view.erase_status(key)
        except Exception:
            pass

When the view is no longer valid, the activity monitor stops. Let's use asyncio's cancellation facilities to make the monitor stop manually:

from sublime import Edit

class ActivityMonitorCommand(TextCommand):

    keys: Dict[str, asyncio.Task] = {}

    async def run_async(self, key: str) -> None:
        task = self.keys.pop(key, None)
        if task is None:
            task = asyncio.create_task(activity_monitor(self.view, key))
            self.keys[key] = task
        else:
            task.cancel()

    def run(self, edit: Edit, key: str) -> None:
        schedule(self.run_async(key))

In the console, run

view.run_command("activity_monitor", {"key": "foo"})

Run it again to stop the activity monitor. You can run an arbitrary number of activity monitors.

Case Study: Communicating With an External Process

More advanced plugins may communicate with external programs to retrieve useful information. The usual way to go about doing that is using the subprocess module and communicating with the process either in set_timeout_async or with a threading.Thread. Let's see how we can accomplish that with asyncio.

First let's write an external python script that prints a random string on stdout with a prefix given on stdin:

$ cat hello.py
#!/usr/bin/env python3
import time
import random
import sys

if __name__ == "__main__":
    prefix = sys.stdin.read().strip()
    time.sleep(1)
    print(f"{prefix}:", random.choice(("foo", "bar", "baz")))

Now we can write a command to call the external process:

from sublime import message_dialog

class RunExternalProgramCommand(TextCommand):

    async def run_async(self, cmd: str, stdin_input: str) -> None:
        process = await asyncio.create_subprocess_shell(
            cmd,
            stdout=PIPE,
            stdin=PIPE)
        encoding = "UTF-8"
        errs = "replace"
        inputbytes = stdin_input.encode(encoding, errs)
        stdout_data, _ = await process.communicate(inputbytes)
        message_dialog(stdout_data.decode(encoding, errs))

    def run(self, edit: Edit, cmd: str, stdin_input: str) -> None:
        schedule(self.run_async(cmd, stdin_input))

Try running this locally to experience that the UI thread is not blocked at all.

Case Study: Asynchronous HTTP Requests

_This case study is hypothetical and has no code in our tutorial file async_io.py_. Suppose that the aiohttp library is available in SublimeText. We could write the following to fetch some HTML asynchronously:

import aiohttp
from sublime_plugin import TextCommand
from sublime import Edit

async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    async with session.get(url) as response:
        return await response.text()

class FetchHtmlCommand(TextCommand):

    async def run_async(self, url: str) -> None:
        async with aiohttp.ClientSession() as session:
            html = await fetch(session, url)
            if self.view and self.view.is_valid():
                window = self.view.window()
                if not window:
                    return
                window.new_file()
                view = window.active_view()
                if not view:
                    return
                view.run_command("insert", {"characters": html})

    def run(self, edit: Edit, url: str) -> None:
        schedule(self.run_async(cmd, stdin_input))

Case Study: Asynchronous Completions

Let's utilize asyncio to make our life easier with the new sublime.CompletionItems. First make a hypothetical event listener:

from sublime_plugin import ViewEventListener
from sublime import CompletionItem
from sublime import CompletionList

class MyViewEventListener(ViewEventListener):

    # ...

Now in order to handle asynchronous completions, let's make the promise object and schedule an async function to fulfill that promise:

    def on_query_completions(
        self,
        prefix: str,
        locations: List[int]
    ) -> Optional[CompletionList]:
        if len(locations) != 1:
            return None
        promise = CompletionList()
        coro = self.on_query_completions_async(prefix, locations[0], promise)
        schedule(coro)
        return promise

The implementation of on_query_completions_async can be anything. Here is a mockup implementation:

    async def on_query_completions_async(
        self,
        prefix: str,
        location: int,
        promise: CompletionList
    ) -> None:
        await asyncio.sleep(1)  # fetching completions from far away ...
        snip = CompletionItem.snippet_completion
        promise.set_completions([
            snip("foo", "${1:foo}", "Does the foo"),
            snip("bar", "${1:bar}", "Does the bar"),
            snip("baz", "${1:baz}", "Does the baz")
        ])

Notice how when you type the letter "b" the UI is not blocked and ST is awaiting the promise. After one second completions appear. Consider that one may combine this with the previous case studies using external process communication and HTTP requests.

Case Study: Implement set_timeout_async in terms of asyncio

We can implement sublime.set_timeout_async in terms of async functions and our now familiar and handy schedule function:

from typing import Callable

async def __wrapper(f: Callable[[], None], timeout_ms: int) -> None:
    if timeout_ms > 0:
        await asyncio.sleep(timeout_ms / 1000)
    f()

def set_timeout_async(f: Callable[[], None], timeout_ms: int = 0) -> None:
    schedule(__wrapper(f, timeout_ms))

The function f blocks the event loop. This is okay if f is a "short" function. Otherwise, consider running f in the threadpool with run_in_executor.

Case Study: Implement Default/exec.py with asyncio

I have implemented our good old exec.py in terms of asyncio, removing the need for the subprocess and threading modules. It is a drop-in replacement. The file can be found as a gist. Check it out here. It uses our experimental async_io.py file to get access to our familiar schedule function.

  1. The start and read_fileno methods of AsyncProcess have become async.
  2. We redirect stderr to stdout.
  3. We schedule the start method with schedule from our async_io.py file.
  4. We await self.read_fileno(self.proc.stdout).

Concrete Proposal

The complete file async_io.py can be retrieved here as github gist.

deathaxe commented 4 years ago

Awesome post! Should be part of the docs some day.

deathaxe commented 4 years ago

So if ST start the event loop, the commands could be expanded to automatically support async def run() methods then.

class TextCommand(Command):
    def __init__(self, view):
        self.view = view

    def run_(self, edit_token, args):
        args = self.filter_args(args)
        if inspect.iscoroutinefunction(self.run):
            schedule(self.executor_(edit_token, args))
        else:
            self.executor_(edit_token, args)

    def executor_(self, edit_token, args):
        try:
            if args:
                edit = self.view.begin_edit(edit_token, self.name(), args)
                try:
                    return self.run(edit, **args)
                finally:
                    self.view.end_edit(edit)
            else:
                edit = self.view.begin_edit(edit_token, self.name())
                try:
                    return self.run(edit)
                finally:
                    self.view.end_edit(edit)
        except (TypeError) as e:
            if 'required positional argument' in str(e):
                if sublime_api.view_can_accept_input(self.view.id(), self.name(), args):
                    sublime_api.window_run_command(
                        sublime_api.view_window(self.view.id()),
                        'show_overlay',
                        {
                            'overlay': 'command_palette',
                            'command': self.name(),
                            'args': args
                        }
                    )
                    return
            raise

    def run(self, edit):
        pass

class MyTextCommand(TextCommand):

    async def run(self, edit, args):
        pass
wbond commented 4 years ago

This is some great info about how you'd want to use asyncio.

The most useful part to me is the bit about starting and implementing the running of the event loop itself. It does seem there are some edge cases, such as ensuring tasks can handle the cancel event.

I know @Thom1729 on Discord was really keen on trying to get the event loop to replace our existing message loop in C++ in plugin_host, primarily to get threading out of the picture altogether. I don't think that is something we'd like to tackle at this point, but he also proposed replacing our current "async" thread with an asyncio event loop. This may be slightly more possible since the async thread isn't touched in too many places, although I have not spent too much time thinking about possible implications within the C++ of plugin_host.

One reasonable takeaway from that discussion is that while the proposed asyncio event loop is pretty tidy, it will still be running all tasks in a background thread, so there will still be thread safety issues between blocking calls and async calls. That conceptually shouldn't really be too much of a surprise for plugin authors who have used the *_async() event handlers before, but it may throw some people off when using asyncio.

kaste commented 4 years ago

Interesting, shouldn't you set asyncio.set_event_loop after starting the event loop so that the end-user can just get this loop via asyncio.get_event_loop(). Your schedule function would be just sugar then. Minor, they renamed ensure_future -> create_task.

deathaxe commented 4 years ago

As the whole asyncio concept differs from what we are used to, I think an asyncio event loop should be something independend. Mixing it up with the existing threads (UI or async) will most likely break many existing plugins and force all of them to make use of the asyncio technology to work properly.

While it is a good technology to tackle slow I/O stuff like communication with network resources, processes etc. it seems too much for simple plugin tasks.

deathaxe commented 4 years ago

The following patch sets up a global asyncio event loop in a dedicated thread within the on_api_ready() function and registers a cleanup upon python exit. Not sure whether the latter one is executed, but this is how the Python38\Lib\concurrent\futures\thread.py handles the exit to cleanup thread pools.

Furthermore this patch introduces new command classes which are designed to be scheduled in the asyncio event loop.

The patch was made with ST 4060.

diff --git a/python38/sublime_plugin.py b/python38/sublime_plugin.py
index fa47f6f..69d726f 100644
--- a/python38/sublime_plugin.py
+++ b/python38/sublime_plugin.py
@@ -1,3 +1,5 @@
+import asyncio
+import atexit
 import importlib
 import io
 import marshal
@@ -13,6 +15,8 @@ import sublime_api

 api_ready = False
+asyncio_loop = None
+asyncio_thread = None

 deferred_plugin_loadeds = []

@@ -110,6 +114,34 @@ view_event_listener_excluded_callbacks = {
 profile = {}

+def setup_eventloop():
+    global asyncio_thread
+    global asyncio_loop
+    asyncio_loop = asyncio.new_event_loop()
+    asyncio.set_event_loop(asyncio_loop)
+    asyncio_thread = threading.Thread(target=asyncio_loop.run_forever)
+    asyncio_thread.start()
+    atexit.register(python_exit)
+
+
+def shutdown_eventloop():
+    for task in asyncio.Task.all_tasks():
+        task.cancel()
+    asyncio_loop.stop()
+
+
+def python_exit():
+    global asyncio_loop
+    global asyncio_thread
+    if asyncio_loop and asyncio_thread:
+        asyncio_loop.call_soon_threadsafe(shutdown_eventloop)
+        asyncio_thread.join()
+        asyncio_loop.run_until_complete(asyncio_loop.shutdown_asyncgens())
+        asyncio_loop.close()
+    asyncio_loop = None
+    asyncio_thread = None
+
+
 def add_profiling(event_handler):
     """
     Decorator to measure blocking event handler methods. Also prevents
@@ -494,6 +526,8 @@ def on_api_ready():
     global api_ready
     api_ready = True

+    setup_eventloop()
+
     for plc in deferred_plugin_loadeds:
         try:
             plc()
@@ -1270,6 +1304,20 @@ class Command():

 class ApplicationCommand(Command):
+    def show_input_(self, e, args):
+        if 'required positional argument' in str(e):
+            if sublime_api.can_accept_input(self.name(), args):
+                sublime.active_window().run_command(
+                    'show_overlay',
+                    {
+                        'overlay': 'command_palette',
+                        'command': self.name(),
+                        'args': args
+                    }
+                )
+                return True
+        return False
+
     def run_(self, edit_token, args):
         args = self.filter_args(args)
         try:
@@ -1278,27 +1326,53 @@ class ApplicationCommand(Command):
             else:
                 return self.run()
         except (TypeError) as e:
-            if 'required positional argument' in str(e):
-                if sublime_api.can_accept_input(self.name(), args):
-                    sublime.active_window().run_command(
-                        'show_overlay',
-                        {
-                            'overlay': 'command_palette',
-                            'command': self.name(),
-                            'args': args
-                        }
-                    )
-                    return
+            if self.show_input_(e, args):
+                return
             raise

     def run(self):
         pass

+class AsyncApplicationCommand(ApplicationCommand):
+    def run_(self, edit_token, args):
+        args = self.filter_args(args)
+
+        def on_done(future):
+            e = future.exception()
+            if e and not self.show_input_(e, args):
+                traceback.print_exception(type(e), e, e.__traceback__)
+
+        coroutine = self.run(**args) if args else self.run()
+        future = asyncio.run_coroutine_threadsafe(coroutine, asyncio_loop)
+        future.add_done_callback(on_done)
+        return future
+
+    async def run(self, edit):
+        pass
+
+
 class WindowCommand(Command):
+    __slots__ = ('window', )
+
     def __init__(self, window):
         self.window = window

+    def show_input_(self, e, args):
+        if 'required positional argument' in str(e):
+            if sublime_api.window_can_accept_input(self.window.id(), self.name(), args):
+                sublime_api.window_run_command(
+                    self.window.id(),
+                    'show_overlay',
+                    {
+                        'overlay': 'command_palette',
+                        'command': self.name(),
+                        'args': args
+                    }
+                )
+                return True
+        return False
+
     def run_(self, edit_token, args):
         args = self.filter_args(args)
         try:
@@ -1307,28 +1381,53 @@ class WindowCommand(Command):
             else:
                 return self.run()
         except (TypeError) as e:
-            if 'required positional argument' in str(e):
-                if sublime_api.window_can_accept_input(self.window.id(), self.name(), args):
-                    sublime_api.window_run_command(
-                        self.window.id(),
-                        'show_overlay',
-                        {
-                            'overlay': 'command_palette',
-                            'command': self.name(),
-                            'args': args
-                        }
-                    )
-                    return
+            if self.show_input_(e, args):
+                return
             raise

     def run(self):
         pass

+class AsyncWindowCommand(WindowCommand):
+    def run_(self, edit_token, args):
+        args = self.filter_args(args)
+
+        def on_done(future):
+            e = future.exception()
+            if e and not self.show_input_(e, args):
+                traceback.print_exception(type(e), e, e.__traceback__)
+
+        coroutine = self.run(**args) if args else self.run()
+        future = asyncio.run_coroutine_threadsafe(coroutine, asyncio_loop)
+        future.add_done_callback(on_done)
+        return future
+
+    async def run(self, edit):
+        pass
+
+
 class TextCommand(Command):
+    __slots__ = ('view', )
+
     def __init__(self, view):
         self.view = view

+    def show_input_(self, e, args):
+        if 'required positional argument' in str(e):
+            if sublime_api.view_can_accept_input(self.view.id(), self.name(), args):
+                sublime_api.window_run_command(
+                    sublime_api.view_window(self.view.id()),
+                    'show_overlay',
+                    {
+                        'overlay': 'command_palette',
+                        'command': self.name(),
+                        'args': args
+                    }
+                )
+                return True
+        return False
+
     def run_(self, edit_token, args):
         args = self.filter_args(args)
         try:
@@ -1345,24 +1444,46 @@ class TextCommand(Command):
                 finally:
                     self.view.end_edit(edit)
         except (TypeError) as e:
-            if 'required positional argument' in str(e):
-                if sublime_api.view_can_accept_input(self.view.id(), self.name(), args):
-                    sublime_api.window_run_command(
-                        sublime_api.view_window(self.view.id()),
-                        'show_overlay',
-                        {
-                            'overlay': 'command_palette',
-                            'command': self.name(),
-                            'args': args
-                        }
-                    )
-                    return
+            if self.show_input_(e, args):
+                return
             raise

     def run(self, edit):
         pass

+class AsyncTextCommand(TextCommand):
+    def run_(self, edit_token, args):
+        args = self.filter_args(args)
+
+        def on_done(future):
+            e = future.exception()
+            if e and not self.show_input_(e, args):
+                traceback.print_exception(type(e), e, e.__traceback__)
+
+        future = asyncio.run_coroutine_threadsafe(
+            self.executor_(edit_token, args), asyncio_loop)
+        future.add_done_callback(on_done)
+        return future
+
+    async def executor_(self, edit_token, args):
+        if args:
+            edit = self.view.begin_edit(edit_token, self.name(), args)
+            try:
+                return await self.run(edit, **args)
+            finally:
+                self.view.end_edit(edit)
+        else:
+            edit = self.view.begin_edit(edit_token, self.name())
+            try:
+                return await self.run(edit)
+            finally:
+                self.view.end_edit(edit)
+
+    async def run(self, edit):
+        pass
+
+
 class EventListener():
     pass

Example Commands

import asyncio
import sublime
import sublime_plugin

GLYPHS = ("-", "\\", "|", "/", "-", "\\", "|", "/")

async def activity_monitor(view, key):
    index = 0
    try:
        while True:
            view.set_status(key, f"[{GLYPHS[index]}]")
            index += 1
            index %= len(GLYPHS)
            await asyncio.sleep(0.1)
    except Exception:
        pass
    finally:
        try:
            view.erase_status(key)
        except Exception:
            pass

class ActivityMonitorCommand(sublime_plugin.AsyncTextCommand):

    keys = {}

    async def run(self, edit, key):
        task = self.keys.pop(key, None)
        await asyncio.sleep(2)
        if task is None:
            task = asyncio.create_task(activity_monitor(self.view, key))
            self.keys[key] = task
        else:
            task.cancel()

class HelloWorldCommand(sublime_plugin.AsyncApplicationCommand):

    async def run(self, key):
        await asyncio.sleep(2)
        print("hello World")
daveleroy commented 4 years ago

I think you guys want to reconsider the downsides to running the event loop on another thread it makes things a lot more difficult than you realize. At the least people should play around with it to notice some of the downsides before you end up implementing this.

If you are running the loop in a background thread you then need to make any state that you access for sublime commands/events thread safe and it defeats one of the biggest benefits of asyncio. Going from Sublime's main thread to the background event loop and vice versa is annoying and error prone. I can't imagine most plugin authors are going to be able to get this right.

If the biggest use cases are creating futures/tasks that interact with Sublime api's and using things like run_in_executor to wait for tasks to complete on another thread then you really don't need to implement most of asyncio.AbstractEventLoop and just the calling functions (io routines could always be filled in later).

A second loop could always be provided that has full asyncio support on a background thread. Some utility functions like fetch could be provided which performs the work on the full asyncio loop (or a background thread) but can be called/waited for on the main loop.

Here https://gist.github.com/daveleroy/536b0a11280aa6c3ec668f4c980a866f is a very minimal and incomplete loop that provides the basic asyncio functions for futures/tasks that are ran on the main sublime thread and I am probably going to start using in https://github.com/daveleroy/sublime_debugger

deathaxe commented 4 years ago

At the moment we are just collecting ideas how to enable asyncio support in ST. Any help of experienced asyncio experts is very appreciated - especially in order not to end up in something stupid.

deathaxe commented 4 years ago

The minimal event loop from https://gist.github.com/daveleroy/536b0a11280aa6c3ec668f4c980a866f does not work as a drop in replacement. The one way or the other a proper event loop would need to support the IO completion ports on Windows, which the ProactorEventLoop() does. Otherwise it would be useless, because no asyncio could be done via such a loop.

daveleroy commented 4 years ago

It's not supposed to be a drop in replacement that is literally what I said in my comment? It doesn't provide the io sections but it does provide futures/tasks and all the async await syntax for them which I don't think you can characterize as useless.

io and literally any other async tasks can still be done on the loop you just have to implement them using a background thread (or even a full asyncio loop running on a background thread) that when complete moves the results over to the main loop.

Thom1729 commented 4 years ago

I started typing a longer response, but I'd like to highlight a possible solution implied by @daveleroy.

The trouble with running a regular event loop in the main thread is that Sublime is already running its own non-asyncio-compatible event loop, and it can't run both at once. The minimal event loop implementation hooks into Sublime's event loop to handle scheduling, but it can't handle IO like that. In a perfect world, Sublime could use an asyncio event loop instead of its current non-asyncio loop, but that would require rewriting the API interface code to use asyncio, which would presumably be a substantial effort.

A possible solution would be to use the minimal event loop, but use a slaved thread to handle IO. The IO thread would run a regular asyncio event loop. The minimal loop on the main thread would provide IO by calling out to the IO thread.

It's a bit hacky, but it should provide a fully-functional asyncio-compatible loop for the main thread. User code wouldn't have to know or care that the main loop was using a separate thread to handle IO; this would be totally transparent. From a user perspective, this should be a best-case scenario, and there's no need to rewrite or substantially modify Sublime's existing non-asyncio main loop.

Is this what you were thinking of, @daveleroy?

daveleroy commented 4 years ago

Yeah as a full solution that would be ideal but that may be a fair bit of work to do all at once which is why I was kind of hinting at providing some helper functions to provide at least some basic things like fetch and subprocess.call but not necessarily implementing all of the IO functionality (at least initially).

Filling in support for all the IO operations can be added as things progress but even just the bare event loop is still incredibly useful.

Thom1729 commented 4 years ago

If I have time tonight, I'll try to create a toy implementation using an IO thread. If it works, and if the implementation is not bad, then it might be a viable option.

FichteFoll commented 4 years ago

If you are running the loop in a background thread you then need to make any state that you access for sublime commands/events thread safe and it defeats one of the biggest benefits of asyncio. Going from Sublime's main thread to the background event loop and vice versa is annoying and error prone. I can't imagine most plugin authors are going to be able to get this right.

This is a very good point that I didn't think of when we discussed this earlier. While this is already an issue with the async thread for *_async hooks, it's true that a part of async event loops is to prevent race conditions and running the loop on a separate thread entirely defeats that purpose. There are still benefits to having an event loop generally available even in a different thread, but ideally we'd want both.

rwols commented 4 years ago

For what it's worth I have a somewhat working event loop running in the main thread (so not using threading and not using sublime.set_timeout_async), but it's kinda jerky and needs work: https://gist.github.com/rwols/2f24cdf9be284ca57614cbf45bf0fb40

It inherits from asyncio.unix_events.SelectorEventLoop and re-implements run_forever and the protected method _run_once. Most notably:

We should definitely look into a hybrid approach too.

deathaxe commented 4 years ago

Works pretty well.

Just added the following lines and put the whole SublimeEventLoop into my hacked sublime_plugins.py.


if sys.platform == 'win32':
    BasicSublimeEventLoop = asyncio.ProactorEventLoop
else:
    BasicSublimeEventLoop = asyncio.SelectorEventLoop

class SublimeEventLoop(BasicSublimeEventLoop):

...

Modifying the event_callback callers as follows ...

def run_view_callbacks(name, view_id, *args, attach=False, el_only=False):
    v = sublime.View(view_id)

    if attach:
        attach_view(v)

    for callback in el_callbacks(name):
        coro = callback(v, *args)
        if asyncio.coroutines.iscoroutine(coro):
            asyncio_loop.call_soon(asyncio_loop.create_task, coro)

    if el_only:
        return

    for callback in vel_callbacks(v, name):
        coro = callback(*args)
        if asyncio.coroutines.iscoroutine(coro):
            asyncio_loop.call_soon(asyncio_loop.create_task, coro)

def run_window_callbacks(name, window_id, *args):
    w = sublime.Window(window_id)

    for callback in el_callbacks(name):
        coro = callback(w, *args)
        if asyncio.coroutines.iscoroutine(coro):
            asyncio_loop.call_soon(asyncio_loop.create_task, coro)

... calls normal callbacks directly while scheduling async ones as follows ...

class EventListener(sublime_plugin.EventListener):

    async def on_modified(self, view):
        await asyncio.sleep(2)
        print("modify")

CAUTION: DO NOT DEFINE A async on_modified_async() AS IT WOULD BE SCHEDULED FROM THE WRONG THREAD ATM.

kaste commented 4 years ago

Honestly, I'm super confused here. What are you trying to do even?

A set_timeout event loop runs all code on the UI thread, surprises me and probably not what we want. Why not set_timeout_async? (You still can't set the timeout to 0 then so it does not count as a solution. We really want while True semantics without a sleep.)

Naively, a user could expect async def on_modified to equal def on_modified_async exactly with the meaning (not the implementation): "I'm not blocking the UI" which wouldn't be the case here.

But generally, we lack awaitables here. Do you expect "ideally" that sublime provides await view.substr_async(r) et.al. in some future? What does the event loop do if we don't have awaitables. Currently, we have sleep and subprocess, and with sleep you can't write programs.

Examples: In SublimeLinter, in whatever ... 3 kLOCs we have exactly one function doing the subprocess.communicate, everything else has no awaitable code atm, so it would not run cooperatively. Should we really make async defs everywhere down the hole until we reach that communicate function? Should we run this on a shared event loop? In GitSavvy: we have a lot of subprocess calls, but we need to be blocking here to maintain order of execution. (Basically, do not checkout while a commit runs, do not commit while staging etc.) So a "ThreadPool" with one worker does a pretty good job here.

Generally, commands run on the UI thread, esp. for the TextCommand, you can't have multiple valid edit_tokens at any time, or can we?

TL;DR Since the main event loop is a global, I understand that Sublime could prepare it so that end-users don't do it wrong. IMO it should not run on the UI thread because the whole purpose of the UI thread is to block and to be uncooperative. On the other hand we don't need two workers, the old-school worker plus a new asyncio one. So, I think, the current worker should become an asyncio thread.

deathaxe commented 4 years ago

What are you trying to do even? IMO it should not run on the UI thread because the whole purpose of the UI thread is to block and to be uncooperative.

While rwols was starting to suggest a dedicated thread for a global event loop to run many background tasks in a none blocking fashion, some of the participants find a dedicated thread to defeat the advantages an asyncio event loop provides.

So we have a background vs. main thread conversion here. The snippets and ideas just try to accomplish some concepts with the API we have knowing it is not an ideal solution though. The run_forever() should be part of the main eventloop somehow without any delays being required to keep it calm. Ideally each run_soon wakes it up and it is sent to sleep after all scheduled tasks are done.

Naively, a user could expect async def on_modified to equal def on_modified_async exactly with the meaning (not the implementation): "I'm not blocking the UI" which wouldn't be the case here.

Sure, you are completely right. It just tries to illustrate a possible way how to provide support for asyncio compatible event handlers. It is no final nor best practice solution. It also doesn't answer the UI vs. background thread question.

So a "ThreadPool" with one worker does a pretty good job here.

Of course, it does, but with each plugin creating its own ThreadPool we may quickly end up with "dozens" of idle threads in the background just sleeping or waiting for a blocking request. This is what asyncio tries to resolve/avoid.

The whole asyncio strategy is meant for tasks whose main runtime consists of waiting for external I/O from eigher a network resource or process. Instead of having a couple of sleeping threads waiting for an answer from a slow process, we'd just prefer one thread doing something in the meanwhile, assuming all tasks are short enough in computation time to not block the UI too long.

It is no general purpose solution for time intensive computations though.

Generally, commands run on the UI thread, esp. for the TextCommand, you can't have multiple valid edit_tokens at any time, or can we?

Not all text commands do something with text. They sometimes just refer to a view. You can always use a sync command and schedule an asynchronous function from there. Ideally an plugin author shouldn't need to do so but only write an async keyword in front of the run() method or the on_blablabla event handler to tell ST to be scheduled in the event loop. -> just an idea how to propably being transparent.

The main questions for me are:

  1. How to integrate the async/await mechanism into the existing infrastructure without breaking existing plugins?
  2. How to integrate the async/await mechanism in a transparent manner without too much hazzle for plugin authors?
  3. Where to run the loop background vs. main thread?

In GitSavvy: we have a lot of subprocess calls, but we need to be blocking here to maintain order of execution.

From how I understand the async/await syntax atm, you could easily write linear code with one subprocess call after another which would then be scheduled with the calling order being maintained. It should even be easier than todays callback stuff needed when handling threaded calls.

Calling

await subprocess.run("command1")
await subprocess.run("command2")
await subprocess.run("command3")

Would just run all commands in correct order but without blocking while waiting for answers, so other tasks can run in the meanwhile.


I personally don't find async/await useful for most of the simpler plugins which just run a few lines of code and return control to ST then. The effort needed for scheduling things is just too high for what we get in return. The only sense it makes if for plugins like LSP, SublimeLinter, GitGutter, and such which communicate with external processes atm.


Disclaimer: These notes and comments are made from my current and limited knowledge about asyncio. They might not be perfect nor do they intend to offend anybody.

FichteFoll commented 4 years ago

Ideally an plugin author shouldn't need to do so but only write an async keyword in front of the run() method or the on_blablabla event handler to tell ST to be scheduled in the event loop.

Not the greatest idea for text commands because of the then-redundant edit parameter still being provided. A separate class, method or decorator should make the change to a different function signature more obvious and prevent plugin authors from trying to use the invalid edit object.

deathaxe commented 4 years ago

The edit object is created via begin_edit and destroyed via end_edit. Not sure why you assume it to become invalid. It just lives a little longer.

class AsyncTextCommand(TextCommand):

    def run_(self, edit_token, args):
        args = self.filter_args(args)

        def on_done(future):
            try:
                future.result()
            except TypeError as e:
                if self.show_input_(e, args):
                    return
                raise
            finally:
                future.remove_done_callback(on_done)
                future = None

        future = asyncio.ensure_future(self.executor_(edit_token, args))
        future.add_done_callback(on_done)

        def callback():
            return future

        asyncio_loop.call_soon(callback)

    async def executor_(self, edit_token, args):
        if args:
            edit = self.view.begin_edit(edit_token, self.name(), args)
            try:
                return await self.run(edit, **args)
            finally:
                self.view.end_edit(edit)
        else:
            edit = self.view.begin_edit(edit_token, self.name())
            try:
                return await self.run(edit)
            finally:
                self.view.end_edit(edit)

    async def run(self, edit):
        pass
FichteFoll commented 4 years ago

Edit objects are synchronized on the UI thread to group actions into single undo events and allowing them to work in an asychronous context works against its internal synchronization. Either way, this is probably too much of an implementation detail to discuss heree.

kaste commented 4 years ago

I just want to say something again. An asyncio loop would be very, very useful. It would be awesome. I would use it tomorrow. I was indeed confused by some ideas here. I don't want a "second" worker because it makes everything too complicated (locks, "call_soon_threadsafe" as the entrypoint...), and I'm pretty sure I don't want it to be the ui thread basically because it breaks with how we write plugins today.

I want to make a proposal here.

T.i. Sublime does not dispatch the async variants anymore, it only talks to the UI/main thread, everything else is in python land. (Luckily, we only have very few entrypoints from Sublime host to the worker thread client. (Unlike the UI thread which handles tricky jobs e.g. when you "run_command" from the worker thread, the worker blocks, the ui thread python code runs, and then after the command finished the worker resumes. 🤯))

This is intended to be 100% compatible. As long as nobody awaits something we only put relatively large, non-cooperative tasks on the event loop which just run ordered fifo.

After that plugin authors can start to split up their code in smaller chunks. E.g. GitSavvy could compute intra-line diffs on the ThreadPool, but via await we have easy semantics to join (the result) on the worker. E.g. for SublimeLinter preparing the error panel if you have hundreds of errors blocks the worker too long. With asyncio we could delegate to a common ThreadPool here and again join on the worker. So the main driver here in Sublime land is not to parallel fetch 100 urls, but to make the worker tasks smaller and more cooperative.

rwols commented 4 years ago

@daveleroy

I think you guys want to reconsider the downsides to running the event loop on another thread it makes things a lot more difficult than you realize. At the least people should play around with it to notice some of the downsides before you end up implementing this.

If you are running the loop in a background thread you then need to make any state that you access for sublime commands/events thread safe and it defeats one of the biggest benefits of asyncio. Going from Sublime's main thread to the background event loop and vice versa is annoying and error prone. I can't imagine most plugin authors are going to be able to get this right.

This is not a problem because we have sublime.set_timeout to run things on the UI thread. And to be honest, I would rather see such an event loop not run/poll on the UI thread as this slows things down for regular use.

I fully agree with @kaste's outline about replacing the existing worker thread, but there's probably going to be some backwards-incompatible changes when doing so.

rwols commented 4 years ago

I have changed the title to "provide a loop" because it's a bit ambiguous what "the" loop is. At least,

daveleroy commented 4 years ago

This is not a problem because we have sublime.set_timeout to run things on the UI thread. And to be honest, I would rather see such an event loop not run/poll on the UI thread as this slows things down for regular use.

My concern is not about running things on the ui thread which you can easily do.

My concern is that if asyncio is added people are going to want to use it because it's a useful language feature and they are going to ignore that it is running on a separate thread which you can't really do when the majority of entry points into a plugin are coming from the ui thread and will be accessing state that is shared between them.

rwols commented 4 years ago

What you describe is a well-known gotcha of running functions in the existing worker thread with sublime.set_timeout_async. Can you give some examples that people will trip over if such an event loop were to run in a separate thread?

An event loop that runs in the UI thread would also have gotchas; you'd have to check whether the view is valid after an await anyway:


class Listener(sublime_plugin.ViewEventListener):

    def on_modified(self):
        sublime.get_event_loop().create_task(self.do_things_non_blocking())

    async def do_things_non_blocking(self):
        data = await get_some_data()
        if not self.view.is_valid():
            return   # oops, the view died in the meantime
        self.view.add_regions(data)
daveleroy commented 4 years ago

Not having to worry about data races between schedule points is one of the biggest advantages of asyncio and cooperative multitasking. Cleaning up that gotcha by explicitly marking schedule points is a big part of why asyncio is useful.

Your example illustrates that by explicitly giving up control and having the state change under you. In the multithreaded version that shared state could change anywhere not just between where you explicitly gave up control.

rchl commented 4 years ago

Using this feature would be tricky (especially for novices) but I don't think that should be an argument for not implementing it. Blocking progress just because it might be misused is not the way to go. Instead that should be handled by educating people (starting from documentation).

There are already many ways to create bad code that will cause issues and people will do that regardless of what tools we give them.

daveleroy commented 4 years ago

Adding a background loop isn't really the issue if that is the only thing that is getting added.

However there are already a fair number of requests for various async apis that if added basically every plugin author is going to start using. They shouldn't be tricky to use and they shouldn't introduce threading issues and frankly they shouldn't have to.