Several parts of the codebase already use async, including the REST API calls and the search engine passthroughs. The pipeline and file I/O are not yet awaitable and need to be.
Specifically, work should focus on two areas:
Async tokenization step: since spaCy uses Cython for pipeline processing, it should be possible to make tokenization of documents non-blocking (remember that parsing a doc with spaCy can take a while).
File reads and writes, especially during storage and (re)indexing: when a document is written to disk, or read from disk and sent to a search engine, the operation should await the file read/write.
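One way both points could be approached is to push the blocking calls onto worker threads with asyncio.to_thread (Python 3.9+). The following is only a minimal sketch under assumptions, not the project's implementation: tokenize_blocking is a hypothetical stand-in for the real spaCy call (something like [t.text for t in nlp(text)]), and the names tokenize, write_doc, read_doc, and store_and_reload, as well as the JSON-on-disk layout, are invented for illustration. Whether spaCy's Cython code releases the GIL during parsing is not confirmed here; offloading to a thread keeps the event loop responsive either way, but a process pool may be needed if CPU contention turns out to be a problem.

```python
import asyncio
import json
import tempfile
from pathlib import Path


def tokenize_blocking(text: str) -> list[str]:
    """Hypothetical stand-in for a blocking spaCy parse of `text`."""
    return text.split()


async def tokenize(text: str) -> list[str]:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(tokenize_blocking, text)


async def write_doc(path: Path, doc: dict) -> None:
    # Path.write_text blocks, so it is also pushed to a worker thread.
    await asyncio.to_thread(path.write_text, json.dumps(doc), encoding="utf-8")


async def read_doc(path: Path) -> dict:
    # Same pattern for reads: await the blocking read, then decode the JSON.
    raw = await asyncio.to_thread(path.read_text, encoding="utf-8")
    return json.loads(raw)


async def store_and_reload(text: str, path: Path) -> dict:
    # Tokenize, store to disk, then pull the document back off disk,
    # awaiting each blocking step instead of stalling the loop.
    tokens = await tokenize(text)
    await write_doc(path, {"text": text, "tokens": tokens})
    return await read_doc(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "doc.json"
        doc = asyncio.run(store_and_reload("parse this document", target))
        print(doc["tokens"])
```

The same shape would apply during (re)indexing: read_doc could be awaited before handing the document to the existing async search engine passthrough, so other requests keep being served while the disk read is in flight.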