Open hoodmane opened 3 years ago
Your understanding of everything is pretty much accurate.
Would you consider just asking your students to add async and await everywhere? I understand it is notational pollution that could be avoided using native IO and may be hard for students to understand. I guess it may be pedagogically bad to make students keep track of two different types of functions and two different calling conventions... But it's unfortunate that Python has a language builtin tool to help you deal with this situation and you can't use it. And these days there are many native IO libraries that use async/await rather than select.
I've investigated using emscripten's Asyncify framework, and I'm pretty sure I can get that to create a synchronous C function that will await a JavaScript promise. But I'm not sure if I can call the C function from Python in a way that is compatible with the Asyncify system. And even if I can, it will be pretty kludgy.
I think this might be the least worst solution.
I'm now thinking about using shared memory between the browser thread and the worker, then running a busy loop in the worker that polls for a response via the shared memory.
Please no! This would surely work, but it is pretty sad...
So I just wanted to check, is there currently any path to get synchronous Python code to wait for a JavaScript promise or async function?
One remaining route you could try: use Atomics.wait and Atomics.notify. The docs say that this should allow you to block a worker thread while waiting for a response from a different thread. I tried to get this to work before but it didn't seem to do anything. Note that you can only use Atomics.wait on an Int32Array (or a BigInt64Array); for some bizarre reason it is disabled on other TypedArray types.
Note also that Atomics.wait has no Safari support. It has two years of support on Chrome and six months of support on Firefox:
https://caniuse.com/mdn-javascript_builtins_atomics_wait
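A minimal sketch of the Atomics.wait / Atomics.notify handshake suggested above, assuming SharedArrayBuffer is available to both threads. The buffer layout and the names createChannel, sendReply and waitForReply are hypothetical, chosen only for illustration; none of them are part of Pyodide.

```javascript
// Assumed layout: Int32 slot 0 = "reply ready" flag, Int32 slot 1 = byte length,
// UTF-8 payload starting at byte 8.
const HEADER_BYTES = 8;

function createChannel(capacity = 1024) {
  return new SharedArrayBuffer(HEADER_BYTES + capacity);
}

// Main-thread side: publish the reply, then wake the blocked worker.
function sendReply(sab, text) {
  const header = new Int32Array(sab, 0, 2);
  const payload = new Uint8Array(sab, HEADER_BYTES);
  const bytes = new TextEncoder().encode(text);
  if (bytes.length > payload.length) throw new RangeError("reply too large");
  payload.set(bytes);
  Atomics.store(header, 1, bytes.length);
  Atomics.store(header, 0, 1);
  Atomics.notify(header, 0);
}

// Worker side: block (no busy loop) until the flag leaves 0, then read the reply.
function waitForReply(sab) {
  const header = new Int32Array(sab, 0, 2);
  // Returns "not-equal" immediately if the reply is already there.
  Atomics.wait(header, 0, 0);
  const length = Atomics.load(header, 1);
  // Copy out of shared memory before decoding; some TextDecoder
  // implementations reject views backed by a SharedArrayBuffer.
  const bytes = new Uint8Array(sab, HEADER_BYTES, length).slice();
  Atomics.store(header, 0, 0); // reset the flag for the next round trip
  return new TextDecoder().decode(bytes);
}
```

In a browser, waitForReply has to run in the worker, since Atomics.wait throws on the main thread. A synchronous replacement for Python's input() could then post a message asking for input and call waitForReply to get the answer.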
@hoodmane, thanks a lot for the confirmation and suggestions.
I've done some tests using emscripten's Asyncify with a C function, but the news is not good, at least for me. I was hoping to add a synchronous C function to the module, then call it from Python as if it were a synchronous JavaScript function, then have it pause (and run the event loop) until the promise was resolved. However, it turns out that the description of Asyncify is not quite right (although the name is). Asyncify effectively turns the C function into an async function. It doesn't let you call async functions synchronously and wait for the result. Instead, if you call a C function from JavaScript, and the C function uses Asyncify to await JavaScript code, then the C function is split and returns 0 immediately. Later, when the call is finished, Asyncify rewinds the C function and runs the portion after the handleAsync call. This is very similar to how an async JavaScript function works.
I was hoping that Asyncify somehow paused the C function and processed the JavaScript event loop for a while, before returning control back to the C function, which could then return a value to whatever called it. I think Python's run_until_complete() does something like that, but I haven't seen anything similar for JavaScript. In retrospect I should have guessed that wasn't happening here, since the Asyncify documentation talks about rewinding the stack, which is more like what async/await does.
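To illustrate the comparison with an async JavaScript function, here is a small sketch in plain JavaScript (not Asyncify itself; all names are made up for the example). A synchronous caller gets control back at the first await and receives a Promise, not the final value.

```javascript
const events = [];

async function fetchAnswer() {
  events.push("started");
  await Promise.resolve(); // suspension point: control returns to the caller here
  events.push("resumed"); // runs later, after the caller has moved on
  return 42;
}

function callSynchronously() {
  const result = fetchAnswer(); // runs only up to the first await
  events.push("caller continues");
  return result; // a pending Promise, not 42
}

const pending = callSynchronously();
```

There is no way for callSynchronously to obtain 42 before it returns; the value is only reachable through pending.then(...) or await.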
So to use Asyncify for this, I think I would need to build pyodide myself using the Asyncify flags. Then I'd need to use Asyncify.handleAsync to call async JavaScript code from Python. I think I would also need to tell Asyncify the name of the C function that calls the JavaScript code -- I couldn't figure out where that happens after a few minutes of searching. (It would be really cool to add a built-in behavior in pyodide that uses Asyncify.handleAsync whenever pyodide is running synchronously and the user calls an async JavaScript function. If that only ever happened in one special-purpose function, maybe that would reduce the performance problems?)
That's a little more than I can take on for now. I'll take a look at the other options, or maybe just have students presupply inputs in a textbox. That's good enough for this; I was just hoping to get the nice back-and-forth input dialog working.
Have you looked into Atomics.wait? It sounds extremely promising.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Atomics/wait
I was a little scared off by your comment that "I tried to get this to work before but it didn't seem to do anything." And I'm anxious about narrowing the field of supported browsers so much. But I could give it a quick look, at least for proof of concept.
This recent blog post seems to say that Atomics.wait works for what you want here:
https://jasonformat.com/javascript-sleep/
They have a demo here: https://sleep-sw.glitch.me/ and the source code: https://glitch.com/edit/#!/sleep-sw?path=worker.js%3A1%3A0
I got blocking IO working with Atomics.wait and Atomics.notify here:
https://hoodmane.github.io/worker-pyodide-console/
https://github.com/hoodmane/worker-pyodide-console
It also handles keyboard interrupts. The github link only works in Chromium-based browsers because it isn't served with the correct cross origin policy, but served from an appropriate server it would work in Firefox too.
Hey, I needed this too, and I've got some code in progress that throws an exception to jump out of python to JavaScript then recreates the python stack to go right back to the original exception location. It seems to work okay. I have students with iOS who have to use Safari, so I couldn't use web workers.
What this means is that a) you can jump out to receive JavaScript messages any time you want. b) you can do delays or io waits in JavaScript without spin locking anything.
https://github.com/joemarshall/unthrow/blob/main/unthrow/unthrow.pyx
It should work in wasm but I haven't built it in pyodide yet.
@mfripp not sure if you're following this issue but I'm also doing similar to you, creating a consistent online python environment for students to use, and the stuff I posted above about checkpointing python so JavaScript can run might be of use to you also.
function run(code) {
  // async input
  pyodide.globals.input = async function () {
    return new Promise((r) => setTimeout((_) => r('async_input')))
  }
  // +await
  code = code.replace(/\binput\s*[(]/g, 'await $&')
  // run
  pyodide
    .runPythonAsync(code)
    .then((output) => output && console.log(output))
    .catch((err) => {
      console.log(err)
    })
}
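To make the effect of the rewrite step in the snippet above concrete, here is the same regular expression applied to a few sample strings. The helper name addAwaitToInput is made up for this example.

```javascript
// Same pattern as above: prefix every `input(` call with `await`.
function addAwaitToInput(code) {
  return code.replace(/\binput\s*[(]/g, 'await $&');
}

const simple = addAwaitToInput('name = input("Name? ")');
// The word boundary \b keeps identifiers such as my_input untouched.
const otherName = addAwaitToInput('x = my_input()');
// The rewrite is purely textual, so string literals are rewritten too.
const insideString = addAwaitToInput('print("call input() now")');
```

Because the substitution is textual, it also inserts await inside ordinary def functions, where Python rejects it as a syntax error, so this approach only suits code that calls input() at the top level or inside async functions.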
As an update, I think @joemarshall's unthrowables are a good idea. Hopefully we can get some version of them added in v0.18.
Closed in favor of #1503.
I'm also trying to port an educational platform (http://futurecoder.io/) to pyodide. Nice to have so much company!
unthrow looks very interesting, but I don't feel great about it. It also seems to rely on sys.settrace() which prevents other debuggers like pdb from working, so this goes against #550.
Atomics.wait() seems like it works great from your demo, and I'm not too bothered about telling Safari users to use Chrome and not support mobile.
But that blog post https://jasonformat.com/javascript-sleep/ suggests another ingenious idea: synchronous XHR intercepted by a service worker. I managed to get this working, see https://replit.com/@alexmojaki/pyodide-service-worker-input for source and https://pyodide-service-worker-input.alexmojaki.repl.co/ to try it out. Open the dev console to see when pyodide is ready. Then for example you could first run this code:
print(1)
print(input() * 2)
print(2)
print(input() * 3)
print(3)
and then 'Run' two arbitrary strings which will go into input().
Here's how it works. When you first click Run, it posts a message to the web worker:
worker.postMessage(code.value)
The web worker listens for messages and passes the code to pyodide:
pyodide.globals.get("run_code")(e.data)
The python function run_code calls exec(code) which eventually hits input(), calling sys.stdin.readline, which is patched by the custom function get_input(), which calls the JS function getInput() in the web worker:
function getInput(output) {
  // Tell the browser thread about input printed so far, and that user is waiting for input()
  postMessage({awaitingInput: true, output: output});
  const request = new XMLHttpRequest();
  request.open('GET', '/get_input/', false); // `false` makes the request synchronous
  request.send(null);
  return request.responseText;
}
The request is intercepted by the service worker:
let resolver;
addEventListener('fetch', e => {
  const u = new URL(e.request.url);
  if (u.pathname === '/get_input/') {
    e.respondWith(new Promise(r => resolver = r));
  }
});
addEventListener('message', event => {
  resolver(new Response(event.data,{status:200}));
});
The service worker responds with a promise that is resolved when it receives a message from the browser thread. Meanwhile back in the browser thread, the message is received from the worker's getInput() so that it can display the output so far and note that python's input() is waiting. Therefore it knows to send the next message to the service worker instead of the web worker, which will eventually be received in python's input().
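The browser-thread side of this flow is not shown above, so here is a rough sketch of how the routing could look. The name createRunRouter and the shape of its return value are assumptions for illustration; in a real page, webWorker would be the Worker instance and serviceWorker would typically be navigator.serviceWorker.controller.

```javascript
// Decides whether the next "Run" click carries new code for the web worker
// or an answer for a pending input() held open by the service worker.
function createRunRouter(webWorker, serviceWorker) {
  let awaitingInput = false;
  return {
    // Call with the data of each message received from the web worker.
    onWorkerMessage(data) {
      if (data.awaitingInput) awaitingInput = true;
      return data.output; // output printed so far, for display
    },
    // Call when the user clicks Run.
    submit(text) {
      if (awaitingInput) {
        awaitingInput = false;
        serviceWorker.postMessage(text); // resolves the pending /get_input/ request
        return "input";
      }
      webWorker.postMessage(text); // start running new code
      return "code";
    },
  };
}
```

The returned strings only report which path was taken, which makes the router easy to exercise with stand-in objects that record postMessage calls.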
I'm very new to all of this stuff so I may be missing something critical, but it seems to work fine. I think service workers may be killed randomly so getInput() will probably need to make requests in a loop.
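One possible shape for such a retry loop is sketched below. This is an assumption about how it could be done, not what the linked demo does; the names syncXhrGet and getInputWithRetry and the attempt limit are made up for the example.

```javascript
// Synchronous GET, as in getInput() above (only usable in a web worker in practice).
function syncXhrGet(url) {
  const request = new XMLHttpRequest();
  request.open("GET", url, false); // `false` makes the request synchronous
  request.send(null);
  return { status: request.status, responseText: request.responseText };
}

// Retry the blocking request until it yields a 200 response.
function getInputWithRetry(sendRequest = () => syncXhrGet("/get_input/"), maxAttempts = 10) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const reply = sendRequest();
      if (reply.status === 200) return reply.responseText;
    } catch (err) {
      // A synchronous send() throws on network errors, for example when the
      // service worker went away mid-request; fall through and try again.
    }
  }
  throw new Error(`no input received after ${maxAttempts} attempts`);
}
```

If the service worker was restarted, the resolver it held is gone as well, so the browser thread would also need some way to resend the user's input; that part is not covered here.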
It also seems to rely on sys.settrace() which prevents other debuggers like pdb from working, so this goes against #550.
I have a way to fix this, the low level work for this is already underway: https://github.com/pyodide/pyodide/blob/main/cpython/patches/0001-Add-pyodide-callback.patch
synchronous XHR intercepted by a service worker
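Cool idea. So now we have three possible approaches: (1) unthrow, (2) Atomics.wait in a normal web worker, and (3) synchronous XHR intercepted by a service worker.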
Ideally we could use unthrow for development: setting up a service worker can be a bit of a pain and it breaks the expectation that refreshing the page gives a completely fresh start, while a normal worker at least has no persistent state but still slows development.
Then once logic is more or less ready, we could switch to 2 or 3 for release.
We have to be careful to make it possible to swap out these different techniques, but I have ideas for that.
I've written a library https://github.com/alexmojaki/sync-message which helps with synchronous IO with a worker using either SharedArrayBuffer or a service worker. I'm using it in my own https://github.com/alexmojaki/futurecoder and I've also integrated it into https://github.com/dodona-edu/papyros. It has no dependencies and is meant to fit any use case, not just Pyodide. I think it would fit well in https://github.com/hoodmane/synclink.
Poking this thread, it appears that Atomics.wait and SharedArrayBuffer are now both available in Safari on Desktop and Mobile (since December 2021):
https://caniuse.com/mdn-javascript_builtins_atomics_wait
https://caniuse.com/sharedarraybuffer
Can anyone else confirm that? Maybe I'm reading it wrong.
So perhaps the right approach is for Pyodide to provide a Promise shim for this so the stdout/stderr/stdin callbacks passed to loadPyodide can be async functions? Then the whole service worker part of this can be avoided. The same shim could be used for other streams too (e.g., webrtc, websockets).
Yes, SharedArrayBuffer has been available on Safari for a while, but it requires cross origin isolation on any browser, and that's often a problem on its own. In futurecoder it messed with a few things so I chose to drop that path and only use service workers, but still with sync-message.
If you're happy for your site to be served with cross-origin isolation enabled, then it is possible to enable isolation using a service worker. This means you can write everything with atomics, which works great and feels like the right way until the stack switching proposals are implemented in browsers. And you can still host it anywhere, e.g. GitHub pages.
Enabling isolation through a service worker doesn't work in incognito mode, but neither do the other service worker based solutions, so no loss there.
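A rough sketch of the idea, assuming a service worker that re-serves every response with the two headers browsers require for cross-origin isolation. The function name withIsolationHeaders is made up for the example, and a production setup needs more care (registration, reloading the page once the worker takes control, cross-origin subresources).

```javascript
// Return a copy of the response that carries the cross-origin isolation headers.
function withIsolationHeaders(response) {
  // Opaque responses (status 0) cannot be reconstructed; pass them through.
  if (response.status === 0) return response;
  const headers = new Headers(response.headers);
  headers.set("Cross-Origin-Opener-Policy", "same-origin");
  headers.set("Cross-Origin-Embedder-Policy", "require-corp");
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}

// Only install the fetch handler when actually running as a service worker.
if (typeof ServiceWorkerGlobalScope !== "undefined" && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener("fetch", (event) => {
    event.respondWith(fetch(event.request).then(withIsolationHeaders));
  });
}
```

Once the page is controlled by such a worker and reloaded, self.crossOriginIsolated should report true and SharedArrayBuffer becomes usable even on static hosts that cannot set response headers.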
For those looking for a solution using service workers and synchronous XHR requests, I've added support for this in my library react-py, runnable example here.
Of course you don't have to use React, the code can be found on GitHub - look for the workers and providers directories.
Tested with Vite, Next.js and Docusaurus + GitHub pages (docs site).
Is there any conclusion on a recommended way to implement Pyodide in a web worker for IO input?
It seems that there have been new developments in this area elsewhere. So it would be kinda cool to have an update on an overview. @alexmojaki has done a lot of work on this: sync-message, pyodide-worker-runner @hoodmane has also done work on it: synclink
@hoodmane also started the following issue: Improved Webworker ease of use @jtpio started the issue: Matplotlib backend in a webworker
We should make an overview of packages and their use cases in a webworker.
I imagine all these functionalities should be built and documented inside Pyodide at some point. Right now, we use plugins for it. It'd be really helpful to know exactly which plugins to use.
For debugging, there seems to be the following two packages: snoop, birdseye. @alexmojaki I see your contributions on Pyodide all over the net. Are you associated or in contact with the Pyodide developers in any way? I imagine they could really use your work.
I myself am replacing Skulpt with Pyodide for the following website XLogo. And I really want to accomplish that task in a stable and maintainable way.
Here's a rough guide:
For a full example of pyodide-worker-runner in action, see:
@alexmojaki I see your contributions on Pyodide all over the net. Are you associated or in contact with the Pyodide developers in any way? I imagine they could really use your work.
I had a chat with @hoodmane and there was some interest in using sync-message at the core of synclink, but then nothing happened.
Ultimately IMO it would make sense to use sync-message as a foundation everywhere to abstract away the nitty-gritty technical details of synchronous communication and allow easily switching between COI and service workers. I'm curious as to why @elilambnz and @ojeytonwilliams chose to reimplement this stuff themselves.
For debuggers:
If you get stdin to work synchronously, then pdb (the default used by breakpoint()) should just work.
If you use pyodide-worker-runner combined with python_runner (as you may be doing already to take care of all the sync input stuff) then python_runner makes it easy to use snoop.
I myself am replacing Skulpt with Pyodide for the following website XLogo. And I really want to accomplish that task in a stable and maintainable way.
@Hephaistos7 are you still working on porting XLogo? Is your work on this public?
Unfortunately, the work isn't public and my time actively working on it is over. But the port worked and we are using Pyodide successfully, using COI and alexmojaki's libraries: sync-message, pyodide-worker-runner and comsync (as linked in the previous answer)
I'm building a web page where students can edit Python code, then click "Run" to run it in pyodide. I started with all of this running in the main browser thread, which worked pretty well. As part of this, I added some glue code so that input() in Python became a prompt() in the browser.
However, when pyodide runs in the main thread, my code cannot update the display (e.g., show output), and controls are locked, so I can't have a "Stop" button to kill Python code that runs out of control. So I am working on a version where pyodide runs as a web worker. I've got that working for most of what I want, except for getting input back from the user.
Specifically, I send messages from the worker to the main browser thread to launch the prompt, then send a message back from the main thread to the worker with the result. The problem is, I can't see any way to get the Python code to wait for the return message. I think that may be possible in general, if I write all-async code. The problem is, the incoming code is not written for an async environment (e.g., the students will use input(), not await input()). So even if I patch input(), my new version has to be a sync function to give a result back to the main code. So then my patched input() can't await an async function, e.g., a JavaScript promise that clears when the message comes back from the main browser thread.
I tried using asyncio.get_running_loop().run_until_complete() to await the promise, but this doesn't do anything, as noted above. I've investigated using emscripten's Asyncify framework, and I'm pretty sure I can get that to create a synchronous C function that will await a JavaScript promise. But I'm not sure if I can call the C function from Python in a way that is compatible with the Asyncify system. And even if I can, it will be pretty kludgy.
I'm now thinking about using shared memory between the browser thread and the worker, then running a busy loop in the worker that polls for a response via the shared memory (since I don't think sleep is possible). But I'd rather not use such a narrow solution and peg the processor that way.
So I just wanted to check, is there currently any path to get synchronous Python code to wait for a JavaScript promise or async function?
Originally posted by @mfripp in https://github.com/iodide-project/pyodide/issues/1158#issuecomment-775632060