Closed antocuni closed 8 months ago
@antocuni, good news.... we disagree! 🤣
Ok, with my serious face on, I do agree that the current design/implementation is prone to confusing edge cases but, to properly tackle it, I think we need to separate the topics listed above: 1. how we handle execution flow (and related time to fetch and load files) 2. execution as modules or in the global namespace
On 1, the question is serial vs. parallel execution. Basically, should we execute <py-script> tags in order, waiting for each tag to load and run before we go on to the next one, or can we run all <py-script> tags as if they were independent and decoupled from each other? This is actually also very relevant to the work that needs to be done as we move towards supporting web workers btw...
On the above, I vote to give users an option and have an explicit attribute that tells us how to execute their code.
On topic 2, I do acknowledge that we have a corner case in the situation presented above, when using a file both as a module (loaded in paths) and as src in the <py-script> tag, but I think it's a matter of better defining the current API rather than a full redesign. In fact, I'm -10 on all the proposals above. It really feels unnatural and overcomplicated.
In my mind, the very simple solution is that we do not allow users to use both <py-config>paths=['foo.py']</py-config> and <py-script src="foo.py"></py-script>, and we raise an error if they do. In fact, in that case they should have done <py-script src="bar.py"></py-script> (where bar contains import foo) or <py-script>import foo .... </py-script>
If we really want, we could also support <py-script import="foo.py"></py-script> but I'd find it confusing and I can't find a reason users would be doing that instead of <py-config>paths=['foo.py']</py-config>. With that said, if there's a use case for it and our users need something like that, I'm more than happy to change my mind.
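To make the "raise an error" idea concrete, here is a minimal sketch in plain Python. The helper names find_src_path_conflicts and validate_page are hypothetical (they are not PyScript API), and the path normalization is an assumption about how "the same file" would be detected.

```python
import posixpath


def find_src_path_conflicts(config_paths, script_srcs):
    """Return files listed both in paths and as a src (hypothetical helper)."""
    normalized_paths = {posixpath.normpath(p) for p in config_paths}
    normalized_srcs = {posixpath.normpath(s) for s in script_srcs}
    return sorted(normalized_paths & normalized_srcs)


def validate_page(config_paths, script_srcs):
    """Raise if any file is used both as a module (paths) and as a src."""
    conflicts = find_src_path_conflicts(config_paths, script_srcs)
    if conflicts:
        raise ValueError(
            "files used both in paths and as src: " + ", ".join(conflicts)
        )
```

With this sketch, validate_page(["foo.py"], ["bar.py"]) passes silently, while validate_page(["foo.py"], ["./foo.py"]) raises ValueError.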
@antocuni, good news.... we disagree!
π π
Ok, with my serious face on, I do agree that the current design/implementation is prone to confusing edge cases but, to properly tackle it, I think we need to separate the topics list above: 1. how we handle execution flow (and related time to fetch and load files) 2. execution as modules or in the global namespace
Yes, sorry for having mixed the two. I started to write the issue to explain (2) and when I tried to write the example I noticed (1). I couldn't explain (2) without explaining (1), that's why they are together.
My take on (1) is that the approach "let's happily allow top-level await and use runPythonAsync
without really understanding all the implications" is causing way too many problems and should be rethought. But that's a topic for another discussion, let's not have it here.
[cut]
On the topic 2, I do acknowledge that we do have a corner case in the situation presented above, when using a module both as module (loaded in paths) and as
src
in the<py-script>
tag but I think it's a problem on better defining the current API rather than a full redesign. In fact, I'm -10 on all the proposals above. It really feels unnatural and over complicated.
The difference in our views is that what you consider a "corner case" is really a fundamental property of the current semantics of src, which doesn't play well with the Python semantics. We cannot change Python semantics but we can change src.
Before going into the details, a quick note on the "design methodology" that we should follow IMHO
In my mind, the very simple solution is that we do not allow users to both
<py-config>paths=['foo.py']</py-config>
and<py-script src="foo.py"></py-script>
and we raise an error if they do.
This is not a simple solution; it is a workaround to fix one specific problem of this approach, but it doesn't solve the more fundamental problems. Let me try to explain better.
The current semantics of <py-script src=...>
is roughly the following:
import __main__
pysrc = open(src).read()
exec(pysrc, __main__.__dict__)
I.e., you are taking the python code and executing it in the global namespace (which is just the namespace of the __main__
module -- that's how pyodide works).
This opens a can of worms because it leads to all sorts of confusing behaviors.
The first random example which comes to my mind:
# rectangle.py
import js
from pyodide.ffi import create_proxy

def draw():
    print('drawing a rectangle')

def on_click(event):
    print('clicked on btn-rect')
    draw()

on_click = create_proxy(on_click)
js.document.getElementById('btn-rect').addEventListener('click', on_click)
# circle.py
import js
from pyodide.ffi import create_proxy

def draw():
    print('drawing a circle')

def on_click(event):
    print('clicked on btn-circle')
    draw()

on_click = create_proxy(on_click)
js.document.getElementById('btn-circle').addEventListener('click', on_click)
<py-script src="rectangle.py"></py-script>
<py-script src="circle.py"></py-script>
Live example If you click on the button "Draw rectangle", you see this in the console:
clicked on btn-rect
drawing a circle
(the explanation of why is left to the reader).
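For readers who want to check their answer without a browser, here is a sketch in plain Python. It assumes a handlers dict as a stand-in for the DOM event listeners and a shared_globals dict as a stand-in for the namespace of __main__; both names are made up for illustration.

```python
RECTANGLE_SRC = """
def draw():
    return 'drawing a rectangle'

def on_click():
    return ['clicked on btn-rect', draw()]

handlers['btn-rect'] = on_click
"""

CIRCLE_SRC = """
def draw():
    return 'drawing a circle'

def on_click():
    return ['clicked on btn-circle', draw()]

handlers['btn-circle'] = on_click
"""

handlers = {}                             # stand-in for the DOM listeners
shared_globals = {"handlers": handlers}   # stand-in for __main__.__dict__

# Both files are executed in the same namespace, in page order.
for source in (RECTANGLE_SRC, CIRCLE_SRC):
    exec(source, shared_globals)

# "Click" the rectangle button: the rectangle handler runs, but the name
# draw is looked up at call time and now refers to the circle version.
result = handlers['btn-rect']()
print(result)  # ['clicked on btn-rect', 'drawing a circle']
```

The rectangle handler itself survives because it was stored before being overwritten, but the global name draw it depends on was rebound by the second file.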
Tell me, isn't this very confusing behavior?
And note that it's not even completely invented: this is a pattern which I saw in real world examples of people using <py-script src=...>
, for example here:
https://github.com/nmstoker/ChessMatchViewer
https://github.com/nmstoker/ChessMatchViewer/blob/main/chess_script.py#L143-L151
And here: https://github.com/kolibril13/pyscript-emoji-skimage https://github.com/kolibril13/pyscript-emoji-skimage/blob/main/emoji_playground.py#L147-L153
"Computer Science 101" explanation
The current model completely breaks composability of different .py files. The behavior of a file completely depends on which other files and <py-script> tags were executed before, and it's basically impossible to avoid breaking stuff accidentally.
The first big problem is that we are mixing everything in a single global namespace. The most notable example of a language with this behavior is C, and all modern languages recognized that it's a problem and implemented namespaces in one way or another.
The second big problem is that with two notable exceptions (see below) all languages which I am aware of treat files as a distinct unit of compilation which can be analyzed individually.
Even in C, with its global namespace, you can analyze a single file and know which function calls are resolved internally and which ones depend on external symbols. But apparently PyScript decided that it's smarter than everyone else and that concatenating multiple files into a big unique chunk of code is a good idea :man_shrugging: .
The two notable exceptions (that I am aware of) to this rule are:
- #include in C. But in practice, #include is used only for .h files, so individual files are still analyzable
- <py-script>: everything is in a single global namespace, and the content of each file depends on the content of every other file which happens to be in the page.
Ironically, the JS world quickly recognized that this was a huge problem and came up with multiple solutions to overcome it (e.g., every file contains a big anonymous function which defines its own scope and is called immediately). And then they invented 10 different ways of defining modules. And now they are even in the HTML so you can say <script type="module" src=...>.
In the JS world it took years to fix the original sin of "everything is global". Let's avoid doing the same mistake again.
NOTE: I am 100% aware that we need a way to split behavior into multiple files. But not with this semantics.
This is a fascinating discussion. Thank you so far for such a thoughtful read.
What follows is only a small fragment of what I originally wrote. I had to write a whole bunch of stuff, only to be able to refine my thinking to get to the (much shorter) contribution offered below.
@fpliger's distinction between the problems is spot on: order of execution vs scope of execution.
Let me briefly deal with the first from MicroPyScript's point of view:
Put simply, it's all handled async via messages. When the source code of a script is obtained, either from the innerHTML or by fetching from the URL in src
(incidentally, if the fetch doesn't work... you get a non-200 response..., it raises an exception and everything grinds to a halt, so fundamental is this problematic state of affairs), it dispatches a py-script-loaded
event. If this happens before the runtime is finished loading, the script is put on a pending queue, which is evaluated in order when the py-runtime-ready
event is dispatched. If the runtime is ready and the script's source becomes available afterwards, it's evaluated immediately. It's "just" simple coordination via events.
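As a rough illustration of that coordination, here is a toy model in plain Python. The class and method names are hypothetical and only mirror the two events described above; the real MicroPyScript code is event-driven JavaScript and may differ.

```python
class ScriptCoordinator:
    """Toy model of the event coordination (all names are hypothetical)."""

    def __init__(self):
        self.runtime_ready = False
        self.pending = []     # scripts whose source arrived before the runtime
        self.evaluated = []   # evaluation order, recorded for inspection

    def on_script_loaded(self, source):
        # Corresponds to handling a py-script-loaded event.
        if self.runtime_ready:
            self._evaluate(source)
        else:
            self.pending.append(source)

    def on_runtime_ready(self):
        # Corresponds to handling the py-runtime-ready event:
        # drain the pending queue in arrival order.
        self.runtime_ready = True
        while self.pending:
            self._evaluate(self.pending.pop(0))

    def _evaluate(self, source):
        # A real implementation would hand the source to the interpreter.
        self.evaluated.append(source)


coordinator = ScriptCoordinator()
coordinator.on_script_loaded("inline script")
coordinator.on_script_loaded("fetched script A")
coordinator.on_runtime_ready()
coordinator.on_script_loaded("fetched script B")
```

Note that the order is the order in which sources become available, not the order of the tags in the page, which is the out-of-order problem discussed in this thread.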
But I've encountered doubts about the <py-script>
tag itself. At least, how it currently is.
Before exploring these doubts, it's important to acknowledge that the <py-script>
tag is of fundamental importance to PyScript. It's the "101" first encounter folks have with PyScript and has two very important benefits:
src
attribute.
BUT..!
I think @antocuni eloquently describes an outlook similar to my doubts. In summary:
- How isolated should we run the code in a <py-script> tag? Right now, and perhaps because the runtimes don't support any other way, it's all in the global __main__ scope. This is not a good idea for all the reasons Antonio states.
- The order in which different <py-script> tags are loaded, in both main PyScript and MicroPyScript, currently depends upon how fast we can extract the source. Inline code will run before anything fetched from the network, and anything fetched from the network will be dependent on the vagaries of latency and availability.
Multiple scripts in the __main__ scope is a problem. Therefore (bear with me) we should only allow a single <py-script> tag on any page. Put simply, it is the equivalent of the main function... the single entry point.
Should you require other code, then do the Pythonic thing: stick it into a module, then reference it in the <py-config>
so you can import my_code
in the code contained in the single <py-script>
tag (incidentally, this is why I'm working hard to get file-system support in MicroPython...).
But what about Python fragments in a web page? If you mean you want to scatter your code in different tags, I'd say, you're doing it wrong... scatter it in different modules and do what I suggest via <py-config>
. Alternatively, if you want multiple Python fragments on a web page for "it's a notebook" type reasons, then we should do ourselves a favour and name things properly via an <py-notebook>
tag that is an "or" with <py-script>
(you can have one or the other, but not both on the same page).
To recap, only a single (main) <py-script>
tag. You can't have <py-script>
and <py-notebook>
in the same web page.
How does <py-notebook>
work..? Well, it's basically a REPL session with non-code prose interspersed (and, incidentally, if all you want is a REPL, just use <py-repl>
):
<py-notebook>
<h1>A header</h1>
<p>Some prose</p>
<code>
# The <code> block is rendered with a "run" button, to evaluate the (content-editable=true)
# content of the block.
print("This output is automatically put into a div inserted as a sibling node immediately below this one.")
foo = "bar"
</code>
... etc...
<p>Here's more arbitrary HTML</p>
<code>
# This code is still in the same scope as the previous <code> block.
print(foo)
</code>
</py-notebook>
In this case, the special, and completely understandable, aspect of running multiple <code>
child nodes of a <py-notebook>
tag is that they're all in the same scope (just like a notebook should be). Worth pointing out, just like <py-script>
you can't have more than one <py-notebook>
tag on a page.
These are raw suggestions, and they definitely need refining. However, they solve both the out-of-order execution problem along with the "it's all in __main__
scope" problem (except where this is explicitly a feature of the <py-notebook>
tag).
Also, I realise these are "opinionated" solutions, but this is OK. I think we can all agree the reasons for such "opinionated" ways of working are good ones. For instance, there's a good reason Guido opined that Python should have functions and not GOTO.
As always, I'm not precious about anything I write, and I'm interested to hear your thoughts and refine ideas so we get to the "good place" :tm:.
Thoughts...?
Mmmm... @antocuni @ntoll I feel like we are all thinking about this at different levels and with different use cases in mind.
I honestly don't think the core of the issue itself is related to src
or to the <py-script>
but rather to the namespace. Let me comment on specific points
Tell me, isn't this very confusing behavior?
Yes, it's confusing, but I don't think it's due to the <py-script> tags themselves but to a misuse of them. Basically, they should have been loaded as modules and been accessed as
on_click = create_proxy(circle.on_click)
js.document.getElementById('btn-circle').addEventListener('click', on_click)
and
on_click = create_proxy(rectangle.on_click)
js.document.getElementById('btn-rect').addEventListener('click', on_click)
In 1, 2 or N <py-script>
tags.
A PyScript tag without output
can be used also to couple small pieces of logic with where results will be displayed. Yes, you can also do that by explicitly passing the output to render
but using different pyscript
tags makes it more explicit and my preference for certain users.
More in a bit...
I think @antocuni eloquently describes an outlook similar to my doubts. In summary:
How isolated should we run the code in a <py-script> tag? Right now, and perhaps because the runtimes don't support any other way, it's all in the global __main__ scope. This is not a good idea for all the reasons Antonio states. The order in which different <py-script> tags are loaded, in both main PyScript and MicroPyScript, currently depends upon how fast we can extract the source. Inline code will run before anything fetched from the network, and anything fetched from the network will be dependent on the vagaries of latency and availability.
That's my thinking as well. The problem is related to scopes/namespaces and execution flow, not the src
or the <py-script>
tag itself.
Multiple scripts in the __main__ scope is a problem. Therefore (bear with me) we should only allow a single <py-script> tag on any page. Put simply, it is the equivalent of the main function... the single entry point.
Yup!
But what about Python fragments in a web page? If you mean you want to scatter your code in different tags, I'd say, you're doing it wrong... scatter it in different modules and do what I suggest via <py-config>. Alternatively, if you want multiple Python fragments on a web page for "it's a notebook" type reasons, then we should do ourselves a favour and name things properly via an <py-notebook> tag that is an "or" with <py-script> (you can have one or the other, but not both on the same page). To recap, only a single (main) <py-script> tag. You can't have <py-script> and <py-notebook> in the same web page.
😵‍💫 Ok, you lost me here. I am +1 on the first of your proposal about best practices [that we should promote and try to enforce as much as possible] and all but totally miss the rest. It's definitely not a secret that the vision for PyScript includes being able to run more than one runtime (and eventually different languages) on the same page. Making <py-script> a single instance is a non-solution.
I'd much rather introduce namespaces and define defaults and possibility for customizations that reduce the surface for confusion but at the same time allow users freedom if they know what they are doing.
In this case, the special, and completely understandable, aspect of running multiple <code> child nodes of a <py-notebook> tag is that they're all in the same scope (just like a notebook should be). Worth pointing out, just like <py-script> you can't have more than one <py-notebook> tag on a page.
Ok, let's imagine the scenario where a user has a Scientific dashboard with different simulations/data analysis/models in the same page and as part of their "exploration app" they want to add a "notebook" (which is basically a REPL or collection of REPLs) to allow their users to explore each single viz/model/dataset by allowing them to run snippets that can interact with the data. How should these users do that in a world where there can only be one?
Ok, let's imagine the scenario where a user has a Scientific dashboard with different simulations/data analysis/models in the same page and as part of their "exploration app" they want to add a "notebook" (which is basically a REPL or collection of REPLs) to allow their users to explore each single viz/model/dataset by allowing them to run snippets that can interact with the data. How should these users do that in a world where there can only be one?
That's exactly what I was getting at with my (rough and ready - it needs refining) suggestion for a <py-notebook>
tag. :-)
Great minds think alike...
I may try to bodge something together next week in MicroPyScript to show what I mean.
I honestly don't think the core of the issue itself is related to
src
or to the<py-script>
but rather to the namespace. Let me comment on specific points
The problems I'm complaining about are the result of a combination of src
+ global namespaces, because it means that if you look at a single file .py you cannot reason about it in isolation.
Let's see all possibilities:
- src: the current status
- src: this is better: each tag depends on the other but they are all in the same physical HTML file, so it's easier to reason about
- src: this solves the issue, but if we load a file and execute it in its own namespace, it's essentially a module. Then let's call it by its proper name
- src: my preferred solution.
Tell me, isn't this very confusing behavior?
Yes but it's confusing but I don't think it's due to the
<py-script >
tags themselves but a misuse of them. Basically, they should have been loaded as modules and been accessed as
THANK YOU for explaining the solution to me, I knew that. But the resulting behavior is confusing, impossible to debug, and against everything that people learn in Python courses. We are giving people a gun and asking them not to shoot themselves in the foot.
A PyScript tag without
output
can be used also to couple small pieces of logic with where results will be displayed. Yes, you can also do that by explicitly passing the output torender
but using differentpyscript
tags makes it more explicit and my preference for certain users.
Serious question: do we have any real use case in mind? I have tried to search a bit around for real apps that people have written, and they always use a single <py-script>
(or multiple <py-script src=...>
, but probably without realizing how dangerous it is).
Also, this model has a fundamental drawback:
<py-script>
in the HTML depends on how you want to visualize stuff
I am a bit skeptical that it's actually useful in practice, but I'm happy to be convinced otherwise. @JeffersGlass you have written a lot of examples, so I think that your experience is valuable here.
Ok, you lost me here. I am +1 on the first of your proposal about best practices [that we should promote and try to enforce as much as possible] and all but totally miss the rest. It's definitely not a secret that the vision for PyScript includes being able to run more than one runtime (and eventually different languages) on the same page. Making
<py-script>
a single instance is a non-solution.
this is a good point about having multiple <py-script>
tags. But at the same time, it's also a big point on having each tag in its own namespace, because if you have different runtimes/languages you cannot share them "implicitly" as it's happening now.
@antocuni I should really set up an alert for the word 'namespace' in this Repo
To be honest, I haven't been following this conversation closely to this point - I'll happily digest it and weigh in presently.
I know this is well understood by this group and has already been said, but to spell it out because it will help make my point: there's ongoing confusion for end users, I think, because we have two ways of doing (essentially) the same thing (and because URLs and local file paths look so similar).
The src
attribute as written loads content from a URL:
<!-- copy the contents of a file at the relative URL (and execute it in global namespace) -->
<py-script src="./foo.py"></py-script>
but using import relies on the module being present in the (Emscripten) file system, so we (currently) use paths to fetch() the file and dump it there:
<!-- copy the contents of a file at the relative URL "./foo.py" to a Emscripten local file called "./foo.py" -->
<py-config>
paths=['./foo.py']
</py-config>
<!-- import the MODULE called foo (as found by Python's importers) -->
<py-script>
import foo
</py-script>
Getting the files from the network to the file system is what the <py-config> (previously <py-env>) paths list is meant to accomplish - fetching remote network resources and turning them into "local" files that Python's importers know what to do with. We've mused elsewhere that "paths" is probably not the clearest name for this feature. Perhaps fetch or web_resources_to_fetch_to_EM_filesystem or something.
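To spell out the difference in plain CPython terms, here is a small analogy (a sketch, not what PyScript literally does): the first half treats the fetched text like src, the second half treats it like paths followed by import. The module name foo_demo_module and the temporary directory are assumptions made so the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

FOO_SOURCE = "answer = 42\n"   # pretend this text was fetched from ./foo.py

# 1) src-like: run the fetched text directly in a shared globals dict.
page_globals = {}
exec(FOO_SOURCE, page_globals)

# 2) paths-like: save the fetched text as a local file, then let Python's
#    import machinery find it and load it as a real module.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "foo_demo_module.py"), "w") as f:
    f.write(FOO_SOURCE)
sys.path.insert(0, workdir)
importlib.invalidate_caches()
foo = importlib.import_module("foo_demo_module")

print(page_globals["answer"], foo.answer)
```

In the first case the name answer lands in the shared dict next to everything else; in the second it lives on the module object and has to be reached as foo.answer.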
In these two snippets, './foo.py' means something entirely different. And in a sense, we have two different ways to achieve (almost) the same result:
I think having both is unnecessary, and so...
src = "..."
This is not at all what I thought when I started writing this, but I've been convinced - given the two options above, I think the second (paths+import) is better. Fetching the contents of a web resource, ignoring its file and running the code in the global namespace isn't really something that maps to Python in a clear way.
Personally, I've viewed src= as a convenience feature - "My code is too long for this spot in my HTML, let me move it elsewhere," since code from src functioned the same as inline code. The workflow I see for folks making things in PyScript (both folks in the Discord, and myself) is:
- Start writing Python inline in a <py-script> tag, since it's fast and easy (and fast and easy is a great feature)
- Realize the code's getting longer and could be helped by code-completion/linting/not cluttering up the HTML file
- Move it to its own '.py' file and add a 'src=' reference in the <py-script> tag
Without src="...", the last step gets just a little longer - adding the file name to <py-config> paths and replacing the inline code with import foo.
We could consider adding some kind of attribute to the \<py-script> tag itself to accomplish what paths
does, but let me not get bogged down there. I want to get to multiple tags and namespacing:
If you mean you want to scatter your code in different tags, I'd say, you're doing it wrong...
For the sake of projects like The 7 Guis, the asyncio post, or the rich demos, I'd push back on the idea @ntoll that only one <py-script> or <py-notebook> tag be allowed on a page. Not that those are even particularly complex projects, as far as multi-component web apps go. The point is: locating scripts near the place that they output to the page is really valuable when dealing with a larger document.
Even more so: allowing multiple <py-script> tags on a page allows including <py-script> tags into other components or templates. For example:
<py-repl>
tag for each demo, and destroys it when the next demo starts. These ideas break if we are only allowed one script tag on the page
A PyScript tag without output can be used also to couple small pieces of logic with where results will be displayed. Yes, you can also do that by explicitly passing the output to render but using different pyscript tags makes it more explicit and my preference for certain users.
Do we have any real use case in mind? I have tried to search a bit around for real apps that people have written, and they always use a single <py-script> (or multiple <py-script src=...>, but probably without realizing how dangerous it is).
It me, the use case. The Hugo Shortcode demo I linked above relies on <py-script> tags outputting in place. When you're composing <py-script> tags into components, in a way that you may not know if you can have a unique ID for outputting to a separate location, the ability to output in-place and locate tags at their output is key.
And yeah, src=
is dangerous and it's easy to have variable name collisions. Hence, let's kill src="..."
. (I also wrote the original version of that Emoji Playground example you listed, back before I knew better... that way there be dragons.)
@ntoll Not having a single-entry-point for code is weird, I agree, and it would sure simplify things if we did. But it limits page composition, and the amount to which we can compose PyScript tags into other frameworks.
Allowing multiple tags to have the option to share a non-global namespace would be great - especially when they're outputting "in-place" via display(). Consider:
<!-- Near the top of a component -->
<py-script namespace="math">
# This is the key to this whole component
def do_magic_math(x):
    return (3.14 * x) + (1/234987 + 2**x)
</py-script>
....
<!-- Near a dependent html component -->
<input id="my_input">
<py-script namespace="math">
from js import document
from pyodide.ffi.wrappers import add_event_listener

def magic_update(*args):
    # The input's value is a string, so convert it before doing math
    val = do_magic_math(float(document.getElementById("my_input").value))
    display(f"<h1>{val}</h1>")

add_event_listener(document.getElementById("my_input"), "change", magic_update)
</py-script>
...
<input type="checkbox">
<py-script namespace="math">
# If do_magic_math() < 5 check the checkbox
</py-script>
... #etc
So, I'd propose to revive an idea from a previous discussion on Namespaces, and give <py-script> an attribute (which only applies to inline code, since src= is dead) which executes the enclosed code in the named namespace. If such a name already exists in sys.modules, it is executed in the global namespace of that module; if it doesn't, a module is created by that name, and the code is executed in a new dictionary for that namespace.
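A minimal sketch of those semantics in plain Python, assuming a hypothetical helper run_in_namespace that the runtime would call once per tag (the helper name and the demo_math_ns namespace are made up for illustration):

```python
import sys
import types


def run_in_namespace(name, source):
    """Execute source in the module called name, creating it if needed."""
    module = sys.modules.get(name)
    if module is None:
        # No such module yet: create one and register it under that name.
        module = types.ModuleType(name)
        sys.modules[name] = module
    # Either way, the code runs with the module's dict as its globals.
    exec(source, module.__dict__)
    return module


# Two "tags" that name the same namespace share definitions...
run_in_namespace("demo_math_ns", "def do_magic_math(x):\n    return 3.14 * x\n")
module = run_in_namespace("demo_math_ns", "result = do_magic_math(2)\n")
```

After this runs, module.result holds the value computed by the second tag, and do_magic_math is not defined in the caller's own globals. Because the namespace is registered in sys.modules, other code could also reach it with import demo_math_ns.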
This brings us back to an issue we crashed into last time, which is needing/wanting the contents of pyscript.py
and all its useful PyScript methods to be available in each of these namespaces. We considered adding it to builtins, but I believe where we landed is that the cleanest thing would be to wrap everything in that file into a proper module and allow users to import it (or import it for them @fpliger).
BUT! This additional namespacing for me is a nice to have, and an extension to the above. Anything you can do here, you can do with a little more effort by placing code in external files and importing it or parts of it, this is just a convenience/cleanliness feature, I think. So given it'd require reorganizing pyscript.py, let's not wait on it.
Thanks, @JeffersGlass for the thoughtful comment (as usual). I'm +1 on most of it so I'll comment on the things I'm not really aligned with.
Personally, I've view src= as a convenience feature - "My code is too long for this spot in my HTML, let me move it elsewhere," since code from src functioned the same as inline code. The workflow I see for folks making things in PyScript (both folks in the Discord, and myself) is:
- Start writing Python inline in a <py-script> tag, since it's fast and easy (and fast and easy is a great feature)
- Realize the code's getting longer and could be helped by code-completion/linting/not cluttering up the HTML file
- Move it to its own '.py' file and add a 'src=' reference in the <py-script> tag
I don't think that captures the full essence of why one would use src
. In addition to the above, I'd add:
(I feel there's probably more though)
The idea, since the beginning, was that inline would be an entry point to get users to working code quickly, but then to support and encourage moving their code to src and external files.
And yeah, src= is dangerous and it's easy to have variable name collisions. Hence, let's kill src="...". (I also wrote the original version of that Emoji Playground example you listed, back before I knew better... that way there be dragons.)
I'm not sure I get the idea behind it being considered dangerous. Or, to better put, I think I do get why but think the problem is not src
on its own but the fact that:
<py-script>
tag will effectively result in something comparable to a main entrypoint file being split into multiple chunks
<py-script>
tags
Now, with that said....
> <!-- copy the contents of a file at the relative URL "./foo.py" to a Emscripten local file called "./foo.py" -->
> <py-config>
> paths=['./foo.py']
> </py-config>
>
> <!-- import the MODULE called foo (as found by Python's importers) -->
> <py-script>
> import foo
> </py-script>
>
This is not the solution... cannot be the solution. Really, it feels so unnatural and verbose. We went from 1 line to 6 with that horrible pattern of importing a module just to execute code in a namespace. It'd be (a bit) different if we were actually invoking something after an import but, even then, why? We are adding an anti-pattern just to solve for a (bunch of) bug(s).
My firm belief here is that we won't fix the problem by creating an awkward API but rather by fixing the issue at the root by fixing the execution flow (ensuring we can guarantee the order of execution), deciding if we need to put limitations on the number of py-script
tags per page/namespace/etc.. and by supporting namespaces.
This brings us back to https://github.com/pyscript/pyscript/pull/503#issuecomment-1204435027, which is needing/wanting the contents of pyscript.py and all its useful PyScript methods to be available in each of these namespaces. We considered adding it to builtins, but I believe where we landed is that the cleanest thing would be to wrap everything in that file into a proper module and allow users to import it (or import it for them @fpliger).
Yes but, again, I think that supporting namespaces on py-script tags makes the API (and UX) slightly different than just using Python modules as namespaces.
IIRC, some things might have changed enough since we have the namespaces discussion so that we might find convergence now. (One can hope :) )
BUT! This additional namespacing for me is a nice to have, and an extension to the above. Anything you can do here, you can do with a little more effort by placing code in external files and importing it or parts of it, this is just a convenience/cleanliness feature, I think. So given it's want to reorganize pyscript.py, let's not wait on it.
I both agree and disagree here... While it's true for "developer" users (because they have the tools to understand the difference and do more complex things), convenience/cleanliness is often what makes a Platform/Framework succeed or fail compared to others (especially for non-expert users). (By saying this I'm not saying we should prioritize namespaces before everything else, but I think it's part of a success story)
My firm belief here is that we won't fix the problem by creating an awkward API but rather by fixing the issue at the root by fixing the execution flow (ensuring we can guarantee the order of execution), deciding if we need to put limitations on the number of
py-script
tags per page/namespace/etc.. and by supporting namespaces.
The order of execution is not the core issue here. It's incidental and I had to mention it only to explain why the example was so awkward. Let's forget about it.
The real core issue is that multiple <py-script> tags share the same scope. You cannot have a shared scope and src without causing endless confusion and corner cases. I don't know if I have any veto power, but in case I do I will exercise it to forbid this solution at all costs.
You can have src
if every <py-script>
tag has its own isolated scope. I think this should be the solution, and you also get namespaces almost for free.
(Then, incidentally: the concept of "let's execute this .py file in its own scope" already exists in Python and it's called "module". But I'm fine to call it differently if you are really allergic to this name :man_shrugging: )
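As a rough sketch of what "every tag has its own isolated scope" could look like, here is a plain-Python model. The helper run_tag_isolated, the handlers dict (a stand-in for DOM event listeners) and the tags dict are hypothetical names used only for this illustration.

```python
import types


def run_tag_isolated(tag_name, source, shared):
    """Run one tag's source in its own fresh module (hypothetical helper)."""
    module = types.ModuleType(tag_name)
    module.__dict__.update(shared)   # only explicitly provided names are shared
    exec(source, module.__dict__)
    return module


TEMPLATE = """
def draw():
    return 'drawing a {shape}'

def on_click():
    return draw()

handlers['{shape}'] = on_click
"""

handlers = {}   # stand-in for the DOM event listeners
tags = {}       # keeps one module object per tag
for shape in ("rectangle", "circle"):
    tags[shape] = run_tag_isolated(
        shape, TEMPLATE.format(shape=shape), {"handlers": handlers}
    )

print(handlers["rectangle"](), "/", handlers["circle"]())
```

Each on_click resolves draw in its own module dict, so the two files no longer clobber each other even though they define the same names.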
- testability: it's hard and ugly to test any code that is inlined. In fact, I'd say it's not really testable right now
+1 for this, but note that in order to test them outside a pyscript app, you need to import them as modules. Another hint that probably they are modules ;)
the order of execution is not the core issue here. It's incidental and I had to mention it only to explain why the example was so awkward. Let's forget about it.
Yes and no... not the core issue but it definitely exacerbates it. Let's agree to park it.
The real core issue is that multiple <py-script> tags share the same scope. You cannot have a shared scope and src without causing endless confusions and corner cases. I don't know if I have any veto power, but in case I do I will exercise it to forbid this solution at all costs.
I'm glad you are proposing a compromise solution next because I think we'd be vetoing each other to death on this one lol
You can have src if every <py-script> tag has its own isolated scope. I think this should be the solution, and you also get namespaces almost for free.
+1 on this. I think it's probably the only way we can converge. I'd also suggest that, if we all agree on this and start on this direction, we start simple and small (the feature is just that: Each one in their own namespace, namespaces can't access each other, 1 per tag, etc..), and add features on top of it as we go and find need for it.
(Then, incidentally: the concept of "let's execute this .py file in its own scope" already exists in Python and it's called "module". But I'm fine to call it differently if you are really allergic to this name 🤷‍♂️ )
We had this conversation before. I really believe this is not true (it is from a technical point of view but not from a UX one). It's like saying "everything is a dict in Python": while it's true and you can do a lot once you realize this, it's not something users need to know and use, nor is it the reason people love Python.
I see it more like different entry points (to different processes) that are part of the same application and [in some occasions] that may need to access each other/share data. They serve 2 different purposes. That's why I don't like the name, it's misleading, imho.
You can have src if every <py-script> tag has its own isolated scope. I think this should be the solution, and you also get namespaces almost for free.
+1 on this. I think it's probably the only way we can converge. I'd also suggest that, if we all agree on this and start on this direction, we start simple and small (the feature is just that: Each one in their own namespace, namespaces can't access each other, 1 per tag, etc..), and add features on top of it as we go and find need for it.
+1 on this as well - as you say @fpliger - it's probably right that we give it a try and build on it as needed.
And as we noted in #503, there's likely to always be some way to access global scope in a pinch (my_var = js.pyscript.runtimes.globals.get('a_global_var') springs to mind), so truly daring devs who are willing to accept the risks can push the envelope if need be.
+1 on this. I think it's probably the only way we can converge. I'd also suggest that, if we all agree on this and start on this direction, we start simple and small (the feature is just that: Each one in their own namespace, namespaces can't access each other, 1 per tag, etc..), and add features on top of it as we go and find need for it.
ok, works for me! To be 100% sure that we are on the same page, this is different from the original "namespace proposal" which we discussed a long time ago in #503. The old proposal was "global namespace by default, opt-in for private namespace". Here we are going "private namespace by default, no way to access other namespaces". In other words:
<py-script>
x = 42
</py-script>
<py-script>
print(x) # NameError
</py-script>
Did I understand correctly?
That's what I understand, we're proposing, yeah.
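For readers following along, here is a minimal sketch of those semantics in plain Python rather than PyScript. It assumes each tag body is simply run with exec() against its own fresh dict; the helper name run_tag is hypothetical and this is not a claim about how PyScript implements it.

```python
def run_tag(source):
    """Run one tag body in its own private namespace and return that namespace."""
    namespace = {}  # fresh globals, used by this tag only (assumption of the sketch)
    exec(source, namespace)
    return namespace


# First tag defines x in its own namespace.
first = run_tag("x = 42")

# Second tag gets a different namespace, so x is not visible there.
try:
    run_tag("print(x)")
    outcome = "no error"
except NameError as exc:
    outcome = f"NameError: {exc}"

print(first["x"])
print(outcome)
```

The point is only that the second body fails with NameError because nothing is shared between the two dicts.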
To be clear, I would still like a way for tags to ultimately share namespaces... but I think for the sake of being able to move forward, that should be a feature that gets added on once the bones of this proposal are in place. I'll say my piece and be done.
The use case for me personally, for tags to be able to share a namespace (whether they use src= or inline code) is explanatory blog posts about Python/PyScript. They often are structured, at least in part, as follows: (The {{}} notation is shorthand for "load the content of a file and display it as code"; I use Hugo personally but it could be any templating system)
<p>Start by doing this thing:</p>
{{ content from step1.py }}
<py-script src="step1.py"></py-script>
<p>Now, do this next step</p>
{{ content from step2.py }}
<py-script src="step2.py"></py-script>
<p>And finally, do this</p>
{{ content from step3.py }}
<py-script src="step3.py"></py-script>
<p> As you can see, this process works</p>
My rich demo post works like this, as do parts of this post on JS object creation, and I have two in the works now, on FileIO and on developing package patches for Pyodide.
The ability for tags to share a namespace (currently, the global namespace, which I acknowledge is bad) allows for breaking the code down into files by chunks that make sense in terms of their intended use on the page. And yes, you cannot reason about them individually as Python files, but personally that matters less than the code as displayed and as run being identical because they reference the same source.
As I've said elsewhere, though, my use cases tend to be more "Using PyScript to talk about Python/PyScript" rather than "Use PyScript to do things on a website," so my usage may not be the target one.
As I've said elsewhere, though, my use cases tend to be more "Using PyScript to talk about Python/PyScript" rather than "Use PyScript to do things on a website," so my usage may not be the target one.
I think you have a point here, and your case is probably perfectly valid: in this use case, you are basically mixing code and text into a unique flow, which is something you cannot normally do but which becomes very easy and natural in the context of a web page. I think we should fully support it.
What is the best way to support this use case, I don't know. One possibility, as you suggest, is to make it possible to share the same namespace across multiple py-script tags. Another is more similar to what @ntoll suggested earlier in this conversation, i.e. to have a <py-notebook> tag which does exactly that.
So, I don't know how far along you might be in implementing these ideas, and I don't want to get in the way of anything. But here's a thought.
What we had last converged on is that by default, each <py-script> tag has its own scope/namespace, yes? I'm still onboard with that. So each time we hit a new <py-script> tag, we'll need to create a new dict() to use as that namespace's globals, and 'initialize' it. (Right now, probably just run the contents of "pyscript.py" in that namespace; maybe later we import from a module).
Let me assume for a sec these namespaces have unique names, and that we store them in some kind of mapping, either on the TS side or the Python side: namespace_collection = {'first': { global objects from first tag }, 'second': { global objects from second tag }} etc.
What about an attribute <py-script namespace="...">? Here's some pseudocode:
// pyodide.ts (pseudocode)
// New function: run code using the given Python dict as globals
function runInNamespace(code: string, namespace_globals) {
    this.interpreter.runPython(code, { globals: namespace_globals });
}
function run(code: string) {
    if (this.hasAttribute('namespace') && this.getAttribute('namespace') in namespace_collection) {
        // run the code using the existing namespace as globals
        this.runInNamespace(code, namespace_collection[this.getAttribute('namespace')]);
    }
    else {
        // need to initialize a new namespace
        let new_namespace_name;
        if (this.hasAttribute('namespace')) {
            // use the name requested by the tag
            new_namespace_name = this.getAttribute('namespace');
        }
        else {
            new_namespace_name = ?????;
        }
        const new_namespace_globals = this.runtime.get('dict')();
        namespace_collection[new_namespace_name] = new_namespace_globals;
        // initialize the namespace with the contents of pyscript.py
        this.runInNamespace(pyscript as string, new_namespace_globals);
        // and any other init steps, like runtime.run('set_version...')
        this.runInNamespace(code, new_namespace_globals);
    }
}
If each tag is to have its own namespace by default, then "?????" is ... a GUID? Or maaaaybe something derived from src, but since src is a URL and not necessarily a file name I'm wary about that...
(If each tag were to default to the same namespace, "?????" would be "__main" or "__default_namespace" or some constant. But I think we've moved away from that.)
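To make the idea a bit more concrete, here is a rough sketch of the same logic on the Python side. It assumes the "GUID" option for unnamed tags and skips the initialization step (running pyscript.py in the new namespace); namespace_collection and run_in_namespace are illustrative names, not existing PyScript APIs.

```python
import uuid

# Hypothetical registry: namespace name -> dict used as globals for that namespace
namespace_collection = {}


def run_in_namespace(code, name=None):
    """Run code in the named namespace, creating the namespace on first use.

    When no name is given, a unique name is generated, so the code ends up
    in a private namespace that no other tag can ask for by name.
    """
    if name is None:
        name = f"anonymous-{uuid.uuid4()}"
    namespace = namespace_collection.setdefault(name, {})
    exec(code, namespace)
    return name


run_in_namespace("x = 1", name="first_demo")
run_in_namespace("y = x + 1", name="first_demo")  # sees x from the previous call
private = run_in_namespace("x = 'isolated'")      # unnamed: gets its own namespace
```

Two calls with the same name share state, while the unnamed call leaves first_demo untouched.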
So the example above becomes, with extension:
<!--------- First Section --------->
<p>Start by doing this thing:</p>
{{ content from step1.py }}
<py-script src="step1.py" namespace="first_demo"></py-script>
<p>And now do this thing</p>
{{ content from step2.py }}
<py-script src="step2.py" namespace="first_demo"></py-script>
<!--------- Second Section --------->
<p>Let's look at something else now:</p>
{{ content from secondDemo1.py }}
<py-script src="secondDemo1.py" namespace="second_demo"></py-script>
<p>And with that something else, we can also do this:</p>
{{ content from secondDemo2.py }}
<py-script src="secondDemo2.py" namespace="second_demo"></py-script>
Eh? As a place to start? I've made quite a few assumptions here, though, so this could be entirely off base.
As you've very helpfully illustrated before, creating a dict with a name and using it as a global dictionary for some code is basically reinventing the idea of modules. Honestly I'm neutral on whether we actually create modules using this logic or not. If we do create them as modules, though, using module=... as an attribute could make some sense.
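For what it's worth, the "named dict of globals is basically a module" observation can be sketched with the standard library alone. This assumes we build the module object by hand with types.ModuleType and register it in sys.modules; run_as_module and the name first_demo are made up for the example.

```python
import sys
import types


def run_as_module(name, source):
    """Execute source inside a new module object and register it in sys.modules."""
    module = types.ModuleType(name)
    sys.modules[name] = module
    # The module's __dict__ plays the role of the per-tag globals dict.
    exec(source, module.__dict__)
    return module


run_as_module("first_demo", "greeting = 'hello'")

# Other code can now reach the namespace through the normal import machinery.
import first_demo

imported_greeting = first_demo.greeting
```

Done this way, sharing between tags would be an explicit import rather than an implicit shared scope.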
As there's no label in this and it's from 2022 ... I think we can close this.
The current behavior of <py-script src=...> is confusing and suboptimal IMHO. It is implemented in this way: https://github.com/pyscript/pyscript/blob/214e39537bf18e1bec65153fdaa2fce355999693/pyscriptjs/src/components/pyscript.ts#L19-L31 i.e., it fetches the URL and just executes it in the global namespace. But it has many problems:
since we are using await fetch(), the code might be executed out of order w.r.t. the py-script tags which are inline. Consider e.g. this test:
If I run the test on my machine, I get the two prints in a random order:
But if I uncomment the line which makes foo artificially bigger, I get this consistently, because the file takes longer to download:
This is the latest new entry in the list of problems caused by the fact that we use runPythonAsync instead of runPython.
Related: https://github.com/pyscript/pyscript/issues/878 and https://github.com/pyscript/pyscript/issues/879
config.paths. For example, consider the following test (the asyncio.sleep() calls are needed to work around the previous problem):
This is what you get:
This happens because the code inside foo.py is actually executed twice: with src="foo.py" we execute it in the global namespace, then with import we execute it again in the proper module. So we get two copies of X and say_hello(), each working independently of each other.
This behavior is very confusing unless you know very well the internals of Python, and we should avoid it at all costs.
To underline how confusing it is, we even made a mistake in our own docs 😱 https://github.com/pyscript/pyscript/blob/214e39537bf18e1bec65153fdaa2fce355999693/docs/reference/elements/py-script.md#L37-L53
in the docs above the author felt the need to add compute_pi.py to config.paths, but it's not really needed and the file will be actually downloaded twice.
Proposals for solution
We should avoid exec()uting in the global namespace Python code which comes from a file. This is very confusing, it plays very badly with the Python import system and it generates a lot of unexpected behavior.
Corollary of the previous point: the supported/encouraged way to execute external Python code is to use import (either implicitly or explicitly, see below). This means that the external .py file needs to be fetched separately and saved to the virtual FS.
We need to decide how it interacts with config.paths and decide whether we want to provide special syntax for it or not.
Proposal 1: just kill src
We don't really need it, it is possible to achieve the desired result in this way:
Simple, effective, very explicit, works out of the box.
Proposal 2: kill src but add an import (or py-import?) attribute
This would be the equivalent to the previous example:
Proposal 3: automatically add imports to paths
Similar to proposal (2) but you don't need to explicitly add paths=[...]
This is by far my least favorite, because it opens many questions (e.g., if I do import="foo.py" does it mean that I want to download and import ./foo.py or that I want to import the already-installed py module from the foo package?). Moreover it complicates the implementation because we would need to search for import attributes when we download the other paths, etc.
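To spell out the ambiguity, here is a tiny sketch of the two readings of such an attribute value. The function possible_meanings and the dict keys are hypothetical; nothing like this exists in PyScript.

```python
def possible_meanings(value):
    """Return the two plausible readings of an import="..." attribute value."""
    # Reading 1: a relative URL to download; the module name is the file stem.
    as_file = {"fetch": "./" + value, "module": value.rsplit(".", 1)[0]}
    # Reading 2: a dotted name of a module that is already installed.
    as_dotted_name = {"fetch": None, "module": value}
    return as_file, as_dotted_name


as_file, as_dotted_name = possible_meanings("foo.py")
print(as_file)
print(as_dotted_name)
```

Both readings are valid for the same string, so the implementation would have to pick one by convention.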