Closed dbieber closed 4 years ago
Agreed that this is an important use case! I'm currently working on a solution for this, based on discussions in #5 and #6, see https://github.com/hauntsaninja/pyp/pull/5#issuecomment-626389709
Oh neat! That's a really cool way to satisfy this use case. Looking forward to seeing what you come up with.
Hmm... One drawback with the config file idea however is that imports take time, sometimes quite substantial time.
For example, Tensorflow takes 6s to import on my machine :/.
> time pyp 'tensorflow'
<module 'tensorflow' from '/Users/dbieber/.virtualenvs/_3/lib/python3.7/site-packages/tensorflow/__init__.py'>
real 0m6.025s
user 0m2.326s
sys 0m0.967s
> time pyp '1' # As a baseline
1
real 0m0.129s
user 0m0.064s
sys 0m0.050s
If the aliases are implemented as a core feature, then these import costs only need to be paid when the libraries are used. However if the config is run with every invocation of pyp, then all usage of pyp gets slowed substantially if I include the tensorflow import in the config.
Agreed.
The plan is to treat a config file not as Python to be unconditionally executed, but as a source of definitions of names. So we'll statically analyse the config file, see that import tensorflow as tf
defines "tf", and only include it in the code we execute / the script we print if we end up needing tf to be defined.
One exception to this is if you do from tensorflow import *
in your config, in which case if we have missing names we need to define, there's no way to avoid importing tensorflow to see if that statement would define them.
One exception to this is if you do from tensorflow import * in your config, in which case if we have missing names we need to define, there's no way to avoid importing tensorflow to see if that statement would define them.
When I was working on this code last night, I was thinking that when we hit these import *
s in the config file, we ought to time them, and then warn or something if they're slower than some threshold. Or potentially disallow... It's already a subset of python syntax that's going to be supported, so we don't necessarily need to be as general as possible if we think it's going to create too many foot-guns.
I would support disabling import *
.
Do you have a plan for how to handle name collisions?
You'll need such a plan even without import *
, but it's significantly more important if you allow import *
.
The reasons you'll need to handle name collisions even without import *
are:
import numpy as np
and import jax.numpy as np
(in which case you could take the one that occurs later in the config, like Python does -- and warn?)import numpy as stdin
, clobbering one of the builtin names (disallow and warn in this case?)I suppose the natural thing to do (with or without import *) is to take the definition occurring later in the config, as iiuc Python does.)
I've put up an implementation at #13 and I'd love feedback!
What I've got:
pyp x[:3]
won't import stuff--explain
, so it should always be easy to see what's going onI'd like to support wildcard imports; they enable use cases like #8 well. If there are performance problems for some modules, the workaround is in any case what you'd be forced to do in the absence of wildcard support.
Thanks for opening the issue and discussing the details.
I just cut a release with an implementation of configuration, so pip install --upgrade pypyp
and this should be possible!
I also updated the README, here's the relevant section: https://github.com/hauntsaninja/pyp#pyp-is-configurable
Here's the complete changelog: https://github.com/hauntsaninja/pyp/blob/master/CHANGELOG.md#v03
Try it out and let me know if you have thoughts! I'm closing this for now, but feel free to reopen if you have feedback
In Python there are several standard import aliases. E.g.
import numpy as np
,import networkx as nx
,import tensorflow as tf
, andfrom matplotlib import pyplot as plt
. I think it would be a great addition to pyp to support these import aliases. This will make using pyp for e.g. numpy processing more pleasant.To be concrete, currently
pyp 'numpy.add(1, 2)'
works butpyp 'np.add(1, 2)'
does not, and I am suggesting adding support for the latter.