grumpyhome / grumpy

Grumpy is a Python to Go source code transcompiler and runtime.
Apache License 2.0
420 stars 18 forks source link

'print' statement ignores 'sys.stdout' #39

Open alanjds opened 6 years ago

alanjds commented 6 years ago

google/grumpy#290 opened by @alanjds on 22 Apr 2017

Replacing sys.stdout have no effect over print statement.

This test script outputs AssertionError: 0 chars printed, instead of 3, after printing 'foo' in the stdout.

import StringIO
import sys

sio = StringIO()
stdout = sys.stdout
sys.stdout = sio

print 'foo'

sys.stdout = stdout

chars = sio.tell()
assert chars == 3, '%s chars printed, instead of 3' % chars
alanjds commented 6 years ago

Comment by trotterdylan Saturday Apr 22, 2017 at 22:41 GMT


Effectively, the sys module does this:

from __go__.grumpy import Stdout

stdout = Stdout

The Print() function in the Go runtime writes directly to grumpy.Stdout. It does not get it from the sys module. So things work fine until you re-bind sys.stdout, as you've found.

There are a couple ways to fix this:

  1. Write the sys module in Go and have the grumpy package import it at package init time so that the Print() function can access the sys module and write to the stdout attribute instead of a global Go variable. This is close to how CPython works. However, I think it would require adding some special casing into the import system, which could get ugly.

  2. Make the sys module an instance of some class where stdout is a property whose setter updates the value of Stdout in the grumpy runtime package. Because there's no way to assign to Go package variables from Python at the moment, this would require adding a SetStdout() function to the grumpy runtime package, which feels a little hacky.

In short, there's no great way to fix it at the moment, however, I think some avenues will open up once Grumpy is self-hosted.

alanjds commented 6 years ago

Comment by alanjds Sunday Apr 23, 2017 at 03:57 GMT


What about implementing the print statement as a pure-python function, and hook the golang Print function to the python one?

I will need to reimplement the print stuff very soon anyway...

alanjds commented 6 years ago

Comment by trotterdylan Sunday Apr 23, 2017 at 16:18 GMT


If, for example, you define print() in __builtins__.py then you have the problem of accessing the __builtins__ Go package from within the grumpy runtime package, because they each would depend on each other.

alanjds commented 6 years ago

Comment by alanjds Sunday Apr 23, 2017 at 18:57 GMT


I am thinking about defining some _print_statement.py and hooking Print() to use it.

Ideally, it should import only sys.stdout

I did not yet took a look there to see what the Print() imports

Il 23 apr 2017 1:18 PM, "Dylan Trotter" notifications@github.com ha scritto:

If, for example, you define print() in builtins.py then you have the problem of accessing the builtins Go package from within the grumpy runtime package, because they each would depend on each other.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/grumpy/issues/290#issuecomment-296454318, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJf5ywht3Zc_SoCZu71nbK2xoVoBm7Dks5ry3m_gaJpZM4NFOm8 .

alanjds commented 6 years ago

Comment by trotterdylan Sunday Apr 23, 2017 at 19:13 GMT


I think the fundamental issue you're doing to run into is circular dependencies between the grumpy runtime and the packages generated from the Python module you write. On Sun, Apr 23, 2017 at 11:57 AM Alan Justino da Silva < notifications@github.com> wrote:

I am thinking about defining some _print_statement.py and hooking Print() to use it.

Ideally, it should import only sys.stdout

I did not yet took a look there to see what the Print() imports

Il 23 apr 2017 1:18 PM, "Dylan Trotter" notifications@github.com ha scritto:

If, for example, you define print() in builtins.py then you have the problem of accessing the builtins Go package from within the grumpy runtime package, because they each would depend on each other.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/grumpy/issues/290#issuecomment-296454318, or mute the thread < https://github.com/notifications/unsubscribe-auth/AAJf5ywht3Zc_SoCZu71nbK2xoVoBm7Dks5ry3m_gaJpZM4NFOm8

.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/google/grumpy/issues/290#issuecomment-296479478, or mute the thread https://github.com/notifications/unsubscribe-auth/AAHLexHc0dg-_mbK7ycf_BujVrHN6Psdks5ry58EgaJpZM4NFOm8 .

alanjds commented 6 years ago

Comment by alanjds Sunday Apr 23, 2017 at 22:20 GMT


I just found the Print() function, and it should not rely on 'file' arg being a File anyway, otherwise "print" will never accept a duck-typed file-like, in my opinion.

For now, my plan is to factor it out from builtins, even if it is always loaded

Will draw some sketches on dead-trees to help with the circular dependencies. Maybe lazy-importing sys.Stdout, with a local cache to not penalize the most used case. I like it not, however.

alanjds commented 6 years ago

Comment by trotterdylan Sunday Apr 23, 2017 at 22:22 GMT


+1 to taking any object instead of a file. Looking forward to seeing what you come up with. On Sun, Apr 23, 2017 at 3:20 PM Alan Justino da Silva < notifications@github.com> wrote:

I just found the Print() function https://github.com/google/grumpy/blob/91ffc9fd18ee51cce79d2930e6850e284c3cd721/runtime/core.go#L1261, and it should not rely on 'file' arg being a File anyway, otherwise "print" will never accept a duck-typed file-like, in my opinion.

For now, my plan is to factor it out from builtins, even if it is always loaded

Will draw some sketches on dead-trees to help with the circular dependencies. Maybe lazy-importing sys.Stdout, with a local cache to not penalize the most used case. I like it not, however.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/google/grumpy/issues/290#issuecomment-296493024, or mute the thread https://github.com/notifications/unsubscribe-auth/AAHLe2DGzZa32iGJWegwGVKaV1agVe62ks5ry86YgaJpZM4NFOm8 .

alanjds commented 6 years ago

Comment by cclauss Monday Apr 24, 2017 at 08:03 GMT


Are there a test cases for Python scripts that start with from future import print_function? That import makes the printer statement, the >> and the trailing comma all invalid. It also enables the end function parameter.

alanjds commented 6 years ago

Comment by alanjds Monday Apr 24, 2017 at 14:11 GMT


@cclauss Yeah, but is a flag already handled by the compiler. I remember to saw some ifs in the compiler code about this. Print from "print statement" and from "print function" are different code paths, per file compiled, IIUC.

alanjds commented 6 years ago

Comment by alanjds Monday Apr 24, 2017 at 14:29 GMT


@trotterdylan: Thinking in Print() check for 'sys' in SysModules and, if found, use 'sys.stdout'. If not found, fallback to Stdout as the 'sys' was not imported and sys.stdout is sure to be Stdout anyway.

alanjds commented 6 years ago

Comment by trotterdylan Tuesday Apr 25, 2017 at 04:42 GMT


Although that solution should work (if sys was not imported, then there's no way for someone to override it from Python), it's not very satisfying IMO. It's pretty easy to imagine a scenario where this logic would diverge from CPython (probably involving messing around with sys.modules). Although these scenarios are far fetched, I'm more worried that this implementation divergence from CPython will cause other, more subtle problems. I think the right approach is to make the sys module "built in" in a similar way to CPython.

A variation on option 1. above is to do exactly what CPython does and define the sys module within the runtime itself: https://github.com/python/cpython/blob/2.7/Python/sysmodule.c

This would still require some changes to the import system to special case sys, but it's probably the best option we have available in the short term.

alanjds commented 6 years ago

Comment by corona10 Tuesday Apr 25, 2017 at 07:04 GMT


@trotterdylan I definitely agree with your idea. If we need modules of python which are implemented by C-level API. At that case, We should implement it by Go also.

alanjds commented 6 years ago

Comment by alanjds Tuesday Apr 25, 2017 at 15:08 GMT


I do not know if I like this path.

I remember of news of some conference (could not find the reference, sorry) where people of different Python "runtimes" including CPython agreed that the best would be to write the stdlib in pure-python if possible, falling back to C-API only for serious speed needs or low-level access needs.

In the long term, do you expect to be easier to maintain a pure-python sys or a Go sys? For now, the pure-python code have 73 LOC. The CPython one have 1801 LOC. And I do not see a huge speed improvement on this path.

Another point is that, as we are borrowing PyPy and Ouroboros code, they could take it back and share the maintenance burden with us, but only if our modules are useful to him.

For now, my view of the thing is to implement whatever possible in pure-python (will be transpiled to Go anyway...), not worrying with implementation divergence from CPython. The result could be simpler and easier to maintain. And more useful to other pals too (PyPy for exemple).

alanjds commented 6 years ago

Comment by alanjds Tuesday Apr 25, 2017 at 15:12 GMT


Be clear that it is my philosophical perspective, more aligned with PyPy "view". But a view like "lets transpile CPython code to Go, whatever it is a good code or not" will sure work too, and will easily emulate the goods and bads there. Including maintenance ones.

alanjds commented 6 years ago

Comment by alanjds Tuesday Apr 25, 2017 at 15:19 GMT


Please confirm your spirits. I can wait or can give a shot, even if to change it later. just confirm.

alanjds commented 6 years ago

Comment by trotterdylan Tuesday Apr 25, 2017 at 15:53 GMT


I agree with the spirit of your argument, but our sys.py would not be portable Python anyway, since it has to use __go__ imports to function properly. The sys module is kind of special in that it is tightly integrated with the interpreter state (default encoding, sys.modules, stdin/out/err, etc.)

That said, I believe that with more advanced native integration we can achieve what we want here while keeping the sys module in Python. Option 2. above is a start in this direction. Once we can assign to Go package variables from Python, we could get rid of the SetStd() functions in the grumpy package and assign directly to the handles in the Go os package. That's a really nice end state because it means that Go and Python code will share the same std handles.

alanjds commented 6 years ago

Comment by alanjds Tuesday Apr 25, 2017 at 17:09 GMT


our sys.py would not be portable Python anyway, since it has to use go imports to function properly

PyPy solved it by having non-portable modules "native", something alike:

# sys.py (portable)
from _sys import stdout

# _sys.py (non-portable)
from __go__.os import Stdout
from __go__.grumpy import NewFileFromFD
stdout = NewFileFromFD(Stdout)

Now going on Option 2) variant...

Once we can assign to Go package variables from Python...

I do not see a good interface for it to happen. Go variables are not ObjectType instances. The far I can see is Grumpy package variables being assigned from Python, as you said.

import __go__.os as go_os
import __go__.grumpy.sys as grumpy_sys

go_os.Stdout = StringIO()   # Will never work I guess?
grumpy_sys.Stdout = StringIO()   # May work.

This way, everything in Grumpy (including Print()) should use grumpy.sys.Stdout, and users would overwrite grumpy.sys.Stdout. Still do not know how to proxy changes to sys.stdout to grumpy.sys.Stdout. Some hack would be needed.

alanjds commented 6 years ago

Comment by alanjds Wednesday Apr 26, 2017 at 01:54 GMT


For "some hack", I tried to overwrite setattr of a module in CPython and it let me but have no effect in attribution. Seems as a limitation of CPython that Grumpy could ignore and be more flexible.

alanjds commented 6 years ago

Comment by trotterdylan Wednesday Apr 26, 2017 at 04:08 GMT


Re: go_os.Stdout = StringIO()

Yes, this is a tricky case that you'd probably need to jump through hoops to make work with os.Pipe() or something like that. That perhaps creates more problems than it solves.

Re: __setattr__ for modules:

One approach that should work is to a) define a descriptor (i.e. a property) for std* attributes and b) swap out sys.modules['sys'] with a custom class. Something along these lines:

# sys.py
from __go__.grumpy import GetStdout, SetStdout, SysModules

class _SysModule(object):
  def _get_stdout(self):
    return GetStdout()
  def _set_stdout(self, f):
    SetStdout(f)
  stdout = property(_get_stdout, _set_stdout)

modules = SysModules
modules['sys'] = _SysModule()

Then basically everything in the sys module would have to live on the _SysModule class.

alanjds commented 6 years ago

Comment by alanjds Wednesday Apr 26, 2017 at 17:20 GMT


...And that is the hack we need to grumpy.sys map sys.py !!

But after attr_mode:rw got merged, GetStdout/SetStdout would become not needed. Or even better, maybe a __dict__ on grumpy.sys would allow this (and any) attributions transparently.

# sys.py
from __go__.grumpy import SysDict, SysModules

__all__ = ['every', 'func', 'implemented', 'in', 'python']

class _SysModule(object):
  def __init__(self):
    for k in __all__:
      SysDict[k] = locals()[k]
  def __setattr__(self, name, value):
    SysDict[name] = value
  def __getattr__(self, name):
    return SysDict.get(name)

modules = SysModules
modules['sys'] = _SysModule()
alanjds commented 6 years ago

Comment by trotterdylan Thursday Apr 27, 2017 at 03:06 GMT


Cool, yeah, something like that would work.

I wonder though whether this is getting more hacky than just implementing the whole thing in Go. The Python code in sys.py is pretty far from portable at that point.

alanjds commented 6 years ago

Comment by alanjds Thursday Apr 27, 2017 at 14:40 GMT


I could accept a 10-line hack for every hybrid module in the core... Maybe factored out to the importer if occur too often.

Will give a shot and you can decline the PR later if it gets too messy, ok?

alanjds commented 6 years ago

Comment by trotterdylan Thursday Apr 27, 2017 at 15:26 GMT


Sounds good. Looking forward to seeing how it turns out. On Thu, Apr 27, 2017 at 7:40 AM Alan Justino da Silva < notifications@github.com> wrote:

I could accept a 10-line hack for every hybrid module in the core...

Will give a shot and you can decline the PR later if it gets too messy, ok?

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/google/grumpy/issues/290#issuecomment-297733523, or mute the thread https://github.com/notifications/unsubscribe-auth/AAHLe56GYvdz0apbYZUuK690OBm4HbZDks5r0KjdgaJpZM4NFOm8 .

alanjds commented 6 years ago

Comment by alanjds Friday Apr 28, 2017 at 02:31 GMT


https://github.com/google/grumpy/blob/master/runtime/object.go#L127

Let me guess: __getattr__ and __setattr__ support are yet to be implemented.

Will open another issue/branch and freeze this one.

alanjds commented 6 years ago

Comment by trotterdylan Friday Apr 28, 2017 at 05:23 GMT


You are correct. Those have not yet been implemented. It should be pretty straightforward to add though.

alanjds commented 6 years ago

Comment by alanjds Friday Apr 28, 2017 at 07:56 GMT


(after send a WIP PR here)

alanjds commented 6 years ago

Comment by alanjds Friday Apr 28, 2017 at 07:57 GMT


Yes. Will just do in another branch/issue

alanjds commented 6 years ago

Comment by alanjds Saturday Apr 29, 2017 at 20:53 GMT


...and what is my surprise to discover that __getattribute__ is indeed implemented! Changed to use it instead of __getattr__ and it worked.

Now will continue the fix from the google/grumpy#302 base branch.

alanjds commented 6 years ago

Comment by alanjds Monday May 01, 2017 at 19:13 GMT


Fix proposal: google/grumpy#304.

Needs google/grumpy#302 (Replaceable sys.stdout via SysmoduleDict) merged first.