google / grumpy

Grumpy is a Python to Go source code transcompiler and runtime.
Apache License 2.0

Make Grumpy compatible with standard python projects #43

Open kofalt opened 7 years ago

kofalt commented 7 years ago

There wasn't an explicit ticket for this, so here goes :grinning:

It would be great if Grumpy could digest a python project and spit out a Go project. I think this would make Grumpy a lot more usable in practice on existing codebases.

This means a couple things, off the top of my head:

Might be more, let me know what you think.

It seems to me that a ticket like this could use a concrete example and target. How about PyMySQL? PyMySQL is a popular, small (5.6k LOC) project that is explicitly pure-python. I had a gander through their import statements and everything seemed pretty reasonable. Notably, there are a few instances where they snag some small things from __future__, so that might need some implementation work. I only saw print_function and absolute_import, which both sound straightforward.

I think this ticket would be satisfied by something like this:

git clone https://github.com/PyMySQL/PyMySQL

git clone https://github.com/google/grumpy
( cd grumpy; make )

grumpy/tools/grumpc --read PyMySQL --save $GOPATH/src/github.com/PyMySQL/PyMySQL

cd $GOPATH/src/github.com/PyMySQL/PyMySQL
go build

And now you have a Go library that you could import and use (it doesn't look like PyMySQL has a CLI). This is just an example though, maybe there's a better litmus test to work towards.

Relates #5, #11, #23, possibly others.

Thoughts? :smile:

trotterdylan commented 7 years ago

That's a very cool idea. I have two questions.

  1. What should Grumpy assume is in the PyMySQL dir? Should we assume subdirs with __init__.py are root packages and traverse them for all the .py files? That would seem to work for PyMySQL which has pymysql/__init__.py. Would it work for most libraries?

  2. Currently the Python import pymysql would look for the Go package grumpy/lib/pymysql. Thus your output directory $GOPATH/src/github.com/PyMySQL/PyMySQL won't work as you've described it without some changes. Does it make more sense to merge all the Go packages into one dir like grumpy/lib? Or should grumpc have some configuration so that, when generating the .go files, it maps import pymysql to the Go package path github.com/PyMySQL/PyMySQL?
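To make the first question concrete, here is a minimal sketch of the traversal it describes: treat a subdirectory as a package only if it contains __init__.py, and collect the .py files under it. The function name find_package_sources is hypothetical and is not part of Grumpy; it only illustrates the rule being asked about.

```python
import os


def find_package_sources(project_dir):
    """Map dotted package names to the .py files they contain.

    Hypothetical helper, not a Grumpy API. A directory is treated as a
    package only if it and every directory between it and project_dir
    contain an __init__.py file.
    """
    packages = {}
    for dirpath, dirnames, filenames in os.walk(project_dir):
        if dirpath == project_dir:
            # Top level: only descend into subdirectories that are packages.
            dirnames[:] = [
                d for d in dirnames
                if os.path.isfile(os.path.join(dirpath, d, "__init__.py"))
            ]
            continue
        if "__init__.py" not in filenames:
            # Not a package, so nothing below it is importable either.
            dirnames[:] = []
            continue
        name = os.path.relpath(dirpath, project_dir).replace(os.sep, ".")
        packages[name] = sorted(
            os.path.join(dirpath, f) for f in filenames if f.endswith(".py")
        )
    return packages
```

For a PyMySQL checkout this would pick up pymysql and its subpackages while skipping directories such as docs that have no __init__.py. Projects that rely on namespace packages or a src/ layout would need a different rule.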

Another option would be to assume that the packages of interest are in the PYTHONPATH and then support the following:

grumpy/tools/grumpc --read pymysql

It could find the pymysql package by looking through the PYTHONPATH and then dump that into $GOPATH/src/grumpy/lib/pymysql. Maybe use $GOPATH/src/__py__ as the root dir by convention instead of grumpy/lib.
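A rough sketch of that lookup, assuming the search is a simple scan of PYTHONPATH entries for a directory containing __init__.py. The helper names locate_package and output_dir are made up for illustration, and the grumpy/lib root is just the default discussed above.

```python
import os


def locate_package(name, search_path=None):
    """Return the directory of top-level package `name`, or None.

    Hypothetical helper: scans PYTHONPATH entries in order and accepts the
    first directory called `name` that contains an __init__.py file.
    """
    if search_path is None:
        search_path = os.environ.get("PYTHONPATH", "").split(os.pathsep)
    for entry in search_path:
        if not entry:
            continue
        candidate = os.path.join(entry, name)
        if os.path.isfile(os.path.join(candidate, "__init__.py")):
            return candidate
    return None


def output_dir(gopath, name, root="grumpy/lib"):
    """Destination under GOPATH/src for the generated Go package."""
    return os.path.join(gopath, "src", *(root.split("/") + [name]))
```

Passing root="__py__" to output_dir would give the alternative convention mentioned above. Single-file modules (pymysql.py rather than a package directory) are not handled here.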

Are there other approaches we should consider? Is there a way that we could adopt something more like go build where it automatically pulls in dependencies?

ghost commented 7 years ago

Perhaps find already installed Python packages, grab and compile those, then bundle them together for projects that use external libraries specified in a requirements.txt? Then you could go build CLI projects that use things like twisted, etc. and quickly compile them into a native binary through transpiled Go code?
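As a starting point for the requirements.txt idea, something like the sketch below could pull out the project names to look up. It is deliberately simplistic: requirement_names is a hypothetical helper that only handles plain "name", "name==version" and "name[extra]" lines and skips option lines such as -r or -e; it is not a full requirements parser.

```python
import re

# Leading project name: letters, digits, dots, underscores and hyphens.
_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_names(text):
    """Return the project names listed in requirements.txt-style text.

    Hypothetical helper. Comments and blank lines are ignored, and lines
    starting with "-" (options such as -r or -e) are skipped entirely.
    """
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        match = _NAME.match(line)
        if match:
            names.append(match.group(0))
    return names
```

Each name returned could then be located among the installed packages and handed to the compiler, with version specifiers and extras left for later.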

As far as the __init__.py problem goes, it's probably best to stick to Python's way of module searching to make as many libraries as possible compatible.

wuttem commented 7 years ago

Maybe you could implement and use setuptools find_packages. In the case of pymysql and most other libraries with this approach you could use the setup.py file for exploring the library.

PyMySQL's setup.py is something like:

from setuptools import setup, find_packages

setup(
    ...
    packages=find_packages(),
    ...
)

trotterdylan commented 7 years ago

I've started to standardize a structure for Python projects under Grumpy. The basis for this is:

  1. Python files that are to be compiled under Grumpy should be staged in GOPATH/src/__python__
  2. A rudimentary tool called genmake can then traverse this path to generate a makefile for building Go packages
  3. grumpc will now search GOPATH/src/__python__ dirs for Python modules to import as Go packages in the generated source code
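The lookup in point 3 might be pictured roughly as follows. This is only a sketch under two assumptions that the thread does not spell out: that a dotted module name maps onto a path below __python__ by replacing dots with slashes, and that GOPATH may list several directories that are searched in order. Both function names are hypothetical.

```python
import os


def go_import_path(module_name):
    """Go package path for a Python module (assumed naming scheme)."""
    return "__python__/" + module_name.replace(".", "/")


def find_module_source(module_name, gopaths):
    """Return the staged .py file for module_name, or None if not staged.

    Hypothetical helper: checks each GOPATH entry for either
    src/__python__/<pkg>/<mod>.py or src/__python__/<pkg>/<mod>/__init__.py.
    """
    parts = module_name.split(".")
    for gopath in gopaths:
        base = os.path.join(gopath, "src", "__python__", *parts)
        for candidate in (base + ".py", os.path.join(base, "__init__.py")):
            if os.path.isfile(candidate):
                return candidate
    return None
```

With PyMySQL staged as in the example below, find_module_source("pymysql.connections", gopaths) would resolve to the connections.py file under src/__python__/pymysql.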

In principle, this means that you can pretty easily copy sources from a Python project and build and use the Go packages. Assuming you have a GOPATH set up and grumpy tools on the PATH, something like this could work:

git clone https://github.com/PyMySQL/PyMySQL
cp -R PyMySQL/pymysql $GOPATH/src/__python__
genmake $GOPATH > Makefile
make
echo 'import pymysql; print pymysql.VERSION' | grumprun

(Note: PyMySQL is still not supported, so this doesn't actually compile yet.)

Long term I'd like to wrap these steps into something simpler, along the lines of:

grumpy get https://github.com/PyMySQL/PyMySQL

But there are a bunch of steps to get there:

  1. Integrate the existing toolchain into a grumpy binary that supports a bunch of subcommands, much like the go binary.
  2. Implement grumpy genmake that wraps the current genmake tool.
  3. Support grumpy fetch that will grab Python files and import them into GOPATH/src/__python__. To start, it can just support directories and recursively copy .py files over.
  4. Wrap up grumpy fetch, grumpy genmake and the make step into a convenient grumpy get subcommand.
  5. Support for projects that use setuptools including finding sources within a project directory, determining dependencies, etc.
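The first version of the fetch step in point 3 could be as small as the sketch below. It assumes a single GOPATH directory and copies a package directory such as PyMySQL/pymysql into src/__python__/pymysql, keeping only .py files. The function name fetch mirrors the proposed subcommand but is purely illustrative; no such command exists yet.

```python
import os
import shutil


def fetch(source_dir, gopath):
    """Copy every .py file under source_dir into GOPATH/src/__python__.

    Illustrative only. The directory layout below source_dir is preserved
    and the list of destination paths that were written is returned.
    """
    source_dir = os.path.abspath(source_dir)
    dest_root = os.path.join(
        gopath, "src", "__python__", os.path.basename(source_dir))
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        rel = os.path.relpath(dirpath, source_dir)
        dest_dir = os.path.normpath(os.path.join(dest_root, rel))
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            # Create directories lazily so dirs without .py files are skipped.
            os.makedirs(dest_dir, exist_ok=True)
            dest = os.path.join(dest_dir, filename)
            shutil.copy2(os.path.join(dirpath, filename), dest)
            copied.append(dest)
    return copied
```

This is equivalent to the cp -R step in the example above, minus any non-Python files. Handling setuptools projects (point 5) would replace the plain directory walk with whatever the project's setup.py declares.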

I'd also fold some of the other tools into the grumpy binary as subcommands including grumprun, grumpc and pydeps.

kofalt commented 7 years ago

That sounds awesome @trotterdylan!

cristim commented 6 years ago

The standard way to install packages in the Python world is the pip tool, so I guess a grumpy pip subcommand that is as close as possible to a drop-in replacement for pip would make more sense than a Go-like construct such as grumpy get:

grumpy pip install foo
grumpy pip install foo[bar]

These should entirely implement the pip install workflow: