go-python / gpython

gpython is a Python interpreter written in Go, "batteries not included"
BSD 3-Clause "New" or "Revised" License

gPython's Killer App: Avoiding AsyncIO, but using Python Libraries #135

Open PythonLinks opened 4 years ago

PythonLinks commented 4 years ago

For a long time I have been interested in gPython. GoLang’s CSP just seems so easy to use compared to Python’s asyncio (coroutines). But Python has so many great libraries. They would be great together. Finally I found a mainstream application for this approach: a Discord bot.

So I brought up my first Discord bot in Python quite painlessly. But it keeps crashing. The error messages are completely incomprehensible, and that is before even trying to figure out how all of this asyncio stuff works. I just want it to be easy to program. I could fight it, but better to put that energy into something constructive.

Sure, gPython is slower than cPython. I do not care. This is not a high traffic site. And if it ever does become a high traffic site, I can migrate it fully to golang. I just want it to be easy to program and debug.

I kind of like the idea of doing the primary message server in Go, deciding which messages need extra processing, and doing that processing in Python. The Python side does not even have to send a message back to Go; it can just call the Discord webhook.

And of course Discord is a hot platform, which would help make this a hot technology.

What do you think about this idea?

sbinet commented 4 years ago

speaking only for myself, I do think it's a good idea.

I only know of Discord by name, not which protocols it relies on so people can write a bot. Do you know what they are, and thus what would be the minimum set of modules/classes/... one would have to implement on the gpython side to support that use-case?

PythonLinks commented 4 years ago

Great question. That is of course a critical question.

There are two parts. In Go, there are all the bot libraries that respond to messages. That should all work. No problem.

In my application, a message arrives. If it is a simple message, Go can respond. If it requires a complex process, specifically if it has a hashtag and a URL, the hashtag and URL get passed to Python along with the name and id for the server, the channel, and the user. Maybe with a secret key as well.
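
The routing rule above (a message needs extra processing only when it has both a hashtag and a URL) can be sketched as follows. This is only an illustration: in the plan described here the check lives in the Go bot, and it is written in Python just to keep the examples in one language. The function name `extract_link_job` and the regular expressions are assumptions, not code from the bot.

```python
import re

# Assumed patterns: any http(s) URL, and a "#word" style hashtag.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_link_job(text):
    """Return {"url", "hashtag"} if the message needs extra processing, else None."""
    urls = URL_RE.findall(text)
    # Remove URLs before looking for hashtags, so that a fragment such as
    # https://example.com/page#top is not mistaken for a hashtag.
    hashtags = HASHTAG_RE.findall(URL_RE.sub(" ", text))
    if not urls or not hashtags:
        return None  # simple message: the bot can answer directly
    return {"url": urls[0], "hashtag": hashtags[0]}
```

Only the first URL and first hashtag are kept here; a real bot would also attach the server, channel, and user ids before handing the job over.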

Python then downloads the page, uses the webpreview library to extract the title, description and imageURL from the page.
Python then fetches the image. It can do those two steps sequentially.
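
To make the extraction step concrete, here is a rough standard-library stand-in for what a preview library does: pull the title, description, and image URL out of a page's `<title>` and `<meta>` tags. It is not webpreview's API (that is deliberately not shown here); `PreviewParser` and `extract_preview` are hypothetical names, and the sketch assumes the page uses Open Graph tags or a plain `description` meta tag.

```python
from html.parser import HTMLParser


class PreviewParser(HTMLParser):
    """Collect the <title> text and the content of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph uses property="og:...", classic tags use name="...".
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content"):
                self.meta.setdefault(key.lower(), attrs["content"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def extract_preview(html):
    """Return title, description and image URL, preferring Open Graph values."""
    parser = PreviewParser()
    parser.feed(html)
    parser.close()
    meta = parser.meta
    page_title = "".join(parser.title_parts).strip()
    return {
        "title": meta.get("og:title") or page_title or None,
        "description": meta.get("og:description") or meta.get("description"),
        "image": meta.get("og:image"),
    }
```

Fetching the page and then the image can stay sequential, as described above; this sketch only covers the parsing.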

If there is an error during this process, I have two choices. Python could send a message back to Go to forward to Discord, or Python could call a Discord webhook: marshal the arguments and the secret key, and use the requests library to call Discord.
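
A minimal sketch of the webhook option, under some assumptions: it uses the standard library's `urllib` instead of requests so the example has no dependencies, the webhook URL is a placeholder you would get from Discord, and the function names are made up for illustration. Discord webhooks accept a JSON body with a `content` field of up to 2000 characters.

```python
import json
import urllib.request


def build_webhook_request(webhook_url, message):
    """Build the POST request that would deliver `message` to a Discord webhook."""
    # Discord limits "content" to 2000 characters, so truncate defensively.
    body = json.dumps({"content": message[:2000]}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: an explicit agent, since some servers reject urllib's default.
            "User-Agent": "link-bot-sketch/0.1",
        },
        method="POST",
    )


def notify_discord(webhook_url, message, timeout=10):
    """Send the message; raises urllib.error.HTTPError on a 4xx/5xx response."""
    request = build_webhook_request(webhook_url, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes the error path easy to test without touching the network.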

If it succeeds, Python then calls my database server to submit the data.

My database app, PythonLinks.info, or one of its siblings, then checks if the URL is already in the database. It checks if the hashtag is a valid category name, and if all is well it adds all of that data.

Whether or not there is a problem, my database app again calls a Discord webhook to notify the user of success or failure along with the appropriate message.

So the Python piece is quite generic Python. This approach matches what I am doing in Python, so that code worked yesterday, now exists in a separate file, and is being debugged today.

There is an interesting question when to use the mix of languages. GoLang is easier for Asynchronous stuff, but Python has better libraries. I do not quite know the Go libraries. Does Go have something like webpreview? What about the other libraries?

I also want to extract social media using Python’s html_to_etree (parse_html_bytes) and extract-social-media.

And what about when I want to use NLP to process the message?

So at what point do I ditch cPython and start using gPython? I am not sure.
But having this discussion makes my decision much clearer.

Still I think that this set of applications, Discord bots, processing web crawls and NLP may well be the killer app for gPython.

PythonLinks commented 4 years ago

So here is the Go htmlmeta library https://github.com/jonlaing/htmlmeta

And here is the Python webpreview library https://github.com/ludbek/webpreview

Which also does Twitter Card and Schema.

So I think right off the start there is an advantage to using Go with Python.

But maybe I am better off with Go for the async stuff, and just calling Python to process the data. I just need a single Python interpreter, not multiple ones. Particularly cPython, which would then support all cPython libraries. Much better. No real need for multiple Python interpreters. Better to have just one fully compatible one.

PythonLinks commented 4 years ago

Thank you for encouraging me to write up my plan. I now see when to use gPython: as soon as I start to fight with async bugs. Much better to put that energy into learning GoLang, just in case my Discord bot ever needs to scale. At first I can call the needed Python libraries as external scripts. No risk there. When the load picks up, start using gPython.

PythonLinks commented 4 years ago

So now my basic GoLang Discord bot is working. Infinitely easier than learning and debugging asyncio. My question is what should I do next? The bot recognizes messages with links in them, and will do special processing in Python on those messages. The easiest but computationally slowest option is to just call an external Python script. It has to fetch some JSON each time it is called. More difficult is to set up a socket for the GoLang and Python processes to talk to each other. Then I have to manage two long-running processes. Simplest would be to use gPython, but that also appears riskiest to me.
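
For the first option, the external script could look roughly like this: the Go bot starts it once per message, writes one JSON job to its stdin, and reads one JSON result from its stdout. The field names in `REQUIRED_FIELDS` and the shape of the result are assumptions for illustration, not an existing protocol.

```python
import json
import sys

# Assumed job fields, based on what the bot is described as passing along.
REQUIRED_FIELDS = ("url", "hashtag", "server_id", "channel_id", "user_id")


def handle_job(job):
    """Validate one job and return a JSON-serialisable result."""
    if not isinstance(job, dict):
        return {"ok": False, "error": "job must be a JSON object"}
    missing = [name for name in REQUIRED_FIELDS if name not in job]
    if missing:
        return {"ok": False, "error": "missing fields: " + ", ".join(missing)}
    # Placeholder: the real script would fetch the page and submit the data here.
    return {"ok": True, "url": job["url"], "hashtag": job["hashtag"]}


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON job from stdin, write one JSON result to stdout."""
    try:
        job = json.load(stdin)
    except json.JSONDecodeError as exc:
        result = {"ok": False, "error": "invalid JSON: " + str(exc)}
    else:
        result = handle_job(job)
    json.dump(result, stdout)
    stdout.write("\n")
    return 0 if result["ok"] else 1

# When saved as a script, finish with: sys.exit(main())
```

The exit code lets the Go side detect failure without parsing anything. The same `handle_job` function could later sit behind a socket if the per-call startup cost becomes a problem.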

Any advice? What is the critical question I should be asking? Has anyone else taken any of these paths?
I think that first I need to check if any of these libraries have similar libraries in Go, or require cPython.

Maybe I should do the external script first followed by gPython.

What do you all suggest?

raff commented 4 years ago

The main problem of gPython at this point is that it may not have enough of the "standard library" implemented for what you want to do (or for what is needed for the libraries you want to import).

If you were to only use the python "grammar" and implement every external functionality in Go, you can probably use gPython. Basically you would use python as glue code / configuration code (similarly to what you would do with something like this: https://github.com/google/starlark-go)

PythonLinks commented 4 years ago

Thank you.

So the library I want to use is webpreview. Obviously it uses requests and Beautiful Soup. And Beautiful Soup requires the standard library. And therein lies the problem.

Do you know about Ouroboros? https://beeware.org/project/projects/libraries/ouroboros/

It is a Python implementation of the Python standard library. This is my posting here last August about Ouroboros: https://github.com/go-python/gpython/issues/68

So I do not expect webpreview and Beautiful Soup to run on gPython as it stands.

The big difference is I now know the goal. Discord bots are best written in GoLang rather than Python. (A day later my bot is still running, and that is the first time I ever wrote anything in Go.) Webpreview and Beautiful Soup are great libraries, but they are in Python. So to run them on gPython requires Ouroboros.

If there are ever Python conferences again, I think that would make for a great sprint.

And the BeeWare guys are very active at Python sprints, and so would be happy to support such a project.

Even if I can never use gPython, I think that I made the right decision adding GoLang to my tool chest. I can run the Python libraries which I need in a separate process.