TranscryptOrg / Transcrypt

Python 3.9 to JavaScript compiler - Lean, fast, open!
https://www.transcrypt.org
Apache License 2.0

Streaming compiler #137

Closed · icarito closed this issue 8 years ago

icarito commented 8 years ago

In theory this would enable creating a loader for webpack and getting hot module reloading.

One way to do it would be to support reading source from stdin and/or writing output to stdout.

JdeH commented 8 years ago

Hi,

I have no webpack experience, but judging from your idea it sounds like this could be done with a wrapper: first stream stdin to one or more input files, then take the output file(s) and stream them to stdout. A possible complication is that Transcrypt's input is in fact a module hierarchy, consisting of Python and JS-only modules, with modules recursively importing other modules via a combination of search paths. Its output is one .mod.js file per input module, a non-minified and a minified target file, sourcemaps and, depending on the switches, several other files, so I am not sure how all of those would map to stdin and stdout.
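For what it's worth, such a wrapper could live entirely outside Transcrypt. A minimal sketch for the single-module case only, where the compiler command, its switches and the output directory name are all parameters rather than facts, since they differ per Transcrypt version:

```python
# Sketch of a wrapper that streams a module from stdin to a file,
# runs a compiler CLI on it, and streams the compiled result back.
# The default command, flags and "__target__" output directory are
# assumptions; adjust them to the Transcrypt version in use.
import subprocess
import tempfile
from pathlib import Path

def compile_stream(source, command=("transcrypt", "-b", "-n"),
                   target_dir="__target__", module="main"):
    """Compile `source` (a str of Python) and return the generated JS."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / (module + ".py")
        src.write_text(source)
        # The compiler is expected to drop its output next to the source.
        subprocess.run([*command, src.name], check=True, cwd=tmp)
        return (Path(tmp) / target_dir / (module + ".js")).read_text()
```

A driver script would then just do `sys.stdout.write(compile_stream(sys.stdin.read()))`. The full module hierarchy, sourcemaps and minified variants mentioned above are deliberately ignored here.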

My own focus is on the core of Transcrypt, which is in transition to further benefit from JS6 constructs and to anticipate Python 3.6. However, if minor modifications are needed to facilitate the use of a wrapper that could couple Transcrypt to webpack, I am willing to look into that. Maybe a wrapper can be written without modifications to Transcrypt. If this isn't the case, let me know what exactly would be needed to enable such a wrapper. Someone else would have to write the wrapper, though...

Kind regards Jacques de Hooge

icarito commented 8 years ago

Thanks for your thoughtful reply. I'm very excited about this project.

I love Python and have been struggling to convert to JS. Now I don't have to!

However, I did enjoy some of the modern tooling JS has for web development, particularly hot module reloading. When I looked into using it, I ran into this difficulty. Here's how coffee-script does it: https://github.com/webpack/coffee-loader/blob/master/index.js - but I understand the challenges you mention, the most challenging being mapping the module hierarchy to a string output, or adding a "simplified" mode.

Currently I'm using a makefile - this is currently enough for my purposes, but eventually I might come up with something. One issue with this approach is that the SourceMap reference is wrong (it uses an absolute path instead of a relative one).
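One conceivable workaround for the absolute-path problem, until it is fixed upstream, is a post-processing step over the generated map. A sketch assuming a standard revision-3 sourcemap, i.e. JSON with a "sources" array:

```python
# Rewrite absolute entries in a sourcemap's "sources" list to paths
# relative to the map file itself. Purely a post-processing sketch;
# it assumes the rev-3 sourcemap layout (JSON, "sources" array).
import json
import os

def relativize_sources(map_path):
    with open(map_path) as f:
        smap = json.load(f)
    base = os.path.dirname(os.path.abspath(map_path))
    smap["sources"] = [
        os.path.relpath(src, base) if os.path.isabs(src) else src
        for src in smap.get("sources", [])
    ]
    with open(map_path, "w") as f:
        json.dump(smap, f)
```

Running this over each .map file as a final makefile step would make the references relative regardless of where Transcrypt was invoked from.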

JdeH commented 8 years ago

I am glad you like Transcrypt. It was born from my personal frustration with programming in JS. I especially missed true class-based OO, multiple inheritance and operator overloading. These limitations are embedded so deeply in the design of JS that I didn't expect them to go away soon. I've worked on several JS libraries adding class-based OO and multiple inheritance. But the syntax remained messy, and at one point I became convinced that writing an efficient compiler would be easier than to keep improvising.

I have to agree that the tooling available for JS is often advanced. The JS world is big and many people are working on beautiful things there. I hope Transcrypt is a solid bridge that can be used both ways: Using JS libs in Python and writing libs for JS in Python.

Could you elaborate a bit on what exactly the problem with the sourcemaps is? Should they have a configurable absolute base path, or something like that? Maybe an example would clarify what exactly is needed.

axgkl commented 8 years ago

@icarito Hi Sebastian. I also don't have much experience with webpack; the only thing I remember is that there is this dev-mode URL which, when the app rebuilds automatically on change, informs the browser, connected via a websocket to this URL, to reload the whole thing, right?

(So your requirement is not about having e.g. a single page app loaded and just hot reloading one module, while it keeps the state.)

The global app recompile and browser page reload on code change is rather straightforward to do, at least on Mac or Linux (don't know about Windows):

entr is a pretty performant change-detection tool, which you can use to detect any source code change.

so:

find . -name '*.py' | entr sh -c \
       'clear; killall python; /usr/bin/rm -f __javascript/*; run_transcrypt -nvv gui.py; sleep 0.5; $HOME/bin/reload_chrome.scpt || echo -e "\e[48;5;196mCompilation Error\e[0m"'
~/swissscom/lgi_axc2 $ cat safari_reload.sh 
#!/usr/bin/env bash

osascript -e "
tell application \"Safari\"
    set docUrl to URL of document 1
    set URL of document 1 to docUrl
end tell
"

On Linux there are other tools. If the 0.5 seconds for the server startup is not always enough, wrap that part in another loop.
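A stdlib-only variant of the same loop, not tied to entr, inotify or AppleScript, would be to poll file mtimes from Python. A rough sketch; the rebuild command shown in the comment is just an example:

```python
# Poll the mtimes of all .py files under a directory and run a
# rebuild command whenever the snapshot changes. Cross-platform,
# no external tools; a stand-in for entr/inotify.
import subprocess
import time
from pathlib import Path

def snapshot(root):
    """Map every .py file under `root` to its last-modified time."""
    return {p: p.stat().st_mtime for p in Path(root).rglob("*.py")}

def watch(root, command, interval=0.5):
    seen = snapshot(root)
    while True:
        time.sleep(interval)
        now = snapshot(root)
        if now != seen:
            seen = now
            subprocess.run(command)  # e.g. ["run_transcrypt", "-nvv", "gui.py"]
```

Polling is cruder than inotify but good enough at half-second intervals for a source tree of typical size.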

Hope this is useful. For my part, I'm pretty glad that I don't need the hundreds of megs of webpack pipeline tools whenever I need to quickly modify stuff on frontend servers in projects ;-) In fact I got rid of any JS on the server, since most of the stuff can be done in the browser as well, or using Python; there are good tools around.

icarito commented 8 years ago

To be honest I have little experience myself with these new generation javascript builders.

I have been looking to start a javascript heavy project and was looking into best practices as I have been frustrated (but successful) with it in the past.

While looking at React etc., there was mention of this hot-module-reloading feature through either Webpack or Browserify, so I tried it and liked it a lot. The Webpack dev server will open your app in a different route, with something like a debug frame. It will detect when a file is updated and selectively compile only the required modules.

It will then send the recompiled module through a WebSocket and update it in your running, live application. This is extremely fast and convenient; I was impressed.

I am interested in making an IDE or something like it for Transcrypt so that the process is as convenient as possible as I'd like to use it to teach children.

icarito commented 8 years ago

Here's a specification of a Browserify transform: https://github.com/substack/browserify-handbook#writing-your-own

The way I understand Browserify is that it compiles, compresses your JavaScript, and even base64-encodes your images into your HTML if it makes sense.

Browserify allows you to use the "require" statement to import resources in the source tree.

I suppose a major project would mix Python and JavaScript, so I assume embracing the npm ecosystem would be inevitable; these build tools are more advanced and versatile than loops and makefiles, and also faster. Using "require" avoids adding <script> tags manually.

@AXGKl Thanks for sharing, I am currently using a makefile myself and inotify on Linux to monitor the filesystem for changes.

axgkl commented 8 years ago

Hi,

ok, this:

> It will then send thru a WebSocket the recompiled module and update it in your running, live application.

is far more advanced than a full page reload on code change, and it does need module management within the browser, so that only selected modules can be updated while keeping the state of the app in e.g. a Redux store.

Do I understand correctly, then, that you suggest Transcrypt should not compile into one monolithic js file but into separate modules (which could e.g. be piped to another process, streaming them to the client on changes)?

Also: do you have an issue with the compile time, or with losing the state in the app?

Because regarding the compile time: I switch off the minifier, and then it's pretty much instant, I mean as fast as I can turn my head to the second screen, even when a larger app is compiled (since the huge js libs are not touched anyway).

If it's about the state: thinking of another approach than trying to push Jacques into compiling into different modules (if I'm right that this would be needed): looking at the Redux philosophy, wouldn't it be enough to just offload all the state (the serialized store content), e.g. to the server, download the new Transcrypt source, and refetch the state from local storage or the server? That could work; a websocket connection to the Python server would be required, which is a no-brainer with e.g. socketio and Transcrypt, and Redux state should be serializable anyway...
Having such a permanent connection could even be useful for many production use cases as well; I've been thinking for quite some time now about isomorphic components written in pure Python, running on browser and server, sharing state...
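The offload-and-restore idea could be sketched like this. Not Redux, and the class name is made up; just a plain-Python stand-in showing that state which round-trips through JSON survives a code reload, whether it runs on the server (CPython) or in the browser (compiled by Transcrypt):

```python
# A tiny store whose content round-trips through JSON, illustrating
# the "offload state, reload code, restore state" flow. `Store` is a
# hypothetical name, not an existing Transcrypt or Redux API.
import json

class Store:
    def __init__(self, state=None):
        self.state = state if state is not None else {}

    def dispatch(self, key, value):
        self.state[key] = value

    def dump(self):
        # Serialized form to stash in localStorage or POST to the server.
        return json.dumps(self.state)

    @classmethod
    def restore(cls, serialized):
        # Called after the freshly compiled code has been reloaded.
        return cls(json.loads(serialized))
```

The whole trick rests on the store content being JSON-serializable, which Redux-style state is supposed to be anyway.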

PS1: I like your makefile; describing the full page reload method to you was indeed a waste of time, and I should have read your issue twice ;-) PS2: the teaching-children purpose will significantly prio-up your future requests on Jacques' list...

icarito commented 8 years ago

@AXGKl > Do I understand it correctly then, that you suggest Transcrypt not to compile into one monolithic js file but into separated modules (which e.g. could be piped to another process, streaming it to the client on changes?

Yes this is what I am proposing. The section immediately before the one I linked explains the simple interface required.

PS: I didn't mean to use children to raise priorities :-)

axgkl commented 8 years ago

Hi. This one is really for Jacques, or for other people who are more sold on the build pipelines of the JS world on the server side, to decide.

I'm personally still not, and I'll try to give my reasons. I have to say first that in a project over the summer I was using webpack to get my stuff preprocessed and bundled, plus the dev mode (I did not know Transcrypt yet but was using RiotJs-based components and the React/Redux flow to handle state). Although it was working and the dev server was sort of nice, I got a bad feeling: we are a Python/C shop and not so much a JavaScript shop. What's the problem:

Take webpack. With the dev tools you have at least 270 megs of JavaScript alone for the toolchain on your server, plus e.g. 200 further megs for Babel on top. It all works nicely and is certainly the way to go for frontend companies, who have to know their tools from the ground up - but what about supporting such huge toolchains in companies which are mainly Python shops, with Python support and customers who expect that the stuff being deployed will still be supported in 2, 3, 5 years? We have a hard time alone keeping up with the framework inflation in the JS world - if we also had to support ever-varying build pipelines, that would break us.
Trying to say: I personally happily accept that my browser might reload when I hack code and that I'm a bit slowed down by this, if on the plus side I can settle on a Python-only toolchain on the server. Stability is a critical factor; that's why we are on Python, and we still can't forgive them for having broken backwards compat once in like 10 years with Python 3 (and that totally unnecessarily) - but apart from that, we and our customers trust that the stuff is not a constantly moving target.

=> Yes, for the browser all the nice JS libs can't be ignored, and that's why Transcrypt's approach to embrace them rather than trying to replace them is perfect.

On the server: I'm not sold. And yes, I have 2 smart frontend guys in the company, and they look at me like I told them the wall is purple when I say: no need for a JS build pipeline on the server to make GUIs. Fact is, though, that I do it and it works, and the customer where I recently replaced my summer project with Transcrypt and deleted all JS except the libs delivered to the client is happy. I have to also say (and that's maybe the reason it was avoidable in my case) that I switched to a professionally crafted library for the frontend widgets, via which I got rid of all CSS tooling and stuff like minifying and uglification...

Also, I wanted to say here that I do remember the times of desperation when two libs you needed in one page required two different versions of jQuery. So I can understand why they did their highly redundant packaging design and the local namespacing Browserify and npm deliver. But I personally don't see jQuery conflicts anymore, and the other libs did not collide in recent years.


Having said that: what I really, really like is the idea of require(<some js module>) in a Transcrypt module, avoiding all the script tags and replacing them with a reasonably smart pull on demand. But on-demand loading of JS libs could also be done rather simply with a Python server on the other side. Some effort, but I think not too much. Btw: how would that work without the dev server? Then it would put it all into one big JS bundle, right? But what about HTTP/2 then, where the big-bundle approach is wrong again - then you need the server delivering libs on demand again, in production.
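The "Python server delivering libs on demand" part really is simple; a minimal sketch using only the stdlib (the directory layout and module naming are assumptions, and a real setup would add caching headers):

```python
# Serve compiled .js modules from a directory over plain HTTP, no
# bundler involved. Stdlib only; needs Python 3.7+ for the
# `directory` argument of SimpleHTTPRequestHandler.
import functools
import http.server

def make_module_server(directory, port=0):
    """Return an HTTP server handing out files from `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    # port=0 lets the OS pick a free port; call serve_forever() to run.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

In a bottle/flask/django app the same thing would be a one-line static route; the point is just that nothing npm-shaped is needed to ship modules on demand.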

I personally think (and @pierrejean-coudert too, as I understood) that sooner or later Transcrypt needs a server companion, but a Python one: a module which can be imported by bottle, flask, django (...) and which has built-in communication and state-sharing functions which are sort of isomorphic, in that they run on server and client. If we have that, I think dynamic lib shipment based on require statements in Transcrypt code would be really simple.


Again - purely my five cents. Others have other environments and prios, and as long as I'm not forced to use the stuff, I'm relaxed about any developments like ./transcrypt --compile-as-npm; it would for sure win more people over, and that's what counts in the end ;-)

JdeH commented 8 years ago

The whole subject of bundling, tool chains and hot updates is a complex one.

One of the characteristics of current mainstream web development is that the way to do things in the JS world changes faster than I can even complete one project for an industrial customer. Say a project lasts 2 years and I start out with the newest toolchain, libraries etc.; then at the end, what I've been using is probably frowned upon, since by that time new solutions are the standard.

While lots of energy in the web world goes to 'handy ways to do common tasks', I find it very time consuming to switch from new tool to even newer tool. What has been a constant factor in my career (even many years before the web) is the enormous amount of energy that goes into things that have to do with the process of making software rather than implementing the required functionality.

Python has proven to be an exception to this. Just like JavaScript it has a near to zero threshold:

print ('hello world')

but, different from JavaScript, I've made medical imaging, real-time control and scientific computing applications with it that are as fast as their C++ counterparts, but much easier to maintain (doing some low-level things directly in C++, that is).

I am aware that the web world for a large part revolves around maintaining the ever-changing 'shop windows' of companies, good looking and at minimum cost. For a typical customer, a web site that has been on-line for three years is 'old'. New mobile devices pose new demands, new paradigms spring up (single-page site, mobile first etc.) and the customer wants to change so much that, apart from the database structure, he is in fact asking for a new website.

For this type of application, the ability to do simple things fast is essential to keep all this change affordable. When you're in a live discussion with a customer, constantly changing the looks and behavior of a site, the customer looking over your shoulder, tool chains, hot loading etc. are essential.

Another characteristic of web development is that, to make a living, one has to have a lot of customers, since making a web site is typically a small project compared to e.g. the imaging software, scientific computations and real time controls which constitute my main occupation. All of these projects currently have an Internet/web aspect and require a certain amount of front-end development, but I am used to projects that overall take years to complete and whose results have to be maintained as long as the hardware lives, so 30 years is no exception. In such circumstances a rather conservative approach, rock solid backup strategies, selecting tools that will not disappear in say 5 years etc. are all essential.

When, on the other hand, you maintain websites for 40 customers, it's nice if things like backups, version control and uploading are fast, automated and cheap. If I want to change site 27, I've long forgotten what exactly I did there, whether or not I used Twitter Bootstrap, Less or Google Fonts, and how it all comes together. I want to log in, take a look at the tiny part of the source code that matters, change it without hassle until it pleases the customer, and then forget about it without bothering too much about the context. In 2 years the customer will want a complete overhaul of the site anyhow.

The difference between the 'high throughput' web world and the 'stability first' world of backend development is a given and all one can do is accept it. If a customer wants a website, I know he'll ask for the newest looks, that and that button, round corners or flat look, rasta colors or spring colors and this handy local menu with the three little bars on his iPhone. And I know the only way to achieve those in reasonable time is to use this and this new libraries. Never mind if they're still there in 5 years, who lives then, who cares then.

But in recent years something is changing. An ever larger share of websites aren't shop windows anymore. I've designed some promotional material and merchandise for Transcrypt, and I am amazed at the sophistication of the on-line drawing tools. They are better than what I have on my desktop. One customer of mine needed an application to allow his customers to design their own building parts on-line. Applications like that are getting ever more complex and their lifecycles become longer. A part of the internet world seems to slow down a bit and gain depth.

I like working on that kind of application, with an expected lifetime of say 10 years (admittedly embedded in a visual context that changes with every fashion). For these applications I couldn't do with JavaScript. The language itself changes its face rapidly to cater for new needs, but in doing so it's getting very fragmented and complex, since it has to stay backward compatible.

Whereas the Python paradigm (originally) was: one obvious way to do something, in JavaScript there are many ways, often very counter-intuitive. The sheer number of discussions on Stack Overflow on how on earth to distinguish a string from an integer, or an integer from a float, or on the blessings of === as opposed to ==, makes this very clear. These questions rarely have a satisfying answer. And if they had one 3 years ago, that answer may be completely different now. How much 'market share' will the JS6 Map class steal away from the {...} object literal? Will that influence the typical JSON stream and configuration syntax? Or will object literals still be the norm there? And typed arrays? And fragments of asm.js in inner loops?

It is not that JavaScript is bad. It was developed for something rapidly changing: the internet. So it had to change rapidly itself. And while it's starting to look more and more like Python, it'll always remain JavaScript due to its legacy: prototypical single inheritance, context dependent type interpretation rather than dynamic assignment of a strict type, C-like syntax with lots of {}();.

But since long-lived, complex applications are making their way to the browser, there's room for something else. I hope Transcrypt can be this 'something else'.

While Transcrypt can be used anywhere that JavaScript can, except for exec and eval, it has a different focus. The typical Transcrypt application is an internally coherent, self-contained island of relative stability. The island has bridges and ferries connecting it to the outside world, of course, but it is a bit less volatile and fragmented than the typical JavaScript code on a webpage.

Seen in the light of this self-contained 'islandness', the first concern was how to make a Transcrypt application lean and fast. The current way of packaging a Transcrypt app is a direct consequence of that.

One way to go would have been to load only what you need, using some kind of sophisticated dynamic loading mechanism. The advantage is an even faster initial page load. The downside is the inability to optimize the resulting JavaScript as a whole.

A completely different way to go is to take developments such as Google Closure very seriously. Closure already has some capability to strip out dead code, e.g. functions that are never called. This is a fine-grained process, working at object and function level rather than at module level. Currently Closure cannot yet minify the code generated by Transcrypt as well as I would wish; at the higher compression level, the correctness of the resulting code turned out to be compromised. Still, packaging all modules together allows a minification / dead code stripping / loop flattening / call elimination tool to do its job better. All occurrences of a certain identifier can be interconnected, hence jointly shortened. All uses of a function or object can be found, hence in principle all dead code stripped.

This overall slimming down and speeding up of the end result is what Transcrypt seeks to enable. That doesn't mean that a dynamic module loading mechanism would never be useful; it could be, especially if all browsers gain a well-documented, simple, library-independent way to cache such modules. If that happens, reconsideration is needed. And if someone comes up with a clever way to have it all, of course that's very useful. So I follow experiments in this area with interest.

JdeH commented 8 years ago

I'll close this, since I don't want to fuel the expectation that this will be implemented anytime soon. As said, I appreciate that hot reloading is valued in the JS world. But, balancing costs and benefits, for Transcrypt I currently consider this a side issue.

KR J

icarito commented 6 years ago

Hi, a Webpack loader that may make this possible has just come to my attention: https://medium.com/@martim00/making-a-webpack-python-loader-87215d72292e

They work around the problem of Transcrypt writing output files by concatenating them back together after the fact.

A bit hacky but I guess it should work!
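The concatenation step itself is tiny. A rough sketch, assuming the per-module output files simply sit in one target directory; the naming is an assumption, and a real loader would need to respect module dependency order rather than alphabetical order:

```python
# Glue per-module compiler output back into one string that a
# webpack loader could return. Alphabetical order is a placeholder
# for real dependency ordering.
from pathlib import Path

def concat_modules(target_dir):
    parts = []
    for path in sorted(Path(target_dir).glob("*.js")):
        # Keep a marker comment so the origin of each chunk is visible.
        parts.append("// --- %s ---\n%s" % (path.name, path.read_text()))
    return "\n".join(parts)
```

Hacky indeed, but it sidesteps the whole stdin/stdout streaming question that started this issue.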