kaitai-io / kaitai_struct_webide

Online editor / visualizer for Kaitai Struct .ksy files
https://ide.kaitai.io
GNU General Public License v3.0

Proposal: Move to Parcel Bundler #93

Open jchv opened 4 years ago

jchv commented 4 years ago

tl;dr: I propose moving the build process to use Parcel Bundler and eliminating the Python 2 dependency entirely. The vendor system would be ported and function mostly just to generate the third party license file.

The case for Parcel Bundler

I'm starting to get acquainted with this codebase and one of the first things that occurred to me is the particularly unusual way that dependencies are handled. It looks like there are a few Python scripts that handle this, among some other things like generating the third party licenses (pretty cool, btw.) I find the usage of Python 2 for this task pretty quaint. After all, Node.js is already required, and I don't really see why any of these things couldn't be done as well or better with JS (perhaps I'm missing something?)

On the better side, particularly, bundling. Many modern JS and TypeScript projects have adopted bundlers. Bundlers have many advantages, perhaps most importantly allowing you to use modules (including ES modules) in a very natural way, while outputting just a single JavaScript bundle. Even more, bundlers since Browserify generally support bundling dependencies directly from node_modules, freeing you from the minutiae of how to include and use modules, since you can just npm add and require right away as if you're in Node. Bundling is also an ideal point to apply compilation steps and minification, while supporting source mapping, hot reloading, and many other nice development features.

The most popular bundler is certainly Webpack, and I have quite a lot of experience with it in production. It can do pretty much literally anything, but the downside is that the configuration is downright arcane at times, and too much information is encoded in your Webpack configuration. Webpack configurations tend to grow virtually without limit if careful attention isn't paid. Therefore, I suggest not using Webpack and opting for Parcel Bundler instead.

The defining feature of Parcel Bundler is that it has no configuration file. Instead, you can control what few switches it does have via the CLI, and the rest of the configuration will be automatically pulled from standard configuration files like tsconfig.json. Parcel also supports the usual set of features you'd expect in a modern frontend development tool: live HMR (Hot Module Replacement), compilers and transpilers (automatic support for TypeScript, various CSS preprocessors, even compiling Rust to WASM on the fly!), source maps, and code splitting (useful if giant bundles are a sticking point for adoption.) And it supports dependencies like webfonts just fine, for what that's worth. The ecosystem for Parcel is good, even if it's not as strong as Webpack's in all aspects. For example, Monaco Editor supports Parcel.

I find Parcel to be both elegant and effective in its operation. You can feed it an HTML file, like:

<!DOCTYPE html>
<html>
<head>
    <title>Demo</title>
    <link rel="stylesheet" href="./styles.sass">
</head>
<body>
    <script src="./app.ts"></script>
</body>
</html>

And it will:

So you end up with a dist directory that looks something like:

/dist/index.html
/dist/app.5793fd45.js
/dist/app.5793fd45.js.map
/dist/styles.e750914d.css
/dist/styles.e750914d.css.map

...which of course is suitable to upload to GitHub Pages, Netlify, any old HTTP server, etc.

One advantage of this approach is that you can enable long-term caching for everything other than index.html, since the filenames will always change when the code changes. No need for other weird hacks.
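As a rough illustration of that caching split, here is a minimal sketch of how a server could pick a Cache-Control header per file. The function name cacheControlFor and the 8-hex-digit hash pattern are my own assumptions, inferred from the example listing above rather than taken from Parcel's documentation.

```typescript
// Matches hashed names like "app.5793fd45.js" or "app.5793fd45.js.map".
// Assumption: the hash is 8 hex digits, as in the example dist listing.
const HASHED_NAME = /\.[0-9a-f]{8}\.[a-z0-9.]+$/i;

// Hypothetical helper: choose a Cache-Control value for a file in dist.
function cacheControlFor(path: string): string {
    const fileName = path.split("/").pop() ?? "";
    if (HASHED_NAME.test(fileName)) {
        // The name changes whenever the content changes, so cache for a year.
        return "public, max-age=31536000, immutable";
    }
    // index.html keeps a stable name, so clients must revalidate it.
    return "no-cache";
}

console.log(cacheControlFor("/dist/app.5793fd45.js"));
console.log(cacheControlFor("/dist/index.html"));
```

Static hosts usually have their own way to configure headers, so this only shows the rule, not where it would live.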

What about vendor_build.py, vendor_license.py, etc?

I propose rewriting the existing Python scripts in JavaScript. It would impose no additional requirements since npm and Node.js are needed anyway.

build.py would ideally be obsoleted. I am not 100% certain that everything it does can be done exactly the same way with a bundler, but a close-enough equivalent should be possible for all of it. Parcel supports pulling environment variables into the build trivially by accessing them as process.env.VAR. Properly embedding the git build information may require some forethought. If it's acceptable, I'd propose pulling Git information in as environment variables before starting Parcel, or something to that effect.
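To make the environment-variable idea a bit more concrete, here is a minimal sketch of collecting Git information before the bundler starts. The variable names GIT_COMMIT and GIT_COMMIT_DATE and the gitBuildEnv helper are hypothetical, and the command runner is injected so the sketch runs without a repository.

```typescript
// A function that runs a shell command and returns its stdout.
type RunCommand = (command: string) => string;

// Hypothetical helper: gather Git metadata as environment variables.
// GIT_COMMIT and GIT_COMMIT_DATE are made-up names, not existing ones.
function gitBuildEnv(run: RunCommand): Record<string, string> {
    return {
        GIT_COMMIT: run("git rev-parse HEAD").trim(),
        // %cI prints the committer date in strict ISO 8601 format.
        GIT_COMMIT_DATE: run("git log -1 --format=%cI").trim(),
    };
}

// Stand-in runner with canned output, used only for this demonstration.
const fakeGit: RunCommand = (command) =>
    command.startsWith("git rev-parse") ? "0123abc\n" : "2020-01-01T00:00:00+00:00\n";

console.log(gitBuildEnv(fakeGit));
```

In a real build script the runner could wrap execSync from Node's child_process module, and the result could be merged into the environment that Parcel is started with, so the app would read process.env.GIT_COMMIT.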

genKaitaiFsFiles.py: This one should also be pretty easy to do. Going the extra mile, though, in theory Kaitai filesystem bundles could be implemented as a Parcel Bundler plugin, implementing what is called an Asset Type. That would allow it to benefit from the same auto-reloading that the rest of Parcel offers, among other things.

practice/chall1/geninput.py: No idea. I guess this is related to serve_practice.py. Maybe this is not important to keep?

vendor_build.py would be made obsolete, but as a transition step it would probably be doable to rewrite the current vendoring in JS/TS anyways.

vendor_license.py is interesting. Bundling does not really change the license problem, so I propose keeping this. vendor.yaml would no longer need a files, distDir, or npmDir attribute, but the remaining metadata could be kept.
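A JS/TS port of the license step might look roughly like the sketch below. I have not mirrored the real vendor.yaml schema here: the VendorEntry fields, the output layout, and the library names are all placeholders, and the entries are assumed to be already parsed from YAML.

```typescript
// Hypothetical shape of the metadata that would remain in vendor.yaml.
interface VendorEntry {
    name: string;
    licenseName: string;
    licenseUrl: string;
    website?: string;
}

// Render one plain-text section per library, sorted by name.
function renderThirdPartyLicenses(entries: VendorEntry[]): string {
    const sorted = [...entries].sort((a, b) => a.name.localeCompare(b.name));
    const sections = sorted.map((entry) => {
        const lines = [`# ${entry.name}`, `License: ${entry.licenseName} (${entry.licenseUrl})`];
        if (entry.website) lines.push(`Website: ${entry.website}`);
        return lines.join("\n");
    });
    return sections.join("\n\n") + "\n";
}

// Placeholder entries, not real dependencies of the Web IDE.
const exampleEntries: VendorEntry[] = [
    { name: "zeta-lib", licenseName: "MIT", licenseUrl: "https://example.com/zeta/LICENSE" },
    { name: "alpha-lib", licenseName: "Apache-2.0", licenseUrl: "https://example.com/alpha/LICENSE", website: "https://example.com/alpha" },
];

console.log(renderThirdPartyLicenses(exampleEntries));
```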

serve.py: Should be obsoleted by Parcel's built-in server. The built-in server offers most of the development amenities you could hope for.

serve_files.py: This should not be too difficult to implement - Parcel exposes an Express middleware that lets you integrate its server with your own scripts. In this case, it may be smart to just build a single serve.ts/serve.js that supports all of the different features, with flags to enable/disable them.
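The flag handling for such a combined serve script could be as small as the sketch below. The flag names (--files, --no-files, --port=) and the default port are hypothetical, and the actual wiring into Express and Parcel's middleware is left out.

```typescript
// Hypothetical options for a combined serve script.
interface ServeOptions {
    port: number;
    files: boolean; // would enable the behaviour of today's serve_files.py
}

// Parse flags such as ["--files", "--port=9000"] into options.
function parseServeFlags(argv: string[]): ServeOptions {
    const options: ServeOptions = { port: 8000, files: false }; // assumed defaults
    for (const arg of argv) {
        if (arg === "--files") {
            options.files = true;
        } else if (arg === "--no-files") {
            options.files = false;
        } else if (arg.startsWith("--port=")) {
            const port = Number(arg.slice("--port=".length));
            if (!Number.isInteger(port) || port < 1 || port > 65535) {
                throw new Error(`Invalid port: ${arg}`);
            }
            options.port = port;
        } else {
            throw new Error(`Unknown flag: ${arg}`);
        }
    }
    return options;
}

console.log(parseServeFlags(["--files", "--port=9000"]));
```

Rejecting unknown flags outright keeps typos from silently disabling a feature.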

serve_practice.py: To be honest, I have no idea what's going on with this file, so I can't really speak for that one, but it doesn't seem to be doing anything terribly hard to reproduce at first glance.

What would need to be done?

We would probably need some new packages. At minimum:

The TypeScript bits would need to be reworked a bit. Notably:

Some adjustments to IDE configurations and continuous integration will be necessary. Otherwise, I anticipate most of this should not require too much work.

Who will do this work?

I have had difficulty contributing to Kaitai Struct in the past (particularly the compiler), as my lack of experience with Scala has made it difficult for me to do meaningful work. However, the Web IDE plays much more to my personal strengths, being primarily a TypeScript project. Therefore, I feel I can commit to doing this work.

Final Words

There are many, many, many ways to build HTML/JS apps. Bundlers like Webpack, Browserify, Rollup, Parcel Bundler. Pipeline tools like Gulp. Build systems like Grunt, Brunch. Shell scripts, Python scripts, Javascript... scripts. I've had experience with many of these approaches in the past and I think bundling comes out far ahead, with features like HMR, and ecosystem support, since bundlers offer compatibility with CommonJS, AMD, UMD, and ES Modules simultaneously. I believe bundling is the right model for compiling JS apps, and Parcel does it with the least amount of fuss that I've ever seen.

I'd like to help move this project forward, and I believe the best start to doing that is to start with the foundation and work up the stack. And to that effect, aligning Kaitai Struct WebIDE with the latest and hopefully greatest frontend best practices should make it easier to take advantage of the rich ecosystem.

I also believe removing the dependency on Python 2 would be a huge benefit. Here on NixOS, Python dependencies are a lot more of a pain to deal with than Node.

I'm also mindful that there are likely good reasons why Kaitai Struct WebIDE was designed the way it was, and that I might not understand those reasons. I am hoping that my perspective is not missing any very important bits and that a consensus can be reached on the path forward even if it is not exactly the one outlined here.

Apologies for the very large amount of text. I could probably reduce and refine a lot of these thoughts, but I want to get feedback as soon as possible, so perhaps I can do some refinements as future edits.

koczkatamas commented 4 years ago

Hello John,

I am/was the main developer of the WebIDE but I have not contributed lately.

I don't see any reason not to switch from Python to NodeJS for the build scripts (the Travis CI build scripts may need to be modified too!)

The practice files are for a more or less hidden, abandoned practice mode which gives you tasks to create an expected ksy file for a given input file and a textual structure description. It was also used by the Avatao platform: https://blog.avatao.com/Kaitai/

You can remove these practice files for now, we will recreate them if we want to support this mode in the future.

Switching to Parcel Bundler is a bigger change and can cause not-yet-seen issues with the dependencies. I say you should try the migration and see what kind of issues arise.

One of the reasons I did not use bundlers like Webpack is that, with this structure, development was really rapid: after modifying the code, the whole WebIDE could reload in ~1-2 seconds and I could see the results. Hot module reloading did not work too reliably (disclaimer: maybe I did not configure it correctly back then).

Another requirement of mine is to be able to debug the compiled, but not minified, ES6+ code (which is almost TypeScript) without sourcemaps, which were again unreliable.

If these requirements can be fulfilled with Parcel and/or the benefits are much greater than the current situation, then I am more than glad to switch to a modern bundler system.

KOLANICH commented 4 years ago

IMHO we should just drop any bundlers (and all other node.js-based tools, if we can) and use ECMAScript modules.

jchv commented 4 years ago

@koczkatamas

> The practice files are for a more or less hidden, abandoned practice mode which gives you tasks to create an expected ksy file for a given input file and a textual structure description. It was also used by the Avatao platform: https://blog.avatao.com/Kaitai/
>
> You can remove these practice files for now, we will recreate them if we want to support this mode in the future.

Thanks for the explanation.

> Switching to Parcel Bundler is a bigger change and can cause not-yet-seen issues with the dependencies. I say you should try the migration and see what kind of issues arise.

I'm also curious to see what kinds of issues will arise, especially with things like Ace, but I am hopeful at least. Many of these libraries I know for sure will bundle just fine. I will find out soon, I'm sure.

> One of the reasons I did not use bundlers like Webpack is that, with this structure, development was really rapid: after modifying the code, the whole WebIDE could reload in ~1-2 seconds and I could see the results. Hot module reloading did not work too reliably (disclaimer: maybe I did not configure it correctly back then).

Parcel HMR works without configuration in many cases. It's worth noting that true HMR will usually only work if you are using a framework that supports it (Supposedly Vue does, so if that gets adopted in v2 it should be possible) or write code to accept HMR updates directly. But ordinary live reloading (that refreshes the page) certainly works out of the box, and it is pretty fast. Parcel does do true hot reload for some other asset types as well (like CSS.)
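For the "write code to accept HMR updates directly" case, the pattern is roughly the sketch below. The HotApi interface only covers the accept and dispose calls that bundlers expose on module.hot, and enableHotReload and the fake objects are hypothetical names used so the sketch runs outside a bundler.

```typescript
// The subset of the hot-reload API used here (exposed as module.hot in a bundle).
interface HotApi {
    accept(callback?: () => void): void;
    dispose(callback: () => void): void;
}

// Anything that can clean up after itself before being replaced.
interface Disposable {
    dispose(): void;
}

// Hypothetical helper: opt a module into hot updates when the API is present.
function enableHotReload(hot: HotApi | undefined, view: Disposable): boolean {
    if (!hot) return false; // production build or plain page load
    hot.dispose(() => view.dispose()); // tear down the old instance first
    hot.accept(); // accept the updated module instead of a full reload
    return true;
}

// Fake hot API that records the dispose callbacks, for demonstration only.
const recorded: Array<() => void> = [];
const fakeHot: HotApi = {
    accept: () => {},
    dispose: (callback) => { recorded.push(callback); },
};
const state = { disposed: false };
const wired = enableHotReload(fakeHot, { dispose: () => { state.disposed = true; } });
recorded.forEach((callback) => callback()); // simulate the module being replaced
console.log(wired, state.disposed);
```

Inside the app, the first argument would be module.hot, which is undefined when hot reloading is not active.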

Parcel's server does incremental builds with caching by default. Generally, rebuilds take on the order of milliseconds rather than seconds. The full build process would probably take longer than just running tsc, but the incremental one should be similar to tsc --watch.

> Another requirement of mine is to be able to debug the compiled, but not minified, ES6+ code (which is almost TypeScript) without sourcemaps, which were again unreliable.
>
> If these requirements can be fulfilled with Parcel and/or the benefits are much greater than the current situation, then I am more than glad to switch to a modern bundler system.

Of course, I'm not sure that this is all possible to fulfill. I am, however, willing to do a bunch of throwaway work to find out.

I'm not sure what would work to satisfy the "unminified ES6" bit. I think Parcel in dev mode already doesn't minify, but no option will cause it to not bundle into a single file, somewhat obviously.

@KOLANICH

> IMHO we should just drop any bundlers (and all other node.js-based tools, if we can) and use ECMAScript modules.

Yeah, no doubt, ECMAScript modules are really the right way to go. However:

I think the closest we can come to supporting your vision would be to use JSPM to load transformed ES modules directly from NPM. Although it's a pretty cool system, in practice it's always been kind of a mess, and I think you still need SystemJS to be able to use it effectively. I would lean against it for now. It can always be moved toward in the future.

Dropping all Node.js-based tools is not going to be easily possible. Browsers don't natively support TypeScript. You COULD depend on dynamically loading the TypeScript compiler in the browser, somehow integrating that with your module loader, then load all of your modules directly from the JSPM CDN. I don't particularly like this idea, but it's possible to do.

GreyCat commented 4 years ago

As a person who spent a considerable amount of time trying to get the web IDE to build (and, ultimately, failing so far with integrating https://github.com/kaitai-io/kaitai_struct_webide/pull/84 and fixing issues like https://github.com/kaitai-io/kaitai_struct_webide/issues/88), I couldn't be happier if anybody could introduce a more standard build system than the existing 2.5 homebrew ones.

From my standpoint, I don't see much difference between webpack / vue-cli / Parcel, as long as we'll clearly have a single, standard system, not a ton of assorted shell scripts / Python scripts / etc, with no clear documentation and lots of confusing internal steps & dependencies.