Split testing infrastructure off into a separate webpack bundle

schanzer commented 3 years ago

I am virtually certain that npm run build can be configured to generate two bundles.

Right now we have a ton of stuff in spec/support that is needed for testing. It used to be part of the main cmb bundle, which caused a ton of bloat since it couldn't be tree-shaken by any language modules that listed the bundle as a dependency.

So instead, this folder is duplicated across every other cmb-language module, because I can't figure out any other way to export it without including it in the bundle itself. This is a maintenance headache, and an obvious source of likely bugs down the road.

If it could be its own bundle, such that language-module developers can link to it but then ignore it when building their own cmb bundles, it would be a really nice win.

@pcardune you may already know how to make webpack work this magic?

pcardune commented 3 years ago

I think a lot has changed in the packaging/node/js ecosystem since all this was initially written, which might make distributing the CMB package as a webpack bundle completely unnecessary. Nowadays, javascript libraries, including front-end libraries, are packaged without bundling, and any bundling that happens is left to the applications using these packages.

What does the dependency hierarchy look like? AFAICT, pyret-blocks and wescheme-blocks depend on codemirror-blocks, and they each have their own webpack bundling config. But what packages/applications depend on pyret-blocks and wescheme-blocks?

Ideally, the only place where bundling should happen is at the very end of the dependency chain, along side the actual application that's going to serve up the static bundles to the web. Can you point me to the source code for those applications?

schanzer commented 3 years ago

Pyret-blocks and WeScheme-blocks are slated to be dependencies of CPO and WeScheme, respectively. WeScheme doesn't do any proper bundling, so for now it's just manually building and then dragging the .js file into WeScheme.

I think I misspoke earlier and used the wrong terms. Let me try again:

CMB used to include all of the testing infrastructure in the exported CMB module. So when a language module built its dependencies, it could use that infrastructure when running npm run test. However, when those modules were built for inclusion into IDEs, they'd drag that same infrastructure with them.

The solution we have now is to just copy those files into each language module, so they can do the testing without the bloat. But hopefully there's a better way?

pcardune commented 3 years ago

You are using the right terms. It just gets a bit confusing because there are a bunch of different (and conflicting) concerns that bundling attempts to address, and some of those concerns have been made irrelevant in the latest versions of ECMAScript.

Let me see if I can break down these concerns as it relates to the various IDEs and blocks projects.

pyret-blocks and wescheme-blocks both need to produce a single minified .min.js file in IIFE format that can be included on a webpage with a <script> tag because the pyret and wescheme IDEs don't have any javascript bundling system of their own. The bundle should only include code used by the IDEs and nothing else (like scripts/tests/etc.). This needs to be done with a bundler like webpack.
The karma test runner used by pyret-blocks, wescheme-blocks, and codemirror-blocks needs all the code being tested, and the tests themselves, to be transformed into something that will run in a browser (i.e. without JSX tags). This needs to be done with a karma preprocessor plugin, like karma-webpack.
pyret-blocks and wescheme-blocks need to reuse the common code that lives in the codemirror-blocks repository to provide their functionality. This needs to be done through a shared npm package.
pyret-blocks, wescheme-blocks, and codemirror-blocks all have an example/ directory with a website bundled and served via webpack to facilitate development of the packages in a browser.

AFAIK, these are the only concerns that need to be supported and are being actively relied upon. But there are more concerns being handled via webpack bundling:

codemirror-blocks produces its own .min.js file that can be included on a web page with a <script> tag, but AFAIK, this feature is not actually being used by anyone.
codemirror-blocks produces a single CodeMirrorBlocks.js file that exposes all the APIs needed to create a new language module while hiding the underlying directory/file layout of the codemirror-blocks repository and all the private implementation details like internal react components.
codemirror-blocks produces an npm package that contains only pure javascript by transpiling all the jsx files. This allows it to be depended on by other npm packages that don't know anything about react or about jsx syntax.

And finally, there is one last concern which is what this issue is about:

pyret-blocks and wescheme-blocks also need to reuse common testing utilities from the codemirror-blocks repo.

The confusion is that all these concerns are mushed together under a single moniker, "bundling", when in fact there's a whole lot of different stuff going on in which bundling is the means to an end but not the actual end in itself. When you use a single tool as the means to a dozen different ends, you end up with the "if all you have is a hammer, then everything looks like a nail" problem. Though I think in this case it's more like "if all you have is a swiss army knife, everything looks like ... something that can be fixed with a swiss army knife".

Anyway, the naive solution to [8] is to generate an additional webpack bundle with the common testing utilities like you suggest. But looking at the pyret-blocks and wesheme-blocks repos, this appears to be a small symptom of a bigger problem, which is that developing and maintaining language modules for codemirror-blocks is painful and error prone in general. Solving [8] in the naive way just kicks the can down the road a tiny bit. Which is fine, but perhaps we should have a discussion about the bigger picture at some point.

schanzer commented 3 years ago

Definitely open to a larger conversation. We've always had the option of making a separate NPM package just for testing, but I've seen other modules that appear to include submodules (e.g.- propTypes, react-dnd, etc), and was hoping there might be a more elegant solution here.

I'm also not clear on whether a separate module would even address the other issues you've laid out here. Happy to have this conversation in the thread, or on the phone if that's easier.

pcardune commented 3 years ago

It's been a couple of weeks and this issue has been brewing in my mind. I'll try to make a concrete proposal for how to move forward. But first, there's one more concern that I forgot about in my previous comment, which is:

The codemirror-blocks library needs to provide some non-javascript resource files including css, images, and audio files that the upstream application will ultimately serve to the browser alongside the javascript files. Currently, webpack just embeds these files into the monolithic bundle as data urls.

My concrete proposal for how to proceed is to create a minimal example package that showcases how the codemirror-blocks library is supposed to be used. wescheme-blocks and pyret-blocks have additional non-obvious (and unnecessary?) stuff going on that prevent them from being minimal enough. Instead, I'll just extract the code in "src/languages/example" into it's own example-blocks package, and add additional code that demonstrates the meta-features of the codemirror-blocks library: testing, bundling, and customization.

The first line of the codemirror-blocks README is: "CodeMirror-Blocks is not a block editor. It's a toolkit for building block editors." I originally had it in my head that codemirror-blocks was just a library for building block editors, and a library ideally has no business including or dictating application level concerns like how files get bundled or how code gets tested. So I thought it was weird for wescheme-blocks and pyret-blocks to depend on testing and bundling tools from codemirror-blocks. But really, like it says in the README, codemirror-blocks was always meant to be a toolkit, which to me is a collection of somewhat opinionated libraries and tools that you can optionally use to build something else. For example, react is a library, while create-react-app is a toolkit.

So what are the libraries and tools that codemirror-blocks wants to provide as part of its toolkit? I think they are the following:

"codemirror-blocks/core": a library that implements blockification and a11y features
"codemirror-blocks/ui": a library that provides some basic reusable ui components with default css styling
"codemirror-bocks/testing": a library and test runner for testing language specific features in a browser (it just wraps karma/jasmine)
"codemirror-blocks/build": a tool for building minified javascript bundles and running a development server to help test out whatever language specific features are being built (it just wraps webpack and babel)

The key thing about the above collection of tools/libraries is that they are separable and optional. You can use codemirror-blocks/build to generate the final minified js file... or you can use something else. You can use codemirror-blocks/testing to test your language specific features, or you can use something else. A specific language package like pyret-blocks or wescheme-blocks could choose to use all the components of the toolkit and not have to worry about configuring karma, babel, webpack, jasmine, etc. themselves. In fact, in this setup, they wouldn't even need to list karma, babel, webpack and the myriad other tooling libraries as dependencies because they would just be using them through codemirror-blocks. Meanwhile, other projects that already have their own bundling and testing infrastructure could still use the code from "codemirror-blocks/core" and just ignore the rest.

Anyway, to make a long story short, I'm going to create an example-blocks package, probably in a separate repository for the moment, that shows how one would create a new language module, including tests, bundling, and a dev server, all without copy/pasting any code or configuration from codemirror-blocks. At the same time, I'm going to create a new branch in this codemirror-blocks repo that will expose all this tooling as part of the codemirror-blocks package. This latter part might require some less trivial changes. If I manage to get all this working, and it looks like it will be easier to maintain than the current setup, then I can try updating wescheme-blocks and pyret-blocks to reuse all the common toolkit stuff.

What do you think?

schanzer commented 3 years ago

Whoa. This is really cool. And now that you mention it, I can see an immediate use-case for core without ui that I'd been overlooking before!

And can this refactoring actually be done in the same repo? Or does this mean we'd need to split CMB into 4 separate modules?

pcardune commented 3 years ago

There are three choices for how to split this things up:

separate repos and separate packages - this is a good option if development of each module happens in isolation with different maintainers, different release schedules, etc. but can become painful if the repos are more tightly coupled and are always developed in tandem with each other. The next version of codemirror is being developed like this and they have a bunch of custom tooling to keep repos in sync with each other.
one repo and separate packages - this is the "monorepo" method, which has been growing in popularity for a while. This is how babel maintains its numerous packages, and is good when you have a large number of packages that are all maintained, developed, and released together and by the same people, but where the users of the package(s) want to pick and choose exactly what packages they are going to use. Great for a plugin system!
one repo and one package - this is the monolithic library/framework method, which is the simplest of all three to maintain, develop, and release so long as the project stays relatively small. As a project gets bigger, this can become untenable, with git clone, npm install, and by extension, all CI builds, taking forever just so you can do something with a small fraction of the code. But we're a long ways away from that.

My plan is to go with option 3. One important thing to note is that just because all the code will still live in one package and one repo doesn't mean that all the code will be bundled into one monolithic javascript file that browsers are forced to load. Developers will have to download all the code to do development work, but the final application users will just download a small bundle without all the dev/test tools baked in.

schanzer commented 3 years ago

Back in the day, option 3 is exactly what we had. But my knowledge of webpack/npm is pretty limited, so the testing infrastructure wound up getting bundled into the final output of each language module. I had assumed that the only paths forward were 1 and 2, but 1 seems horrendously complex and I had no idea how to make 2 work.

I agree that Option 3 makes the most sense by far, if you can do it without the bundle bloating!

pcardune commented 3 years ago

Quick question: are the pyret-blocks and wescheme-blocks language modules in separate repos/packages for any particular reason? I can think of plenty of reasons to have them be separate, but I'm curious what the original motivation was.

schanzer commented 3 years ago

Once we started adding support for Pyret, it became a problem that the (still in development) Pyret tests would fail and cause Travis to start firing alarm bells. This problem would only get worse as more languages get added, so it made sense to split them off so language developers could focus only on THEIR languages, without worrying about breaking the build for anyone else.

(CM does this as well)

pcardune commented 3 years ago

Thought I'd post an update of where I am on this issue.

I've been making changes in the pcardune-toolkit branch and so far have the following features:

reusable karma test configuration
reusable development server with webpack
reusable bundling configuration
non-bundled, transpiled, lib directory for selective imports

I've also done some refactoring of wescheme-blocks repo to use these features. I don't have write access to that repo, so I've put the changes up in a fork here: https://github.com/pcardune/wescheme-blocks/tree/pcardune-toolkit

I also have a paired down example-blocks repo showcasing these features.

The remaining todo items before this can get merged into master are the following:

[ ] extract shared test-utils code into the toolkit
[ ] refactor the pyret-blocks repo to use this new stuff
[ ] verify that the bundles from pyret-blocks and wescheme-blocks still work in their intended environments and aren't including anything unnecessary

For that last todo I need to get a local dev environment setup for the IDEs that use pyret-blocks and wescheme-blocks. I got the wescheme.org code from https://github.com/bootstrapworld/wescheme running locally, but I'm not sure where the option to turn on the block editor lives? I tried the blocks-test branch, but got errors when trying to load the editor page: Uncaught ReferenceError: receiverE is not defined. Do you have any suggestions @schanzer on how to proceed here?

schanzer commented 3 years ago

You'll want the blocks-test branch, which requires only a CodeMirrorBlocks-min.js file and the corresponding CSS file to be dropped in from the wescheme-blocks dist/ directory.

I also gave you write access to both language modules, so you should be able to merge your changes in when they're ready.

pcardune commented 3 years ago

@schanzer The webpack config for generating the css file is commented out in wescheme-blocks. Does that mean it's not being used anymore and should be deleted?

schanzer commented 3 years ago

Ah yes - sorry, there was a change in the way webpack handles CSS stripping that caused the old extract-css plugin to stop working. I took a few runs at restoring the behavior and couldn't figure it out, and then read a few articles that made a plausible argument for keeping the CSS bundled into the JS. I think I would still like to have the CSS be a separate file, but between the headache and the arguments against it...I just gave in.

bootstrapworld / codemirror-blocks

Split testing infrastructure off into a separate webpack bundle #365