The Case for Continuous Releases and a Monorepo

fkleedorfer commented 1 year ago

Taking up the discussion from https://github.com/IFCjs/web-ifc/issues/294

Not long ago, a new version of web-ifc-viewer was published. Thanks for doing that!!

However, it took two months to bump the web-ifc dependency, which brings many fixes we have been waiting for, and at least some new bugs that we are discovering only now that we can try to upgrade.

Bumping our IFC.js dependencies might be harder for us than for most as we use it in different situations (with and without GUI, and we make modifications to the model and write it). However, I suppose it is hard for everyone because so much has changed.

I know you don't owe us anything and we are glad about the work you do. However, I believe the way the projects, builds and releases are set up constitutes a tight bottleneck, preventing the project from scaling the way it could, given developer interest. I am responding to the points made in #294. Let's see if they convince you. If they do, we would be more than happy to help with an alternative setup.

First off: The problems we face now:

rare releases with massive amounts of changes, making upgrading hard (and maybe not an option at all). Virtually no documentation of how to upgrade.
our workarounds for IFC.js bugs hanging around far too long
very long feedback cycle through the IFC.js user community, at least for some code (web-ifc), because of rare and staggered releases, no matter how trivial the bug. Also, due to this turnaround time, bug reports for web-ifc that could have been made months ago are only starting to trickle in now
dependency conflicts between different IFC.js projects (web-ifc-three/web-ifc-viewer)
a great difference between released code that is discussed on discord and the current state of the repos, hence the discussion gets lost between as-is and will-be.
a build process so difficult and differing across repos that few people seem to care to try

I believe all of this is due to the main projects being maintained and built independently of each other, even though they are quite tightly coupled. Nothing forces changes to a dependency to be integrated immediately, so at long intervals a huge release gets thrown over the fence, and the downstream devs then have to chew on it for quite some time. Maybe there is also some Second System Syndrome going on with respect to IFC.js/components, taking up development time. Everyone has great hopes for this... but is it ready for prime time, or will it ever be?

As stated in #294, I would suggest migrating at least web-ifc, web-ifc-three, and web-ifc-viewer, and probably also the components project, into a monorepo and setting up continuous releases through GitHub Actions.

As to the criticism you formulated in #294 with respect to this approach:

@agviegas said

The complexity of compiling web-ifc. The only way to compile web-ifc is using emscripten or docker, and there are a lot of users who don't know how to use either technology. This means that forcing them to download web-ifc would force us to put together a piecemeal compilation that would make everything much more complex.

I believe it should be possible to set up a lerna project in such a way that you only build what you change, and pull the rest from the nightly builds repo. That would actually make it easier, not harder, to build one's own modules, for example, to fix a bug or add a feature.
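To make the idea concrete, here is a rough sketch of the decision such a setup would have to make. This is not lerna's API: the function name, the plan shape, and the dependency graph between the three packages are all assumptions made up for illustration. Given the packages a contributor has changed, it lists what to build locally and which unchanged dependencies to fetch as prebuilt artifacts instead.

```typescript
// Package name -> names of the in-repo packages it depends on.
type DependencyGraph = Record<string, string[]>;

interface LocalBuildPlan {
  buildLocally: string[]; // packages the contributor changed
  usePrebuilt: string[];  // unchanged dependencies, taken from published builds
}

// Assumed dependency relations, for illustration only.
const exampleGraph: DependencyGraph = {
  "web-ifc": [],
  "web-ifc-three": ["web-ifc"],
  "web-ifc-viewer": ["web-ifc", "web-ifc-three"],
};

// Hypothetical helper: build only what changed, pull its dependencies prebuilt.
function planLocalBuild(graph: DependencyGraph, changed: string[]): LocalBuildPlan {
  const changedSet = new Set(changed);
  const needed = new Set<string>();
  const stack = [...changed];
  while (stack.length > 0) {
    const name = stack.pop() as string;
    const deps = graph[name];
    if (deps === undefined) {
      throw new Error(`unknown package: ${name}`);
    }
    for (const dep of deps) {
      if (!needed.has(dep)) {
        needed.add(dep); // remember every transitive dependency once
        stack.push(dep);
      }
    }
  }
  return {
    buildLocally: Array.from(changedSet).sort(),
    usePrebuilt: Array.from(needed).filter((n) => !changedSet.has(n)).sort(),
  };
}

const plan = planLocalBuild(exampleGraph, ["web-ifc-three"]);
// plan.buildLocally is ["web-ifc-three"], plan.usePrebuilt is ["web-ifc"]
```

With a plan like this, someone fixing a bug in web-ifc-three would never need emscripten or docker, because web-ifc only shows up in the prebuilt list.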

I also believe that building web-ifc can be done on GitHub through an action. I would suggest dissecting web-ifc even more, such that most of it can be built easily, and the hard part only needs to be touched by the developers who work on or tightly with the WebAssembly part.

Also, to the point of complexity: you've already got a massively complex system on your hands - this is precisely the problem. What I believe is needed is to refactor it so it consists of simpler building blocks that are easier to keep in sync with each other.

Code size. A lot of people just want web-ifc, and forcing them to download everything we've done seems illogical, especially assuming that this year the code size is going to multiply, with new features and tools.

This concerns only contributors. Someone who wants to work on web-ifc, with all its build complexity, will not be scared away by having to pull in other modules. And maybe it would be better for these developers, too, as they might inadvertently break dependent modules with their changes and would find out about it while they work on their PR.

Speed of code change. Although the releases may seem few, the amount of changes that occur are enormous. The fundamental reason is that IFC.js is not trying to solve a bounded problem, so once the code is "finished", it only needs to be maintained and small bugs need to be fixed. There is a continuous research, improvement and input of new ideas to replace old ones. For example, look at web-ifc-three and web-ifc-viewer: these are technologies that work well for IFCs up to 500mb on PCs, but what about larger files or mobile devices? We are now developing BIM tiles, a new technology that is able to open them, but this is different than web-ifc-three and web-ifc-viewer, and is likely to eventually replace it, which would create big deprecation problems in that merged repository. The same would happen if tomorrow we discover another geometry api that allows us to model in 3D.

When I think about the speed of code change, a monorepo, in my mind, makes even more sense: if you want to replace a big portion of the code, it is much easier to do so in a controlled manner, as you can immediately see how your changes break other modules. You can release anytime (including consistent nightly builds, or even after each PR merge to any of the modules), so you can shorten the feedback cycle with the community. Also, such a short release cycle will provide an additional incentive for developers to fork and fix bugs themselves. Bounties are great, but many just want the stuff to work, and they don't see the point of contributing if they have to wait for months to see if their PR actually fixes their problem.

@beachtom said

There are different people working on each repo - and they can be used in isolation. i.e. if you want to do command line no processing you can just use this repo. I am happy to be convinced - but to me the problems of going mono-repo do outweigh the advantages - at least to me at the moment

I think that this is also an argument for, not against, a monorepo. Especially if different teams work on different modules that depend on each other, seeing each other's changes and integrating them as they are merged instead of as they are released will help avoid duplication and divergence, and help find bugs early.

Lastly, I think it would be strategically important to make this change: As far as I can tell, IFC.js has won the winner-take-all open source competition for an IFC library in JS. Everyone is rooting for you. A lot of people are ready to chime in. You want to scale. Your bounty program is proof of that. However, your project setup is a serious obstacle to the growth and scaling that would otherwise take place. If you open up your build system to allow for continuous releases, you won't regret it. Let's do this!

fkleedorfer commented 1 year ago

Next steps?

In order to make the suggestions in #294, back in January, I made a prototype of the monorepo. It needed more work at the time - maybe a few days. When done, one could try it and weigh pros and cons of an actual system instead of each person's mental model of it.

It seems like a waste of time to finish the prototype unless the core team would at least consider a switch if it works well. However, if the core team is willing to consider it, I'll be more than happy to do the work, and even more delighted if others wanted to contribute to this effort.

Making the monorepo requires some changes to the existing sources (mostly to the build system, but also to package names to avoid clashes), so we would have to start from scratch with the current sources.

There will never be a good time to do this - which means the best time to do it is always: now.

beachtom commented 1 year ago

I do see the points about a mono repo - I guess an advantage is to provide a CI that allows us to check a web-ifc change does not affect web-ifc-three.

I guess the main concern from my perspective would be that the web-ifc elements have a much more demanding toolchain to build. We'd need to allow people to develop on web-ifc-three without necessarily needing to build web-ifc

fkleedorfer commented 1 year ago

I very much agree with this. I am not positive that lerna can be set up for partial builds of the monorepo, but I strongly suspect it can: I hear it is being used for repos with hundreds of packages at places like Microsoft, so that would be an important feature.

It would mean, though, that you might not be willing/able to build a module dependent on the one you are currently modifying, and thus you would not see that your change breaks it. A GitHub Action would need to be set up to build the whole monorepo for each PR, and fail if your change breaks a dependent module. That way it would take longer for you as the developer to notice, but it should work.
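As a minimal sketch of what that check would do on each PR (again not lerna or GitHub Actions code: the graph, the function names and the `build` callback are hypothetical stand-ins for the real build steps), the job would build every package after its dependencies and report the first one that breaks:

```typescript
// Package name -> names of the in-repo packages it depends on.
type PackageGraph = Record<string, string[]>;

// Assumed dependency relations, for illustration only.
const ciGraph: PackageGraph = {
  "web-ifc": [],
  "web-ifc-three": ["web-ifc"],
  "web-ifc-viewer": ["web-ifc", "web-ifc-three"],
};

// Order packages so that each one comes after all of its dependencies.
function buildOrder(graph: PackageGraph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();
  const visit = (name: string): void => {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") {
      throw new Error(`dependency cycle at ${name}`);
    }
    const deps = graph[name];
    if (deps === undefined) {
      throw new Error(`unknown package: ${name}`);
    }
    state.set(name, "visiting");
    for (const dep of deps) visit(dep);
    state.set(name, "done");
    order.push(name); // all dependencies are already in the list
  };
  for (const name of Object.keys(graph).sort()) visit(name);
  return order;
}

// Build the whole repo in order; return the first package whose build
// fails, or null if everything builds. `build` is a hypothetical callback.
function firstBrokenPackage(
  graph: PackageGraph,
  build: (name: string) => boolean,
): string | null {
  for (const name of buildOrder(graph)) {
    if (!build(name)) return name;
  }
  return null;
}
```

The CI job would fail the PR whenever `firstBrokenPackage` returns a name, which is how a web-ifc change that breaks web-ifc-viewer would surface even if the contributor never built the viewer locally.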

beachtom commented 1 year ago

I believe there was a discussion on discord about this - so I am going to close it for now

fkleedorfer commented 1 year ago

I believe there was a discussion on discord about this - so I am going to close it for now

True, there was. The gist is: you got this, it's no problem. Things are going to work out fine, with monthly releases and probably more frequent bugfix releases. I am happy to hear that! I understand that the help we offered will not be required, which is a relief, too.

fkleedorfer commented 1 year ago

For future reference, here is an explanation (that I haven't tried yet) on how to set up lerna such that hard-to-build packages need not be built locally in a monorepo: https://github.com/lerna/lerna/discussions/3666#discussioncomment-5797174

ThatOpen / engine_web-ifc

The Case for Continuous Releases and a Monorepo #385