Closed s1na closed 4 years ago
I have over time shifted to a position where I think it makes sense to create some well-selected monorepo clusters to combine some repositories where there is lot's of interoperability and need for integration tests.
Cluster Candidates might be:
ethereumjs-vm
)Maybe in the future also some separation of the whole VM and just the execution engine.
merkle-patricia-tree
?)Thought about proposing a monorepo for the VM for some time now with the state manager and account repos, but maybe this makes more sense to cluster with the merkle tree lib like:
This would be more to have this in one place, not completely sure if the combination makes sense though:
Not sure, what is your general stance on this? And does these clusters make sense? Should very much be some longer discussion, would like to hears as many opinions as possible on this.
//cc @vpulim @mattdean-digicatapult @whymarrh @axic @alextsg @danjm @krzkaczor
I have the following four quick comments:
require('ethereumjs-wallet/thirdparty')
.Sorry for the slow reply @holgerd77.
I'm honestly not convinced by the mono-repo argument in this case. To me the best reason for a mono-repo structure is when you have modules that you might want to make separate modules at some point, but structurally you don't want to commit to a specific structure of modules early on in development.
I'm not sure I understand the testing argument here. There's nothing to stop us adding integration tests for some of the modules which have many dependencies and where coupling is maybe tighter than ideal.
On the specific ideas you've outlined
@mattdean-digicatapult Thanks for the comments.
I'd first like to point out that testing is not the only argument, there are some other in #27, and various others in the wild [1, 2].
Regarding testing: yes, it's possible to write integration tests for a library which tests it against its dependencies. What I was referring to was the reverse situation. E.g. a PR modifies merkle-patricia-tree
and all tests pass, but when released, it's found out that VM breaks when using the newest release. My argument is we need a method to test the dependents (not dependencies) of a library after modifications.
The goal with a mono-repo would not be to have a monolith library, but rather to break it into smaller packages (each published separately on npm), but they reside in the same git repository (E.g. truffle). I agree with you completely that VM needs a re-org, I just think the reorg doesn't need to introduce even more git repos, which complicates development, testing, cross-project changes, etc.
I've been a big fan of monorepos since I spent some time at Google, and convinced multiple projects I worked on to migrate to one, and the result has always been positive.
None of those migrations happened without some resistance, as it initially looks like a TERRIBLY BAD idea. The notion that a repo represents a single project/package is rooted very deeply in the software development folklore, and even some products (e.g. github) are somewhat developed around that.
A repository is just a tool for share files and coordinate changes to them. A project/package is part of the semantics of those files, so you are free to set them up as you want. There's no need to merge all the packages into a single one to have them in the same repo.
With that being clear, one should analyze the pros and cons of using a monorepo in the context of ethereumjs. These are the once I see:
Advantages:
Disadvantages:
Finally, I recently migrated Buidler to a monorepo with lerna, and I'm happy to share my experience doing that.
Hi @alcuadrado thanks for this additional write-up, for the moment we are relatively settled with our current setup. We had this monorepo discussion for quite some time now so please bear with me if I don't pick this up again for now.
We recently introduced the ethereumjs-config repo as some result of the discussion process. We will also likely use targeted monorepo approaches, a main one being the split up of the current VM modules (vm, evm, bloom, statemanager) into different packages (and run it as a lerna repo). This might be part of the v4 release process, but this is not decided yet.
We had this monorepo discussion for quite some time now so please bear with me if I don't pick this up again for now.
No problem. This is not a minor decision, and the migration would be a whole lot of work. Just wanted to share my opinion in case it could be useful.
We recently introduced the ethereumjs-config repo as some result of the discussion process.
I didn't know why nor when ethereumjs-config
was added. This makes sense.
Main monorepo transition has been decided upon and executed along https://github.com/ethereumjs/ethereumjs-vm/issues/561 with the creation of the VM monorepo. If other demands for one or more additional monorepo integrations occur it will likely be more productive and targeted to discuss along a newly created issue. I also regard my suggestions from the comment above and outdated to a greater extend (may still serve as some thought inspiration though).
Will close.
Due to the tight coupling of libraries, modifications in one could surface errors in other dependent libraries. Currently errors are found only after the library in question is released and other libraries update their dependencies. I think it'd help a lot if integration errors would surface during development and they'd become easier to find/fix.
Maintaining a monorepo of some of the highly interconnected libraries (VM, blockchain, block, etc.), as it is being discussed in https://github.com/ethereumjs/organization/issues/27, would enable running tests across libraries during development and would also facilitate cross-library changes.
Another approach would be to write a more complex circleci workflow which fetches all relevant libraries and tests them against each PR. Not sure how such a workflow would look like.
I'd be happy to hear your thoughts on these two approaches, or other potential ones.