Open blahah opened 7 years ago
Just a suggestion, but I think the most important thing to break out might be placing a custom Lens build under the sciencefair GitHub org or on npm. I was going to take a crack at #61, but that basically means hacking a compiled file - Lens Starter makes it way easier to add panels and converters, which will be necessary as ScienceFair grows in scope.
I'd be happy to put together a proof-of-concept if you're interested.
@CAYdenberg agreed! I actually did this before, but some bugs became easier to fix in the compiled file vs recompiling and re-releasing all the time. It's not sustainable though, as we'll need to be able to incorporate new converters for new XML sources. Could you create a new issue where we can discuss? A proof of concept for the lens build system would be very welcome :)
Hi,
I'm concerned about how this change will affect development. Is it the intention that the modules will be published on npm and required into the application? If this is the case, it will make modifying the modules more difficult, because one would likely edit the package in `node_modules` to run the application with changes. Given that you were describing atomic modules, I imagine that this may be a common situation.

To contribute changes, developers may end up pulling multiple repositories from git into the same folder, i.e. all at equal depth with the sciencefair folder -- I think that's the approach I would take. From there, would it be advisable for the developer to symlink these application module folders with their counterparts in the sciencefair `node_modules`?
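A minimal sketch of that symlink approach (the paths and the module name `some-module` are hypothetical, chosen only for illustration):

```shell
# Sketch of the sibling-checkout symlink workflow: repositories sit side
# by side, and the app's node_modules entry is replaced with a symlink
# to the sibling working copy.
mkdir -p /tmp/code/sciencefair/node_modules /tmp/code/some-module
ln -sfn /tmp/code/some-module /tmp/code/sciencefair/node_modules/some-module

# The app now resolves some-module from the sibling checkout:
readlink /tmp/code/sciencefair/node_modules/some-module
# → /tmp/code/some-module
```

Any edits in the sibling checkout are then immediately visible to the app, without reinstalling.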
I can't help but wonder if you've considered a mono repo. ckeditor5 is an example of this type of architecture and they share their development workflow. I feel this is the direction you may be wanting to head. lerna and mgit2 are appropriate tools.
I do not think the substack post is entirely relevant to this issue, because I think they were writing about publishing utility modules to npm versus feature modules as in this issue -- ScienceFair specific classes (e.g. Paper and Datasource)
> As more modules are published to npm I expect I won't need to write so many modules but there will always be room for new stuff. (line 64)
I think either approach has its drawbacks and would benefit from documentation -- a developer guide?
@slmyers thanks for raising this.
> Is it the intention that the modules will be published on npm and required into the application?

Yes
> I can't help but wonder if you've considered a mono repo.
I have thought about it (a lot!) - and it's useful to have this opportunity to record the reasoning.
My long experience of both approaches (hypermodular vs monorepo) across many projects is that monorepos are a nightmare to work with at even the scale of 10 or so packages, and very difficult to contribute small changes to from outside the project. Tools like lerna for managing them are flaky and have poor error handling (e.g. https://github.com/lerna/lerna/issues/524#issuecomment-299264115, not linked because it's not productive for them to be notified to read this) - there is essentially no good tooling for such systems (that I have found), and they waste a lot of time on admin. They also give the appearance of a controlled ecosystem, which is philosophically the opposite of what we want to achieve.
The small modules approach is much better tooled and easier to handle in development imo - and when we get to that stage I will document my own preferred workflow, but there are many. One simple flow is already built into npm and yarn - to work on a package you do:

```shell
cd /path/to/small-module
npm link
cd /path/to/app-dir
npm link small-module
```

Then you can work in `/path/to/small-module` and see the changes reflected in the app. When you are done, you do `rm -rf node_modules/small-module && npm install` in the app repo to return to the published module dependency. See atom for an example of a hugely successful project that follows this development workflow.
It's also much easier to use forks of parts of the system this way, or replace them altogether. With an isolated module (e.g. `sciencefair-land/sciencefair-paper`) you can just fork that repo, then `npm install your-name/sciencefair-paper`, and you can depend on your own fork using github. Doing the same with a monorepo is not currently possible afaik, especially because monorepo tooling often templates out parts of the `package.json`.
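For illustration, depending on such a fork would look something like this in the app's `package.json` (the `your-name` fork is hypothetical; npm treats a bare `user/repo` specifier as GitHub shorthand):

```json
{
  "dependencies": {
    "sciencefair-paper": "your-name/sciencefair-paper"
  }
}
```

With that in place, `npm install` pulls the fork from GitHub instead of the registry version.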
I think the substack post still applies here. As pointed out in the why modularise? part of the issue, classes like `Paper` and `Datasource` will be used across many tools - not just ScienceFair the app but also tools for producing and managing datasources, web tools, and developer utilities. They are also intended to be swappable by v2. Even if they weren't, having them isolated makes contributing easier (based on my experience and feedback from contributors on other projects), and makes isolated bug fixing and testing faster to do and easier to reason about.
> would benefit from documentation -- developer guide?
I agree with this part completely. Good developer documentation, when we get to that stage, will be a priority. Simple developer tooling will also help. For example, switching a dependency for a local directory and back again should be one command each (scripts wrapping the series of npm or yarn commands). We should start planning these things soon.
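As a rough sketch of what such one-command wrappers could look like (function names and paths are hypothetical; `npm link` is essentially a pair of symlinks, so the local swap can be done directly):

```shell
# Hypothetical helpers for swapping a dependency to a local checkout and
# back. A real version would likely shell out to npm/yarn link instead.
use_local() {
  app_dir=$1; module_dir=$2; name=$(basename "$module_dir")
  rm -rf "$app_dir/node_modules/$name"
  ln -s "$module_dir" "$app_dir/node_modules/$name"
}

use_published() {
  app_dir=$1; name=$2
  rm -rf "$app_dir/node_modules/$name"
  (cd "$app_dir" && npm install)   # restores the published version
}

# Example: point the app at a local checkout of small-module
mkdir -p /tmp/app/node_modules /tmp/small-module
use_local /tmp/app /tmp/small-module
```

Each direction is then a single command for the contributor, rather than a sequence to remember.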
Another trick with a lot of modules is to do development in a folder named `node_modules`, e.g. you can set up paths like this:

```
~/code/node_modules/science-fair
~/code/node_modules/small-module
```

This allows you to use `small-module` from `science-fair` without `npm link`! npm looks up the tree for any `node_modules` folder and takes the first module it finds.
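A quick way to see this upward lookup in action (assuming `node` is installed; the names and `/tmp` paths are illustrative):

```shell
# small-module lives in the parent node_modules folder, not inside
# science-fair itself; node's resolution walks up and finds it anyway.
mkdir -p /tmp/demo/node_modules/science-fair /tmp/demo/node_modules/small-module
printf '%s' '{ "name": "small-module", "main": "index.js" }' \
  > /tmp/demo/node_modules/small-module/package.json
printf '%s' "module.exports = 'found via parent node_modules'" \
  > /tmp/demo/node_modules/small-module/index.js

cd /tmp/demo/node_modules/science-fair
node -e "console.log(require('small-module'))"
```

Strictly it is node's `require` resolution (walking each ancestor directory for a `node_modules` folder) that makes this work, which is why no linking step is needed.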
We wrote about this workflow a bit in the Dat contributing guide, but let me know if we need to clarify anything!
@joehand omg, I've only ever had a `node_modules` in a parent folder be a source of bugs because it was there accidentally. Can't believe I never thought of this, great shout! Thanks :)
Also that's a good point - I think there will be considerable overlap in the dat and sciencefair contributing materials so we should look into a shared resource where possible (cc @maxogden)
@blahah To echo @joehand, a manually managed `node_modules` folder placed in a subdirectory is an effective solution. We use it extensively in stdlib with support for module decomposition to great success, thus allowing us to enjoy the benefits of both a monorepo and hypermodularity.
@kgryte interesting - this is actually the inverse of what @joehand was suggesting I think - in `stdlib` the `node_modules` is a subdirectory, while in `dat` development it's a parent directory:

**dat**

```
node_modules/
├── dat
│   └── node_modules
│       └── module_3
├── module_1
└── module_2
```

**stdlib**

```
stdlib/
└── lib
    └── node_modules
        ├── module_1
        ├── module_2
        └── module_3
```

Is this interpretation correct? If so, I'm not sure I understand how this is different from a monorepo (for example, I can't `npm install` a fork of a specific module from `stdlib` without publishing it on npm).
I am not clear on the `dat` approach. I have read and reread the contributing guide, and I can read it both ways.

What I can say for `stdlib` is that you can install a fork. The way we have set up the project is that we are able to decompose (using custom tooling) the entire project into individual packages (resolving all dependencies in a manner similar to `browserify`) and push each package to its own separate GitHub repository. From each separate repository, we are able to publish a package independently of the rest of the project. In which case, consumers can effectively build their own "stdlib" from the individual components. This is what I meant by having both worlds: we develop in a monorepo, allowing centralized development, while publishing as individual packages and repositories, allowing people to fork, clone, and combine individual components as needed.
And now rereading this thread, the one use case not supported by the `stdlib` approach is where you want to fork and then recombine in the same project. Meaning, I cannot (easily) modify a local copy of `stdlib` to use a forked version of an internal project package. However, in this case, I would say to simply create a new branch of the main repo containing the modified code.

Note: decomposing the project into separate repos is not live. We have proven the concept, but not flipped the switch, as our namespace is still in flux.
@kgryte thanks, it was the separate repos part that was missing from my understanding, which makes sense if it's not live! Will you achieve it using git submodules?
p.s. @joehand perhaps the dat guide needs some clarification if it can be read both ways? 👆
@blahah For `stdlib`, separate repositories are for consumption, not development. Meaning, we publish separate repositories so consumers can fork and modify individual aspects of the project without needing to set up the entire development environment and pull down the entire codebase. If you want to contribute to `stdlib`, you need to contribute to the monorepo.

From the project's standpoint, publishing a repository is similar to publishing a package to npm: a repository is an end product, not a development feature.
@kgryte understood, but I am trying to understand how you will incorporate the repositories for each module into the main repo? Or is it that you do all development on the code inside the monorepo, then have a script that takes each module in `lib` and pushes it to its own repo as well?
@blahah Do all development inside monorepo, and we have a script which builds the repository for each individual package.
@kgryte thanks for explaining :)
So, I think the long and short of all this is that: (`node`, `electron` and `npm` behaviour)

@blahah Yeah, sorry if I muddied the water to begin with. What you outline seems reasonable, and I will be curious to see how things progress.
Not at all @kgryte - it's non-trivial, so having a back-and-forth to get the detail is useful. I have never seen a project, afaik, that has forkable modules as part of a monorepo, so it's very cool to have the stdlib example explained.
mono repos
@slmyers @blahah on mono repos: in https://github.com/datproject/dat/issues/824#issuecomment-315740501 I presented vert.x. Have a look at vert-x3: one GitHub org containing all official modules. I would argue vert.x has all the benefits of the mono repo, without most of the hassle.
modularisation approach
I find you are taking too technical a mindset wrt modularisation, as @slmyers rightfully states:
> I do not think the substack post is entirely relevant to this issue, because I think they were writing about publishing utility modules to npm versus feature modules as in this issue -- ScienceFair specific classes (e.g. Paper and Datasource)
True, you will elicit modules according to technical + application lines. But don't lose sight of your domain.
For example, you could have modules for actors (e.g. Scientist, Publisher), entity types (e.g. Paper) or processes (e.g. Peer Review), etc.
In fact, I would start with the domain, create some domain models and process flows, figure out a semantically meaningful and intuitive break-up into (domain) modules, gauge the architecture impact, and only then finally determine the technical module categories you have.
The codebase for the v1 release of ScienceFair was monolithic. The ecosystem of tools planned for the next few releases requires a much more modular approach.
There's a project for managing this process. Issues related to modularisation should be added to the project and tracked through the process there.
There's also a new organisation, sciencefair-land, under which we can collect modules we want to maintain.
why modularise?
There are many good reasons in general for writing small modules (see substack and mafintosh's excellent explanations).
Specifically in the case of ScienceFair, we are building not just an app, but an ecosystem of tools. For example, we will have at least:

- (`v1.1`)
- (`v2`)

The most efficient way to build and maintain this ecosystem will be to abstract out units of shared logic, configuration, assets or data into standalone modules. These can then be reasoned about, tested and maintained in their own scope, and used across any number of tools.

When we start building the package system for `v2`, having all the parts of the app as atomic as possible will greatly ease the transition to being customisable with packages.

process
We have at least these kinds of modularisable units (i.e. things that should be separate npm packages) in the project:

- `sciencefair-paper` and maintained under the @sciencefair-land org
- `choo`-scoped names, e.g. `choo-online`, and can be owned by anyone
- `choo` and `sciencefair` scoped - e.g. `choo-sciencefair-paper-model`
- `choo` views which can be broken out into `nanocomponent`s - see #47
- `choo` and `sciencefair` scoped - e.g. `choo-sciencefair-paper-view`
- tasks