Supporting multiple board/build variants reusing the same configuration (board dir)

ev3dev / brickstrap

Tool for bootstrapping Debian and creating bootable image files for embedded systems

MIT License


Supporting multiple board/build variants reusing the same configuration (board dir) #17

Closed cmacq2 closed 8 years ago

cmacq2 commented 8 years ago

Splitting off the conversation from: PR #16

So, I like where you are going, but I think we can do better than the environment variables that you have proposed here. What if we made it so that a board configuration directory had more than one config file for different "flavors"? The config files opt in to the package groups and hooks that are needed for that flavor. I think we would need to add an additional set of hooks that are run before the configure-packages to account for small differences in the root files. With this, to make a one-off, now you only have to copy a config file instead of the whole board configuration directory. What do you think?

How does the opt-in mechanism work, exactly?

I think the risk here is that it quickly becomes overly-complicated to use by trying too hard for a 'neat' suggestion. And hard to implement besides: selectively opting in requires somewhat interactive two-way communication between configuration & brickstrap itself (for which there's currently no support other than fiddling with internal brickstrap state which is a dangerous proposition and should be discouraged).

Fortunately I think we can do something conceptually simple that is sufficiently flexible and doesn't require as much from brickstrap.sh itself. We do this using a convention-over-configuration approach, but before I get to that, briefly an excursion:

Conceptually there are always 4 categories or types of configuration:

1. General, generic rootfs configuration (generic package selection, hooks, rootfs skeleton)
2. Arch specific rootfs configuration (like kernel version, bootloader).
3. Board specific configuration (like custom DTBs, firmware)
4. Variant builds. I.e. "minimal", to "full fat" builds of what is otherwise the same in layers 1-3. So essentially just extending/blacklisting packages.

Now it's not important specifically where the boundaries of each category are, just that intuitively once you start sharing a board directory for different device & arch combinations it becomes apparent to the user what the most optimal partitioning of their configuration files would be. The user knows very well what is just a simple "variant" extension, what is board specific, what is arch specific, etc. to their build set up: precisely because they put it all together themselves!

Note that we should not require the user to take full advantage or to obsess over getting their partitioning right: they can simply opt-in to the scheme and use it as and when convenient.

So instead of trying to mediate between configuration, we offer a declarative convention and rely on the user to set up their partitioning right: we introduce the concept of overriding paths, and assign prefixes to the 4 types (which can be stacked as follows, bottom-to-top of the stack):

The generic re-usable configuration goes in $BOARDDIR i.e. it lives in the board directory directly. This is to preserve backwards compatibility with existing setups as much as possible.
The arch specific bits go in $BOARDDIR/arch/$arch/, where $arch is the right arch. Essentially the arch is stacked on top of the generic config. Note that $arch may or may not correspond to a valid architecture for Debian archives, if it doesn't the user simply has to set the right ARCH variable in their config file (e.g. $BOARDDIR/arch/something-weird-I-did/config). Also note that an empty $arch is valid, basically in that case there is no arch specific layer to the user's configuration.
The board specific bits are stacked on top of this using $BOARD_PREFIX/board/$board. In this case $BOARD_PREFIX is either $BOARDDIR/arch/$arch or if $arch is empty then it is just $BOARDDIR. Again, $board is an arbitrary value like $arch, and again it may be empty (in which case there is no board specific layer).
Similarly the variant specific bits are stacked on top of that using $VARIANT_PREFIX/variant/$variant where $VARIANT_PREFIX takes the board also into account. Again $variant is arbitrary and may be empty.

That is: the top of the stack is the variant directory, and the bottom is the generic configuration directory. This convention permits the user to selectively override files simply by placing an alternative in something higher up the stack. This is generally a simple matter of copy-paste & then modify the contents of the overriding file.

Whenever brickstrap.sh then selects a configuration file all it has to do is consider this directory convention scheme and honour overrides properly.
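A minimal sketch of how such an override lookup could work, assuming the prefix layout described above (generic, then arch, then board, then variant). The function name resolve_layered_file is hypothetical and is not part of brickstrap.sh; an empty value simply means that layer is not in use.

```shell
# Hypothetical helper, not part of brickstrap.sh.
# Prints the most specific copy of a file in the layered board directory.
# Usage: resolve_layered_file BOARDDIR ARCH BOARD VARIANT FILENAME
resolve_layered_file() {
    local base="$1" arch="$2" board="$3" variant="$4" name="$5"
    local prefix="$base" found="" part
    # Bottom of the stack: the generic configuration.
    if [ -e "$prefix/$name" ]; then found="$prefix/$name"; fi
    # Walk up the stack; a hit in a higher layer replaces a lower one.
    for part in "arch/$arch" "board/$board" "variant/$variant"; do
        case $part in
            */) continue ;;   # empty value: this layer is not in use
        esac
        prefix="$prefix/$part"
        if [ -e "$prefix/$name" ]; then found="$prefix/$name"; fi
    done
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
        return 0
    fi
    return 1
}
```

With arch=armel and board=ev3, a file present in both $BOARDDIR and $BOARDDIR/arch/armel/board/ev3 resolves to the latter, while a file present only in $BOARDDIR is still found there.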

There are three partial exceptions to the general override mechanism:

The raw rootfs files (as used by copy-root stage) and
Package files.
Hooks.

For those three an additive scheme should be used: that is each layer of the stack adds to the next layer of the stack, but files with the same basename override others lower down the stack.

Then if you also pull in PR #16 you can also override package selection in your config file by setting blacklists, permitting you to exclude specific packages on an arch, board and/or variant specific case-by-case basis.

Finally, how does brickstrap.sh know what arch, board and variant to pick? Easy: we introduce additional command line switches to pass these in as parameters. (E.g. -A for architecture, -V for variant, and -B for board.)
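As an illustration only, the proposed switches could be parsed with getopts roughly like this. The function name and the BR_* variable names are assumptions made for this sketch; -A, -B and -V are the switches proposed in this comment, not options of any released brickstrap.

```shell
# Hypothetical option parsing for the proposed -A/-B/-V switches.
# Results are stored in BR_ARCH, BR_BOARD and BR_VARIANT (names assumed here).
parse_layer_options() {
    local opt
    OPTIND=1
    BR_ARCH=""
    BR_BOARD=""
    BR_VARIANT=""
    while getopts "A:B:V:" opt; do
        case $opt in
            A) BR_ARCH=$OPTARG ;;
            B) BR_BOARD=$OPTARG ;;
            V) BR_VARIANT=$OPTARG ;;
            *) return 1 ;;   # unknown switch or missing argument
        esac
    done
    return 0
}
```

Any switch that is not passed leaves its variable empty, which matches the rule that an empty value means the corresponding layer is not used.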

Regarding non-32-bit-arm architectures, I think we just use the existing ARCH variable from the config files to pick the correct qemu-user-static (or none at all). But you should open a separate issue for that.

Yes. I would go with the following refinement of this basic idea, though:

Permit the user to set the correct QEMU directly in their config file.
If the config file doesn't specify ARCH set ARCH to the value learned from -A if it is available.
If ARCH is still empty, assume a native build. Possibly recognise the special value "native" to assume a native build also.
If the user hasn't specified the right QEMU manually, use the value of ARCH to pick a suitable default.
If it's a native build set ARCH to the host architecture.
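The steps above could be sketched as follows. This is only an illustration: the function names, the NATIVE_BUILD variable and the small ARCH-to-QEMU table are assumptions, and the sketch expects ARCH and QEMU to hold whatever the config file set (possibly nothing).

```shell
# Hypothetical sketch of the ARCH/QEMU resolution order described above.
host_arch() {
    # On a Debian host this prints e.g. "amd64".
    dpkg --print-architecture
}

# Usage: resolve_arch_and_qemu CLI_ARCH   (CLI_ARCH is the value given via -A)
resolve_arch_and_qemu() {
    local cli_arch="$1"
    # The config file wins; otherwise fall back to the -A value.
    if [ -z "$ARCH" ]; then ARCH=$cli_arch; fi
    # Still empty, or the special value "native": build for the host itself.
    if [ -z "$ARCH" ] || [ "$ARCH" = "native" ]; then
        ARCH=$(host_arch)
        NATIVE_BUILD=1
        return 0
    fi
    NATIVE_BUILD=0
    # Only pick a default emulator if the user did not set QEMU themselves.
    if [ -z "$QEMU" ]; then
        case $ARCH in
            armel|armhf) QEMU=qemu-arm-static ;;
            arm64) QEMU=qemu-aarch64-static ;;
            *) return 1 ;;   # no known default for this ARCH
        esac
    fi
    return 0
}
```

The sketch does not try to detect that an explicitly named ARCH happens to equal the host architecture; that case would need an extra comparison.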

This way the user can also use QEMU which they built themselves, or if no official Debian is available for the arch yet. Useful when chasing QEMU bugs like missing syscalls, or if you are trying to port Debian to something new/obscure ...

dlech commented 8 years ago

Just some random thoughts...

Conceptually there are always 4 categories or types of configuration:

General, generic rootfs configuration (generic package selection, hooks, rootfs skeleton)

So I would combine the three existing *-ev3dev-jessie directories into a single ev3dev-jessie directory. There could also be a generic jessie directory for those that don't want ev3dev stuff and in the future, there might be an ev3dev-stretch directory.

Arch specific rootfs configuration (like kernel version, bootloader).

At this point in time, I consider all of this stuff board-specific. Also, currently, all boards have an external bootloader, so I haven't had to deal with this yet. In the case of the BeagleBone Black, it will need a boot loader, but it gets installed as part of the create-image step because it gets dded to a certain address rather than just copying a file.

Board specific configuration (like custom DTBs, firmware)

Currently, all of this stuff comes from debian packages.

Variant builds. I.e. "minimal", to "full fat" builds of what is otherwise the same in layers 1-3. So essentially just extending/blacklisting packages.

The use cases I have so far for this would be something like ev3dev+mono or ev3dev+ROS where additional debian package repositories would be added and extra packages installed and possibly extra hooks.

dlech commented 8 years ago

The generic re-usable configuration goes in $BOARDDIR i.e. it lives in the board directory directly. This is to preserve backwards compatibility with existing setups as much as possible.

I'm not worried about preserving backwards compatibility. As far as I know, I am the only serious consumer of brickstrap. I would rather call this layer "distro". As mentioned above, I envision a generic jessie and an ev3dev-jessie at this level.

The arch specific bits go in $BOARDDIR/arch/$arch/, where $arch is the right arch. Essentially the arch is stacked on top of the generic config. Note that $arch may or may not correspond to a valid architecture for Debian archives, if it doesn't the user simply has to set the right ARCH variable in their config file (e.g. $BOARDDIR/arch/something-weird-I-did/config). Also note that an empty $arch is valid, basically in that case there is no arch specific layer to the user's configuration.

If this doesn't correspond to a valid debian arch, then I don't want to call it "arch". Also, I'm not seeing how this is different from the board layer.

The board specific bits are stacked on top of this using $BOARD_PREFIX/board/$board. In this case $BOARD_PREFIX is either $BOARDDIR/arch/$arch or if $arch is empty then it is just $BOARDDIR. Again, $board is an arbitrary value like $arch, and again it may be empty (in which case there is no board specific layer).

Makes sense.

Similarly the variant specific bits are stacked on top of that using $VARIANT_PREFIX/variant/$variant where $VARIANT_PREFIX takes the board also into account. Again $variant is arbitrary and may be empty.

OK, so it would include both $distro/variant/$variant and $distro/board/$board/variant/$variant. That works.

For those three an additive scheme should be used: that is each layer of the stack adds to the next layer of the stack, but files with the same basename override others lower down the stack.

I was thinking that I would use board-specific hooks to modify/replace/delete files farther down the stack if needed. There are a number of existing cases where I would want to change just one line of a file. If we were doing the basename thing, I would have to maintain multiple nearly identical copies of the files.

dlech commented 8 years ago

QEMU stuff should really be a separate issue.

cmacq2 commented 8 years ago

Just some random thoughts...

So I would combine the three existing *-ev3dev-jessie directories into a single ev3dev-jessie directory. There could also be a generic jessie directory for those that don't want ev3dev stuff and in the future, there might be an ev3dev-stretch directory.

For example.

You could go a step further: since the difference between stretch and jessie is just the archive name (in Debian terminology) you could set up a custom config for both to set up a custom SUITE variable and reference the SUITE variable from multistrap.conf.

At this point in time, I consider all of this stuff board-specific. Also, currently, all boards have an external bootloader, so I haven't had to deal with this yet. In the case of the BeagleBone Black, it will need a boot loader, but it gets installed as part of the create-image step because it gets dded to a certain address rather than just copying a file.

Sure. However, the point of the exercise is that you get flexibility at fairly minimal cost in complexity and to permit multiple approaches to parametrising the configuration. For some people the target board is all the information that matters, really. For others they might just want to select between a couple of variants to deal with a few optional extras. Or perhaps, it's all about the shared components because they are building a cross-platform product and the board is merely an implementation detail to worry about.

Judging by what you wrote, it seems to me as though you tend to approach the job of building a configuration as "I've got this board, how do I set it up with a nice OS so I can get on with my project quickly?". But you could also look at it the other way round: "I've got this OS (product/project), how do I generate a build for a variety of devices easily so other people can use it right away?"

Currently, all of this stuff comes from debian packages.

Yes, but which packages? Just to point out the obvious: sometimes you want to select an architecture specific kernel, sometimes you want a board specific one (like the ev3dev kernel for the Raspberry Pi which is not necessarily such a good choice for an aarch64 dev board).

So the answer is: set up your package files to split off the arch/board specific files into the right layer in the stack and you can keep the rest fully re-usable.

The use cases I have so far for this would be something like ev3dev+mono or ev3dev+ROS where additional debian package repositories would be added and extra packages installed and possibly extra hooks.

Pretty much. Or to build things with extra media codecs and so on.

dlech commented 8 years ago

since the difference between stretch and jessie is just the archive name

What I found from working with the wheezy to jessie transition is that this is not the case. There were many nuances that required different handling. The big one of course was systemd as init. Also, ssh introduced a new key type that had to be handled in the host keys. And so on...

Right now, stretch may be still similar enough to jessie that you can get away with a SUITE variable, but I expect that over time this will no longer be the case.

dlech commented 8 years ago

Judging by what you wrote, it seems to me as though you tend to approach the job of building a configuration as "I've got this board, how do I set it up with a nice OS so I can get on with my project quickly?".

I assume this is the use case for most non-ev3dev people wanting to use brickstrap.

But you could also look at it the other way round: "I've got this OS (product/project), how do I generate a build for a variety of devices easily so other people can use it right away?"

This is the (only) use case that I am interested in for ev3dev.

cmacq2 commented 8 years ago

This is the (only) use case that I am interested in for ev3dev.

Well that's fair enough. My point, though, is that I think the same mechanism works well for either approach (and probably more besides) at virtually no additional cost in complexity within brickstrap. The cost is all up-front in supporting such a virtual path/prefix mechanism in the first place; once you've got that, adding more prefixes is almost trivial because they are defined over previous paths by induction.

So it's easy to add more, and with that in mind I think going for the full 4-way scheme right away is the smart thing to do: remember you can always not use a particular variable and leave it empty/undefined if you don't want/need to use the corresponding layer.

In particular, I think the $arch is really rather useful to have.

What I found from working with the wheezy to jessie transition is that this is not the case. There were many nuances that required different handling.

So in this particular case I'm probably wrong. I've been running Debian testing/unstable so I've been on systemd since before Jessie and crucially I got the upgrade piecemeal so never really noticed much of a breakage. And no doubt such 'breakage' will occur in the future at some point as well.

cmacq2 commented 8 years ago

If this doesn't correspond to a valid debian arch, then I don't want to call it "arch". Also, I'm not seeing how this is different from the board layer.

I think I didn't express myself clearly enough here. While it definitely should correspond to a valid Debian arch, brickstrap shouldn't try to second guess or to impose a fixed set of values here when it doesn't, however.

After all, brickstrap.sh can work with arbitrary values so in principle it's not an error if the name is arbitrary, as long as it's picked deliberately by the developer of the configuration. This is kind of useful for setups that pull from private Debian repositories in which packages may be compiled differently so shouldn't be labeled as something they aren't. It would also be useful for people porting Debian to obscure architectures, for whatever reason.

Note that if you accept my ideas on the QEMU issue, there's no problem here in allowing any value: in such cases, the config can either patch up ARCH (like it works now) or the user is restricted to their own custom repositories which understand the value properly.

So because of that I think there's not much point in imposing an arbitrary limit on this particular functionality simply because we're opinionated. That's what documentation and guides are for, to educate people to promote the right and proper thing of sticking with actual Debian arch names. ;)

cmacq2 commented 8 years ago

OK, so it would include both $distro/variant/$variant and $distro/board/$board/variant/$variant. That works.

What I meant was that whether it would look in $distro/variant/$variant or $distro/board/$board/variant/$variant would depend on whether or not a $board layer was in use.

So not both within the same configuration. Do you want to use both at the same time?

My reasoning is that you'd only ever need one 'variant' layer anyway because as developer you know how you want to partition your configuration and as an end-user of a configuration that's what documentation is for. By which I mean, it would be part of the build instructions with an example command or two, wouldn't it?

cmacq2 commented 8 years ago

Essentially, path lookup works rather like scoped variable lookup in an innermost scope.

You've got the global scope: $BOARDDIR/distro or just $BOARDDIR
You've got the arch scope with arch/$arch/, which is considered nested within the global scope.
You've got the board/device scope with board/$board/, which is considered nested within the arch scope.
You've got the variant scope with variant/$variant/, which is considered nested within the board scope.

The prefix mechanism is just a way to make this easy to implement/express in file paths.

cmacq2 commented 8 years ago

I was thinking that I would use board-specific hooks to modify/replace/delete files farther down the stack if needed. There are a number of existing cases where I would want to change just one line of a file. If we were doing the basename thing, I would have to maintain multiple nearly identical copies of the files.

The basename thing is quite necessary for making package files work nicely and to permit overriding hooks completely. In that case you definitely want a "more specific" package name to override a less specific one... but you have a problem: discovering which files are supposed to be there. That's currently solved by globbing, but you probably want to inherit package files across layers. So you need to repeat the globbing, but then you might get package names that were previously overridden already.

... As a result you need to track which package files you've already accepted and work your way down from most-to-least specific through the layer but filtering files you've already accepted in the for loop.
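A rough sketch of that bookkeeping, assuming the caller passes the layer directories ordered from most to least specific. The function name collect_layered_files is hypothetical; the set of already accepted basenames is kept in a plain string delimited by "/" (a character that cannot occur in a basename).

```shell
# Hypothetical helper: additive collection with basename override.
# Usage: collect_layered_files SUBDIR LAYER...   (most specific layer first)
collect_layered_files() {
    local subdir="$1" seen="/" layer file name
    shift
    for layer in "$@"; do
        for file in "$layer/$subdir"/*; do
            [ -e "$file" ] || continue   # layer has no such directory or it is empty
            name=${file##*/}
            case $seen in
                */"$name"/*) continue ;;   # already accepted from a more specific layer
            esac
            seen="$seen$name/"
            printf '%s\n' "$file"
        done
    done
}
```

For example, if both $BOARDDIR/packages and $BOARDDIR/board/ev3/packages contain a file named kernel, calling the function with the board layer first yields the board copy of kernel plus every other file from the generic layer.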

You shouldn't wind up with lots of almost identical files that each require maintenance. Splitting package files properly takes care of most of it (so you properly partition packages between "more globally reusable" and "specific to this scope") and once you only want to blacklist a few packages... that's what PR #16 takes care of (just set up your config right).

... Much the same with files in $BOARDDIR/root: I think you want to inherit the files but still have an 'override' mechanism at work. Note that if you wish to alter the files you can still do so from inside a hook under this scheme (and simply not override the file from its base version in "distro" if you wish to avoid maintaining multiple copies).

... And even with hooks, I think it's more useful to do it this way. Because they are bash, you can always set up a parallel directory tree underneath $BOARDDIR in which you define the actual implementation of the hook as a callable function inside a file. Then from the hook you source e.g. $BOARDDIR/hook-lib/my_hook_fn.sh and call my_hook_fn with appropriate arguments.
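To make the hook-lib idea concrete, here is a small self-contained demo in a scratch directory standing in for $BOARDDIR. The hook name set-hostname, the board name rpi and the body of my_hook_fn are made up for illustration, and the demo sources each hook in a subshell, which is an assumption about how hooks get run rather than a description of brickstrap itself.

```shell
# Scratch directory standing in for $BOARDDIR (demo only).
BOARDDIR=$(mktemp -d)
mkdir -p "$BOARDDIR/hook-lib" "$BOARDDIR/hooks" "$BOARDDIR/board/rpi/hooks"

# Shared implementation: a callable function inside a sourceable file.
cat > "$BOARDDIR/hook-lib/my_hook_fn.sh" <<'EOF'
my_hook_fn() {
    echo "setting hostname to $1"
}
EOF

# Generic hook: sources the library and calls the function.
cat > "$BOARDDIR/hooks/set-hostname" <<'EOF'
. "$BOARDDIR/hook-lib/my_hook_fn.sh"
my_hook_fn ev3dev
EOF

# Board specific override with the same basename, reusing the same function.
cat > "$BOARDDIR/board/rpi/hooks/set-hostname" <<'EOF'
. "$BOARDDIR/hook-lib/my_hook_fn.sh"
my_hook_fn ev3dev-rpi
EOF

# Demo runner: source the hook in a subshell so it cannot leak state.
run_hook() {
    ( . "$1" )
}
```

Both copies of the hook stay one or two lines long, so overriding a hook per board does not mean duplicating its implementation.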

So you retain reusability, can inherit hooks, and even override them.

dlech commented 8 years ago

While it definitely should correspond to a valid Debian arch, brickstrap shouldn't try to second guess or to impose a fixed set of values here when it doesn't, however.

At first, I couldn't see why "arch" should be separate from "board", but now I am beginning to see your point. To use RPi2 as an example, this can use debian/armel (armv4 soft float), raspbian/armhf (armv6 + VFP) or debian/armhf (armv7 + VFP). (I actually make a similar hack to pbuilder-dist so that ARCH = rpi means raspbian/armhf.)

So, I would only have one rpi board definition and a separate arch for RPi1 and RPi2. I think that could work.

dlech commented 8 years ago

What I meant was that whether it would look in $distro/variant/$variant or $distro/board/$board/variant/$variant would depend on whether or not a $board layer was in use.

So not both within the same configuration. Do you want to use both at the same time?

If I'm understanding this right, I could, for example, use a variant to add extra packages, possibly from an arch dependent ppa. I can see it happening that I may want to include one package in a variant that is not arch dependent (and would therefore make sense to have at $distro/variant/$variant) and another package that is arch dependent (at $distro/arch/$arch/variant/$variant).

dlech commented 8 years ago

Essentially, path lookup works rather like scoped variable lookup in an innermost scope.

You've got the global scope: $BOARDDIR/distro or just $BOARDDIR
You've got the arch scope with arch/$arch/, which is considered nested within the global scope.
You've got the board/device scope with board/$board/, which is considered nested within the arch scope.
You've got the variant scope with variant/$variant/, which is considered nested within the board scope.

The problem I see with this is that boards can have multiple archs, as in the example I gave earlier. So if board is nested in arch, I still have to maintain 2 identical board directories for RPi and RPi2 because they have different arch.

Also, I'm envisioning (and have a real use case for) variants that apply to all arches and boards within a distro, so having it nested under board means that I would have to have (nearly) identical variant directories under each board directory.

cmacq2 commented 8 years ago

If I'm understanding this right, I could, for example, use a variant to add extra packages, possibly from an arch dependent ppa.

I envisaged $variant as a fully independent layer. So basically just pull in packages which you know work (substantively) and just blacklist those that don't in your config.

Or vice versa, populate PACKAGES with a whitelist in config if the balance of effort tips the other way.

It may be worthwhile to introduce a packages.blacklist file for this purpose so that even blacklists can be inherited/overridden like any other file without touching config directly.
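A possible shape for this, purely as a sketch: assume packages.blacklist holds one package name per line and that the package list arrives on standard input, also one name per line. Neither the file format nor the function name filter_blacklisted is defined anywhere in brickstrap; both are assumptions.

```shell
# Hypothetical filter: drop blacklisted package names from stdin.
# Usage: filter_blacklisted /path/to/packages.blacklist < package-list
filter_blacklisted() {
    local blacklist="$1" status
    if [ -s "$blacklist" ]; then
        # -x: whole-line match, -F: literal strings, -v: keep non-matching lines
        grep -vxF -f "$blacklist"
        status=$?
        # grep exits with 1 when every line was filtered out; that is not an error here.
        if [ "$status" -gt 1 ]; then return "$status"; fi
    else
        cat   # no blacklist (or an empty one): pass everything through
    fi
    return 0
}
```

Whole-line matching matters: blacklisting bluez must not also remove bluez-tools.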

and another package that is arch dependent (at $distro/arch/$arch/variant/$variant).

I noticed your other reply about the path lookup. Let's take a step back and rethink...

cmacq2 commented 8 years ago

So the alternative is to ditch the prefix thing and do a proper search path mechanism. In that case I think you would want something like this (in the most maximal version with all variables considered):

$BOARDDIR/distro
$BOARDDIR/arch/$arch
$BOARDDIR/board/$board
- $BOARDDIR/board/$board/arch/$arch
$BOARDDIR/variant/$variant
- $BOARDDIR/variant/$variant/arch/$arch
- $BOARDDIR/variant/$variant/board/$board
- $BOARDDIR/variant/$variant/board/$board/arch/$arch

Essentially distro remains the set of reusable components. Reusable components that are opted-in by variants live in $BOARDDIR/variant/$variant.

Then there's arch. Arch specific components that are opted in by variants live in $BOARDDIR/variant/$variant/arch/$arch. Board specific ones go in $BOARDDIR/board/$board/arch/$arch instead.

A similar mechanism is at work for board and opting in based on variants or arch. Finally there is the mechanism for opting in via variants if both board and arch match a specific criterion: $BOARDDIR/variant/$variant/board/$board/arch/$arch

That's the 'edgiest' of edge cases, but there for the sake of completeness.

I'm assuming that grouping by 'variant' would usually be done through some kind of logic which would imply that there's also a sense of a "logical relationship" between the generic, arch-, board- and board+arch specific versions of the same variant. So that's why I chose to keep those grouped underneath the same basic $BOARDDIR/variant/$variant tree instead of splitting them over various disjoint folder hierarchies -- I'm guessing this will help reduce the cognitive burden in navigating around. Otherwise it's maybe slightly more difficult when directories are all named the same ($variant) and you'd need to keep looking at the full path to disambiguate.
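For illustration, the search path could be generated like this, with variant-specific entries grouped under $BOARDDIR/variant/$variant as described in the prose. The function name layer_search_path is hypothetical, and treating the listing order as "least specific first" is an assumption; the comment only lists the directories.

```shell
# Hypothetical helper: print the candidate directories, one per line.
# Usage: layer_search_path BOARDDIR ARCH BOARD VARIANT   (empty values are skipped)
layer_search_path() {
    local base="$1" arch="$2" board="$3" variant="$4" prefix
    printf '%s\n' "$base/distro"
    for prefix in "$base" "$base/variant/$variant"; do
        if [ "$prefix" != "$base" ]; then
            [ -n "$variant" ] || continue   # no variant selected
            printf '%s\n' "$prefix"
        fi
        if [ -n "$arch" ]; then printf '%s\n' "$prefix/arch/$arch"; fi
        if [ -n "$board" ]; then
            printf '%s\n' "$prefix/board/$board"
            if [ -n "$arch" ]; then printf '%s\n' "$prefix/board/$board/arch/$arch"; fi
        fi
    done
}
```

With all three variables set this prints eight directories; with only a board set it prints just $BOARDDIR/distro and $BOARDDIR/board/$board.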

dlech commented 8 years ago

This looks good other than I expected distro to replace $BOARDDIR so that distro would have all other components underneath it. For example, ev3dev has a custom flash-kernel and kernel packages for RPi. However, a generic distro for RPi should probably be setup like the official Raspbian and mount the FAT partition to /boot and use raspi-update to install the kernel and things like that.

cmacq2 commented 8 years ago

Huh? I'm not sure I follow. Or rather I think that extending this sharing mechanism beyond the $BOARDDIR boundary has the real potential to bite an unwary user somewhere painful.

I think the main benefit of $BOARDDIR is that it confines "responsibility" and not just code. Basically if you are an end-user you know who to complain to if anything about a specific $BOARDDIR is broken. This matters because it also means breakage makes "sense" rather than appearing to be "completely random". It's a lot easier to debug your own code than someone else's, etc. etc.

Also, if you are putting together a custom configuration on top of stock brickstrap (i.e. your $BOARDDIR isn't actually shipped as part of brickstrap) then you probably don't want this either because brickstrap makes no "guarantees" about how the distro is setup. It might change considerably, including removing or adding arbitrary packages.

Essentially once this scheme is in place, I think you could say $BOARDDIR is a bit of a misnomer and $PROJECT would be a better descriptive name for it.

... What we could do, instead, is permit the user to place a distro file in $BOARDDIR which permits you to pass an alternative tree to override the default prefixes ($BOARDDIR/distro, $BOARDDIR/board, $BOARDDIR/arch, $BOARDDIR/variant, perhaps even $BOARDDIR itself). The path(s) would be interpreted relative to the distro file itself. Alternatively you could of course set up symlinks (though that's less nice when using git, you'd have to set them up after checkout and set up .gitignore).

Either way, that is an explicit opt-in mechanism, meaning users can opt in or opt for a greater promise of stability/control as an informed trade-off.

dlech commented 8 years ago

Yes, we definitely need to rename $BOARDDIR. I think $PROJECT is a good choice.

I was thinking that distro would be synonymous with project, but the way you are suggesting makes sense too.

So, for the existing "boards" I have in brickstrap plus the planned BeagleBone Black, I would have PROJECT=ev3dev, DIST=jessie ARCH={armel,raspi-armhf,armhf} BOARD={ev3,rpi,bbb}. The file tree would look something like this...

projects
+ ev3dev
| + arch
| | + armel
| | + armhf
| | + raspi-armhf
| + board
| | + bbb
| | + ev3
| | + rpi
| |   + arch
| |     + armhf
| |     + raspi-armhf
| + distro
| | + jessie
| + variant
|   + mono
|   + ros
+ generic
  + arch
  | + arm64
  | + armel
  | + armhf
  + board
  |  + bbb
  |  + rpi
  + distro
  |  + jessie
  |  + stretch
  + variant
    + minimal
    + desktop

cmacq2 commented 8 years ago

Ah, you want a $DISTRO so instead of having 'distro' being the reusable stuff you want to select what the reusable stuff is by passing an additional -D option (for distro) on the brickstrap commandline?

The preference for putting generic reusable components underneath a "distro" tree (which I mistook for a literal, fixed, "distro" name) was striking me as somewhat arbitrary but this makes more sense.

Question: do you think we should also add a distro config file so compatible projects could inherit trees from other projects? Possibly not call it distro but call it prefixes instead (to avoid confusion with the distro/ hierarchy itself) ?

dlech commented 8 years ago

I would rather keep projects totally autonomous. What fixes something for one project could break another project. This is why I included the generic project. It would include this sort of common stuff and serve as a template for new projects.

cmacq2 commented 8 years ago

Okay. Just checking: my last comment is on the money as far as "distro" is concerned? I.e. it's actually a case of distro/$DISTRO, presumably with eventual matching commandline switch to match (e.g. -D)?

... Because that would mean I could start work on a tentative implementation of this scheme :)

dlech commented 8 years ago

Yes, that is how I envision distro.

Looking forward to the pull request(s) (hopefully in bite-sized pieces :smiley:).

dlech commented 8 years ago

OK. I'm done with my release stuff, so you can now make changes without worrying about breaking me.

dlech commented 8 years ago

After having seen the preliminary nested directories implementation in #19, it scares me. It seems too complex. I would like to propose a flatter version of the same idea based on how debhelper for debian packaging works.

First, for context, we currently have the following special files and directories:

hooks/
packages/
root/
config
custom-report.sh
debconfseed.txt
multistrap.conf
preinst.blacklist
tar-exclude

And we have the current -b option for "board definition".

This is split into two options -P for project and -B for board. I'm purposely leaving out the other proposed options for now to keep things simple.

Only one -P argument is allowed, but multiple -B arguments are allowed or we could have a -F option for "board family". I am going to turn my three *-ev3dev-jessie board definitions into one ev3dev-jessie definition.

The idea is that we have additional files and directories at the same level with suffixes added. The suffixes match the command line arguments. Any matching files are mixed in. Suffixes follow the pattern .<option>:<value>, so if we had -B rpi, then the suffix would be .B:rpi.

So, I did a 3-way diff on my existing boards and came up with the following project definition for ev3dev-jessie:

hooks/
hooks.B:rpi1/
hooks.B:rpi2/
hooks.F:ev3/
hooks.F:rpi/
packages/
packages.B:rpi1/
packages.B:rpi2/
packages.F:ev3/
packages.F:rpi/
root/
root.B:rpi1/
root.B:rpi2/
root.F:ev3/
root.F:rpi/
config
config.B:rpi1
config.B:rpi2
config.F:ev3
config.F:rpi
custom-report.sh
debconfseed.txt
multistrap.conf
preinst.blacklist
preinst.blacklist.F:ev3
tar-exclude

Although I only used one suffix per file/directory in my example, suffixes could be combined, e.g. config.F:rpi.B:rpi2.
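One way the suffix matching could be expressed, as a sketch only: a file or directory is mixed in when every .<option>:<value> suffix it carries is among the options given on the command line. The function name matching_variants is hypothetical, and the order in which matches should be applied is left open because the proposal does not specify it.

```shell
# Hypothetical helper: list NAME and every NAME.<suffixes> entry that matches.
# Usage: matching_variants DIR NAME OPTION:VALUE...   e.g. matching_variants . config F:rpi B:rpi2
matching_variants() {
    local dir="$1" name="$2" active path rest part ok
    shift 2
    active=" $* "   # e.g. " F:rpi B:rpi2 "
    for path in "$dir/$name" "$dir/$name".*; do
        [ -e "$path" ] || continue
        rest=${path#"$dir/$name"}   # "" or ".F:rpi.B:rpi2"
        rest=${rest#.}
        ok=1
        # Check each dot-separated suffix against the active options.
        while [ -n "$rest" ]; do
            part=${rest%%.*}
            case $rest in
                *.*) rest=${rest#*.} ;;
                *) rest="" ;;
            esac
            case $active in
                *" $part "*) ;;
                *) ok=0 ;;
            esac
        done
        if [ "$ok" -eq 1 ]; then printf '%s\n' "$path"; fi
    done
}
```

With -F rpi -B rpi2, the entries config, config.B:rpi2, config.F:rpi and config.F:rpi.B:rpi2 would be selected, while config.B:rpi1 and config.F:ev3 would be ignored.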

I know I've left out many of the proposed features, but let's just start with this and see how it works, then add more as we need it.

cmacq2 commented 8 years ago

This is split into two options -P for project and -B for board. I'm purposely leaving out the other proposed options for now to keep things simple.

Only one -P argument is allowed, but multiple -B arguments are allowed or we could have a -F option for "board family".

... I think this is somewhat ominous: we start 'simple' but one paragraph later we suddenly need a 3rd option. Still, on the face of it, at this point it seems less complex than one project with four optional variables.

My question at this point is: what does "multiple -B arguments are allowed" mean? Is it about supporting multiple 'boards' in the same tree, is it about building for multiple boards at the same time (in the same brickstrap invocation), both, or neither?

The idea is that we have additional files and directories at the same level with suffixes added. The suffixes match the command line arguments.

Uhh, the existing proposal works almost exactly like this except that instead of suffixes you get prefixes in directory paths. The existing proposal has benefits:

You can selectively override files depending on your variables, instead of just adding things. That can be quite useful: you can define a single package file name that is responsible for pulling in things like "the right kernel" and then override its contents where needed with updated package names. This is not 'added' complexity, but kind of the consequence of the prefix based scheme. It's easier to branch out as you walk/discover more specific details in the directory tree, than it is to retrospectively see what is more specific.
You do not arbitrarily introduce syntax where none is needed. That's one less thing to get wrong.
The rules are no more complex, but the scheme works recursively which permits chaining of arbitrary variables in a well-defined order. There's no ambiguity what each path means, or whether a path would override others (and if so, which ones). It's correspondingly easy/trivial to build introspection tools that can rely on this fact. (See the brp_print examples.)
Moreover, it's correspondingly easy to check that the values passed for various options are actually valid -- even in edge cases (See the brp_validate items).

It's not entirely clear the alternative scheme works as robustly:

Suffixes follow the pattern .<option>:<value>

Subjectively I'm not a fan of using the colon for this: the colon is in use for combining paths already. It seems like a case of unintended consequences waiting to happen. Try dashes, maybe?

suffixes could be combined, e.g. config.F:rpi.B:rpi2.

Objectively this actually seems inferior to using the prefixes because it is not nearly as easy (or even possible) to properly validate command line arguments now (someone types in rpi.B and you're kind of up a creek without a paddle and inside a leaky vessel). Moreover, can the chaining order be arbitrary, and what happens once you realise you do need more variables?
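
To make the validation concern concrete: with flat suffixes, a value would have to be rejected whenever it contains one of the separator characters, otherwise something like rpi.B could be read as part of a longer suffix chain. A minimal sketch of such a check, assuming the `.` and `:` separators from the proposal and a hypothetical helper name:

```sh
# Hypothetical check, not part of brickstrap: a value passed to an option
# such as -B or -F must be non-empty and must not contain the suffix
# separators '.' and ':' (or '/', which would escape the directory).
valid_suffix_value() {
    case $1 in
        ''|*[.:/]*) return 1 ;;   # empty, or contains a separator
        *) return 0 ;;
    esac
}
```

This only rules out ambiguous spellings; it does not tell you whether a value such as rpi3 corresponds to anything in the project directory.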

dlech commented 8 years ago

My question at this point is: what does "multiple -B arguments are allowed" mean?

If there was not a -F option for "board family", then I might have multiple "boards" that I want to mix together, like -B rpi-common -B rpi2.

Uhh, the existing proposal works almost exactly like this except that instead of suffixes you get prefixes in directory paths.

Right. But the advantage I see in not using nested directories is that you can see all of the options at once in a single directory. When I am working on a board definition, I am using a text editor, not a terminal, so I don't want to have to expand 4 or 6 directories or switch to a terminal and run brp_print to find things.

Subjectively I'm not a fan of using the colon for this

I'm not either. Dash might be ok. Maybe @.

you can define a single package file name that is responsible for pulling in things like "the right kernel" and then override its contents where needed with updated package names.

If I did this, I would just omit the kernel package file from packages/ (because it would be empty anyway) and include it in packages.B:ev3/, packages.B:rpi1/, packages.B:rpi2/.

Moreover, can the chaining order be arbitrary, and what happens once you realise you do need more variables?

I'm OK with the arbitrary order because I don't need to shadow any files. And when we need another variable (like -V for variant), then we just add it.

I really appreciate all the thought and effort that you are putting into this. But as a maintainer and user of brickstrap, you are making lots of work for me by making such extensive changes all at once (even if they are useful changes). I'm not really sure how to proceed from here. I don't want to run you off because I am grateful for your help, but at the same time, I am a bit overwhelmed trying to keep up with you.

If you want to do the work of finishing the implementation of what you have started and convert all of the ev3dev-jessie stuff for me and show me how awesome it is, then great. But if you want more help from me, then I am going to have to insist that we move a bit slower and just focus on one thing at a time. For example, we can just focus on the QEMU PR or just start with a PR that renames $BOARDDIR to $PROJECTDIR and the -b option to -P.

cmacq2 commented 8 years ago

I am a bit overwhelmed trying to keep up with you.

I understand. I can see how it must all add up to "a bit much". :)

After thinking about your posts more, I think what you might really want is just this:

A project
Components

The user opts in to a list of components using a command line switch (say, -c), and each component is simply 'added'. The switch might be repeated to include multiple components.

Then the developer of the project is entirely responsible for making sure there are no conflicts between the base project and the optional components, and also that adding the components in any order yields the same result.

The directory layout for this would look like:

$PROJECT/core (core being a reserved name, this component is always included implicitly)
$PROJECT/$COMPONENT

Example:

ev3dev/core
ev3dev/rpi
ev3dev/rpi2
ev3dev/ev3

The hierarchy of each subdirectory is, in principle, identical. Files like config and tar-exclude or tar-include are cat'ed together; things like root/ are simply copied to the target in order of appearance.
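
A rough sketch of that merge step in POSIX sh, to show how little logic it needs. Everything here is an assumption for illustration: `merge_components` is a hypothetical name, only config and root/ are handled, and the output goes to a staging directory rather than the real target.

```sh
# Hypothetical sketch, not brickstrap's actual implementation.
# Usage: merge_components <project-dir> <output-dir> <component>...
# The reserved component "core" is always included first.
merge_components() (
    project=$1 out=$2
    shift 2
    mkdir -p "$out/root"
    : > "$out/config"                     # start from an empty config
    for component in core "$@"; do
        src=$project/$component
        if [ ! -d "$src" ]; then
            echo "unknown component: $component" >&2
            exit 1                        # leaves the subshell, status 1
        fi
        # Text files such as config are concatenated in order.
        if [ -f "$src/config" ]; then
            cat "$src/config" >> "$out/config"
        fi
        # root/ trees are copied in order; a later component overwrites
        # files that an earlier one already provided.
        if [ -d "$src/root" ]; then
            cp -R "$src/root/." "$out/root/"
        fi
    done
)
```

Validating a component name then comes down to checking that its directory exists, which is the only error case the sketch handles.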

That might be even simpler, in that the number of variables is now fixed inside brickstrap itself (just the one), but the developer can add arbitrarily many components as they see fit and there's no need to 'understand' anything.

The conceptual 'leap' here is that instead of considering 'boards' or 'architectures' or 'variants', it's all just components and there are only components. The invocation of brickstrap commandline might feature a fair few -c options for a reasonably complex config, but in theory all concerns can be abstracted over using the component interface.

... Thoughts?

As for what to prioritise: ultimately this discussion is about a nice-to-have. Brickstrap is perfectly serviceable without any such scheme at all (just a bit more work to maintain lots of projects/configs). However, the QEMU stuff touches on a real limitation of brickstrap so if we can't work on both frontiers at the same time then let's focus on removing the more constraining limitation first, i.e. prioritise supporting arbitrary QEMU or no QEMU at all.

dlech commented 8 years ago

Thoughts?

Simple, yet elegant (and not overwhelming). :wink:

I will expand the example to be...

ev3dev-jessie/common
ev3dev-jessie/ev3
ev3dev-jessie/rpi1
ev3dev-jessie/rpi2
ev3dev-jessie/rpi-common

Then for the 3 image files, I would just run:

brickstrap -p ev3dev-jessie -c common -c ev3
brickstrap -p ev3dev-jessie -c common -c rpi-common -c rpi1
brickstrap -p ev3dev-jessie -c common -c rpi-common -c rpi2

I like it.

QEMU

OK, I'll have a look at that PR tomorrow.

cmacq2 commented 8 years ago

Okay. I'll close PR #19 for now.

cmacq2 commented 8 years ago

Time to close this issue: the component based path mechanism has landed and been fully integrated now.