Multiple Git Repositories

zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.

https://docs.zephyrproject.org

Apache License 2.0

Multiple Git Repositories #6770

Closed carlescufi closed 5 years ago

carlescufi commented 6 years ago

Important: The west multi-repo model is discussed and tracked in this document

This issue covers splitting the current zephyr Git repository into multiple ones, and having a tool to manage multiple repositories and contribute to them.

Note: We have edited this issue's description in order to reflect the outcome of all the discussions and work that has taken place since the issue was first raised.

Motivation

Zephyr should avoid mixing external code with original code for the following reasons:

Repo size (-)
License, IP restrictions (---)
Unused code that might contaminate actual use code (--)
Marketing reasons, PR, perceptions (-)
Customer A might not be interested in anything that is Vendor A related (-)
Simplify development outside of the Zephyr tree, including applications and libraries (--)
Provide integration with external projects such as MCUboot (-)

We have focused on being able to retrieve a subset of repositories without having to modify the upstream tree, on the ability to maintain downstream forks that replace a subset of repositories and on full support for Linux, macOS and Windows. There is a reason we are doing this, and it is exclusively related to trying to adapt to how the embedded world deals with software today. It is of critical importance that, in a project that intends to provide a one-stop-shop solution for embedded development, we make it easy for distributors, silicon vendors and product developers to easily replace bits and pieces with proprietary software or forks of open source projects where required.

Requirements

Ability to retrieve all required repositories from the command-line
Ability to place repositories outside of the main zephyr tree
Ability to maintain a manifest with pinned revisions or tracking branches
Ability to retrieve only a subset of repositories
Ability to remove or replace a subset of repositories without modifying the upstream zephyr tree
Ability to work on Linux, macOS and Windows
Ability to bisect the main upstream zephyr tree carrying along exact revisions of the projects during the bisection. This implies tracking projects using exact SHAs upstream.
Ability to manage (create, delete, push, pull, rebase, etc.) project branches/revisions in diverse situations: locally, in collaboration with other developers, for upstream review
Ability to query global status information about local projects (branches, commits, diffs)
Ability to manage the current projects in the manifest (list project names and their paths, run commands in multiple projects)
Ability to manage the manifest as a first class entity (pin revisions, semantic diff between versions, distribute versions to others without affecting any other component in the system)
Ability for contributors to locally reproduce any builds performed by upstream CI without having to modify the vanilla manifest file.
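
The bisection requirement above implies that upstream revisions are exact SHAs rather than branch names. A minimal sketch of how a tool might check that, assuming a plain mapping of project name to revision string (the project names, the sample SHA and the helper names are hypothetical and not part of west):

```python
import re
from typing import Dict, List

# A full SHA-1 object name is 40 lowercase hex characters.
_SHA_RE = re.compile(r"[0-9a-f]{40}")


def is_pinned(revision: str) -> bool:
    """Return True if the revision looks like a full 40-character SHA-1."""
    return _SHA_RE.fullmatch(revision) is not None


def unpinned_projects(revisions: Dict[str, str]) -> List[str]:
    """List projects whose revision is a branch or tag name, not a SHA."""
    return sorted(name for name, rev in revisions.items() if not is_pinned(rev))


# Hypothetical data: one pinned project, one tracking a branch.
example = {
    "hal_vendor_a": "0123456789abcdef0123456789abcdef01234567",
    "net-tools": "master",
}
print(unpinned_projects(example))  # ['net-tools']
```

A manifest for which this list is empty can be carried along during a bisection; one that tracks branches cannot, because the branch tip moves independently of the zephyr history.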

Conclusion

We will use west, a Zephyr meta-tool to manage multiple repositories, using a manifest to define the set of repositories, their revisions and other metadata.
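
To illustrate what "a manifest to define the set of repositories, their revisions and other metadata" could look like in memory, here is a rough sketch. The field names, the `select` helper, the checkout paths and the `hal_vendor_a` entry are assumptions made for illustration; this is not west's actual manifest schema.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Sequence


@dataclass(frozen=True)
class Project:
    name: str      # short project name
    url: str       # where to fetch the repository from
    revision: str  # exact SHA (pinned) or branch name (tracking)
    path: str      # checkout location, possibly outside the zephyr tree


# Hypothetical manifest contents.
MANIFEST = [
    Project("zephyr", "https://github.com/zephyrproject-rtos/zephyr",
            "master", "zephyr"),
    Project("net-tools", "https://github.com/zephyrproject-rtos/net-tools",
            "master", "tools/net-tools"),
    Project("hal_vendor_a", "https://example.com/hal_vendor_a",
            "master", "modules/hal/vendor_a"),
]


def select(manifest: Sequence[Project],
           wanted: Optional[Iterable[str]] = None) -> List[Project]:
    """Return every project, or only the named subset, in manifest order."""
    if wanted is None:
        return list(manifest)
    wanted_set = set(wanted)
    missing = wanted_set - {p.name for p in manifest}
    if missing:
        raise KeyError(f"unknown projects: {sorted(missing)}")
    return [p for p in manifest if p.name in wanted_set]


print([p.name for p in select(MANIFEST, ["zephyr", "net-tools"])])
```

Keeping the path in the manifest rather than in the main tree is what lets a project live outside the zephyr directory, and selecting by name covers the "subset of repositories" requirement.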

FAQ

Why a single tool?

It has been argued that different functionality (repository management, flashing, debugging, etc.) belongs in different tools instead of trying to come up with a Swiss-army knife that does it all. While the argument has weight and value, after careful consideration we have decided to provide a single entry point to the west functionality in order to simplify the user experience. That does not mean that all of the code needs to be in a single place, and in fact west uses an extension mechanism that allows us to place the implementation of different west commands in separate repositories, including the build, flash and debug commands in the zephyr repository where they live now (see scripts/west-commands.yml). There are many examples of tools similar in scope to west:

Go's go command-line tool
Mynewt's newt command-line tool
Mbed's mbed-cli command-line tool
CMake's ability to combine fetching external projects with building

Why is it called `west`?

See here.

Why Python?

Because it's cross-platform, many of our users already know it and most important of all, it is already a dependency for Zephyr.

Why not use Google's repo?

It is Python 2 only
It requires an administrative command prompt on Windows due to its use of symlinks
Its code review system is hardcoded to gerrit
It is poorly documented and maintained ad-hoc
It is not suited to the zephyr upstream multi-repo model of a central repository (zephyr) with all of the core code and the manifest itself

Why not use Git submodules?

There would be two possible ways of using submodules with Zephyr:

Add submodules to the main zephyr repository. This would not meet some of the requirements, in particular the ability to retrieve only a subset of repositories, since the paths and existence of those would be hardcoded.
Create a new "meta" repository which only contains submodules to other repos. This would be equivalent to a "manifest" repository. This option would satisfy most of the requirements, but it would require an additional commit on the "meta" repository every time anything is committed to any of the repos.

Neither would fully cover all of the requirements described in the Requirements section.

Additionally, the conclusion to use a meta-tool for multiple uses makes submodules less of a good fit. Finally, using a meta-tool should not preclude users from still using submodules if they prefer to do so.

Unresolved issues

Unresolved issues before we can split the main repository into multiple ones:

PRs across multiple repos: How to match and retrieve a PR that spans multiple repositories. Possible solutions:
- Use branch names: Same branch name across all repositories, including manifest repo
- Use an equivalent of a Changeset ID
Upmerging forks (taking upstream into a fork): Forks will use a different manifest, with a different set of repositories. Some will be common, some not. Today one can do: git fetch upstream, git merge upstream/master. Possible solutions:
- west upmerge <repo list> ?
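
The "same branch name across all repositories" option for PRs that span repositories can be sketched as a simple lookup. This is only an illustration under the assumption that the tool already knows which branches each repository has; the function name and the sample data are made up.

```python
from typing import Dict, List, Set


def repos_in_change(branches_by_repo: Dict[str, Set[str]],
                    branch: str) -> List[str]:
    """Return the repositories that carry a branch with the given name."""
    return sorted(repo for repo, branches in branches_by_repo.items()
                  if branch in branches)


# Hypothetical state: a change touching zephyr and net-tools only.
branches = {
    "zephyr": {"master", "fix-net-timeout"},
    "net-tools": {"master", "fix-net-timeout"},
    "hal_vendor_a": {"master"},
}
print(repos_in_change(branches, "fix-net-timeout"))  # ['net-tools', 'zephyr']
```

Repositories that lack the branch would stay at whatever revision the manifest specifies. A Changeset-ID scheme would replace the branch-name match with a lookup of an identifier recorded in each commit.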

carlescufi commented 6 years ago

Options to achieve this goal:

Fork Google's repo tool and make it work on Windows properly
Take parts or fragments from either repo or gclient inside depot_tools and write our own tool that, in time, can also fulfill the requirements in #6205

Additional tools that achieve similar objectives:

jiri used in Google's Fuchsia

jukkar commented 6 years ago

Perhaps this is discussed already in other forums but why do we need to have multiple git repositories in the first place?

mbolivar commented 6 years ago

Perhaps this is discussed already in other forums but why do we need to have multiple git repositories in the first place?

In my view, Zephyr already has multiple Git repositories. Examples:

https://github.com/zephyrproject-rtos/zephyr
https://github.com/zephyrproject-rtos/Kconfiglib
https://github.com/zephyrproject-rtos/net-tools

At least zephyr and net-tools are already required to use many networking samples in Zephyr in important cases.

Just as Zephyr's networking subsystem already benefits from multiple repositories, why would other areas not also potentially find this useful?

jukkar commented 6 years ago

Just as Zephyr's networking subsystem already benefits from multiple repositories, why would other areas not also potentially find this useful?

I am not questioning this issue. I was just wondering the reasoning because the issue started to talk about requirements but was not describing the "why" part.

nashif commented 6 years ago

@jukkar good point. Updated with the "why" part. We had this documented somewhere else.

locomuco commented 6 years ago

what also could be considered:

e.g. zephyr master currently works with net-tools master, but there is no pinning to a specific version, as there would be with git submodules

carlescufi commented 6 years ago

@locomuco definitely. net-tools can (and probably will) be part of the default manifest

carlescufi commented 6 years ago

Relevant PR: https://github.com/zephyrproject-rtos/zephyr/pull/7338

ulfalizer commented 6 years ago

@carlescufi @SebastianBoe @mbolivar Working on multiple repository support in West at the moment.

Random brain dump below:

In some previous discussion, people (can't remember who) said they'd prefer if west sync checked out the repositories on a local branch instead of with a detached HEAD (git-repo uses a detached HEAD, if I understand it right).

One advantage of having a branch checked out is that git status automatically gives sensible output (I'm not a Git expert, so that makes it even nicer). It might be less confusing when working manually on the repositories too.

IIRC, someone also said that rebasing on sync in git-repo is confusing. I'm not sure what the alternative would be there though. Throwing away local changes seems less useful (even if the changes can be recovered).

Having a local branch leads to some tricky design decisions:

If the repository is on some other branch, what should west sync do? Just rebase it on top of the original branch? Switch back to the original branch? Should the original branch be updated as well?
If the user is in a detached HEAD state, what should happen?
What should happen if the repository is in some other weird state, e.g. in the middle of a git rebase? Just bail out?
Probably other stuff I haven't thought of...

Detached HEAD might be simpler to implement and less "magic". No juggling with local branches. I have a prototype working for that (though there's probably a lot of robustness stuff to add). I could try to do the branch thing though and see if it runs into other trickiness.

Another random thing I thought of: Might want to use "branch" instead of "revision" in default.yml, if it's always supposed to be a branch name.

Bit worried that we're reinventing the wheel here too. git-repo is probably mature and stable at this point, with a lot of devs with more Git internals experience having worked on it.

ulfalizer commented 6 years ago

Hmz... maybe the sanest thing if we go for the branch thing would be to always switch over to it and then rebase (git pull --rebase or some equivalent), leaving rebasing of any other branches up to the user. That's simple enough to understand.

Might be able to switch back to the previous location too, with git checkout - (just discovered that one).
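
The two sync strategies discussed above (detached HEAD versus switching to a local branch, rebasing, then switching back) can be compared as plain command sequences. This is a sketch only: the function name and its parameters are hypothetical, it builds the git invocations without running them, and it ignores the awkward states listed earlier (an ongoing rebase, an already detached HEAD, and so on).

```python
from typing import List


def sync_commands(revision: str, use_branch: bool,
                  remote: str = "origin") -> List[List[str]]:
    """Build the git invocations for syncing one project.

    use_branch=False: fetch the revision and check it out as a detached HEAD.
    use_branch=True:  switch to the local branch, rebase it onto the remote
                      branch, then return to wherever the user was before.
    """
    if not use_branch:
        return [
            ["git", "fetch", remote, revision],
            ["git", "checkout", "--detach", "FETCH_HEAD"],
        ]
    return [
        ["git", "checkout", revision],
        ["git", "pull", "--rebase", remote, revision],
        # "git checkout -" goes back to the previously checked-out location.
        ["git", "checkout", "-"],
    ]


for cmd in sync_commands("master", use_branch=True):
    print(" ".join(cmd))
```

Each inner list could be passed to `subprocess.run(cmd, cwd=project_dir, check=True)`. A real tool would first need to inspect the repository state, for example to avoid the final `git checkout -` when the project was already on the target branch.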

carlescufi commented 5 years ago

@tejlmand @nashif @mbolivar (CC @aescolar)

Unresolved issues before we can split the main repository into multiple ones:

See main issue description

aescolar commented 5 years ago

Two comments. About 3: if the history branch is an artifact of the CI runs, it would not help for branches not run in CI or for forks.

Also consider the case with merging from one development branch to another (not necessarily master), and merging from some time in the "past", not necessarily the master HEAD.

pabigot commented 5 years ago

@carlesc asked for my input on this, but I have not delved into west beyond a peripheral awareness that it handles programming devices now and will do more soon. So what follows may be irrelevant.

TL;DR

My top-level requirement for split-repository support would be: When I set my HEAD in zephyr to some commit via git checkout or git reset --hard SHA1 I expect to be able to immediately see what version any external dependencies were at when that commit was at the head of its branch, and I should be able to update my workspace to those dependencies with at most one command.

My experience

I used repo once or twice and found it opaque. Now I do my cross-repository development (primarily for Yocto) using git submodules. There, subordinate repositories are registered with the root repository, I can see whether the submodules are synchronized or have local changes with git status, and changing what's selected in submodules is recorded with a commit. In short, changes in submodule selection are tracked along with any other changes, which I think means picking option 1c (manifest in zephyr that tracks heads of external repositories).

I don't know how CI works: if it basically merges the PR branch into current master then reproducibility should require no more than a record of the commit used as the basis of the merge (along with the commit on the PR branch). I would not want any history branch to be in the zephyr repository so when I clone I get a bunch of CI-related material that's irrelevant to me.

I've never had to bisect across multiple repos, but I would expect it to just work as long as git submodule update is run at each stage to make sure the submodules are at the commit expected for the parent branch.

For "upmerging forks", if I understand that correctly, I think zephyr will need to maintain mirrors of any external project so that if a patch is required to make it work with mainline/branched zephyr that can be supported while the upstream maintainers determine what to do with it, and to avoid issues with lost access to the upstream master repository.

Suggestions

For design insight (if the west tooling is not already complete), review the behavior of git submodule.

For traceability of (CI and local) builds look at what Yocto has for build history.

carlescufi commented 5 years ago

@pabigot thanks for your input

My top-level requirement for split-repository support would be: When I set my HEAD in zephyr to some commit via git checkout or git reset --hard SHA1 I expect to be able to immediately see what version any external dependencies were at when that commit was at the head of its branch, and I should be able to update my workspace to those dependencies with at most one command.

This is perfectly doable when you use submodules the way they are intended to be used: by having a main (i.e. zephyr) repository with most of the code and that links to other ancillary ones that are maintained externally (eg. mbedtls). But, as I've now written in the main description of this issue, this wouldn't work with the model we are looking to implement. We are looking to make certain repositories optional, so as to avoid having semiconductor vendor A having to ship with a HAL from semiconductor vendor B. To achieve that with submodules we would need a "meta" repo that only has submodules and no code. At that point we would need a commit to that "meta" repo every time a commit to any of the "sub" repos is made (including the main zephyr repo) so as to track history. This is more or less equivalent to the "history" branch we proposed.

pabigot commented 5 years ago

The specific technical solution isn't the primary concern; it's whether the requirements and use cases driving the design are satisfied. If the solution in west doesn't easily support the capability I described I would probably bypass it and use something else to manage my workspaces, unless west provided some other reward that balanced the pain.

It sounds like to meet some vendor expectations you might want to allow them to publish trees that don't have any other vendors' code or board support. Excluding a HAL tree is one thing, but removing all the core driver implementations from other vendors, currently residing in the same directories, is rather different---and if you leave them available in an unbuildable state that's not optimal either.

tautologyclub commented 5 years ago

To me it sounds like you need a ux friendly wrapper for submodules, not a whole new version control tool. The issues Carles states regarding submodules sounds pretty minor imo. I'd be stoked if someone managed to make an actual non crappy replacement for Google repo and submodules but man, I can't help but think yall are underestimating the scope of this endeavor.

marc-hb commented 5 years ago

One advantage of having a branch checked out is that git status automatically gives sensible output (I'm not a Git expert, so that makes it even nicer). It might be less confusing when working manually on the repositories too.

Yes. Most git commands and workflows assume detached HEADs are for the very short-term, yet repo uses them everywhere by default.

One drawback of detached HEADs is "losing" commits when you forget to create a branch (cause many things start as a "quick hack" and grow from there): https://groups.google.com/d/msg/repo-discuss/LWMcn50RVSs/yjg6ZvLeAQAJ

marc-hb commented 5 years ago

after careful consideration we have decided to provide a single entry point to the west functionality in order to simplify the user experience.

That's quite the shortcut... can you elaborate or point at where you did? For instance did you find any "prior art" merging version control and building under the same front-end and the value such tighter integration brought?

That does not mean that all of the code needs to be in a single place, and we are currently looking at an extension mechanism that would allow to place the implementation of different west commands in separate repositories.

Mmmm... so no tighter integration after all, just longer commands because they all have to be prefixed with "west"? How does that "simplify the user experience"? Sorry but without some elaboration and justification I really don't get it.

Surfacing and expanding a bit my question buried deep in the middle of PR #11715: Repo is used by a fair number of projects despite its shortcomings. So you are indirectly saying the world could use a multirepo tool better than repo and submodules or at least different. After a couple years of experience using each I could agree. I could also argue that this is because it's a very tough nut to crack.

=> Why/how should [west-]multirepo be specific to the Zephyr codebase and its community limited to the Zephyr community?

You rightfully pointed out that repo is hardcoded to Gerrit which is a serious limitation but that's nothing compared to hardcoding to a specific project...

carlescufi commented 5 years ago

@tautologyclub

Thanks for the comments.

To me it sounds like you need a ux friendly wrapper for submodules, not a whole new version control tool. The issues Carles states regarding submodules sounds pretty minor imo.

It is unclear to me how you would work around the following issues with Git submodules in particular:

We want to be able to place external projects outside of the main zephyr tree
We want distros or downstreams to be able to replace, remove or add external projects without having to modify the zephyr tree
We want the build system to be able to query the tool for the existence of external projects and their location
We want to be able to "free track" a remote branch of an external project

I truly do not see how this can be done with Git Submodules. Pinging @mbolivar and @tejlmand in case they might be able to add something to the discussion.
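
Two of the points above (downstreams replacing, removing or adding projects without touching the zephyr tree, and the build system querying for a project's existence and location) can be sketched as follows. The data layout, function names, project names and URLs are all hypothetical; this shows the idea rather than how west implements it.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Project(NamedTuple):
    url: str   # where the repository is fetched from
    path: str  # where it is checked out


# Hypothetical upstream project set.
UPSTREAM: Dict[str, Project] = {
    "hal_vendor_a": Project("https://example.com/hal_vendor_a", "modules/hal/vendor_a"),
    "hal_vendor_b": Project("https://example.com/hal_vendor_b", "modules/hal/vendor_b"),
    "mbedtls": Project("https://example.com/mbedtls", "modules/crypto/mbedtls"),
}


def apply_overrides(upstream: Dict[str, Project],
                    replace_or_add: Dict[str, Project],
                    remove: Iterable[str] = ()) -> Dict[str, Project]:
    """Return a downstream project set; the upstream mapping is not modified."""
    result = dict(upstream)
    result.update(replace_or_add)
    for name in remove:
        result.pop(name, None)
    return result


def project_path(projects: Dict[str, Project], name: str) -> Optional[str]:
    """What a build system might ask: is the project present, and where?"""
    project = projects.get(name)
    return project.path if project is not None else None


# A downstream that swaps in its own mbedtls fork and drops vendor B's HAL.
downstream = apply_overrides(
    UPSTREAM,
    replace_or_add={"mbedtls": Project("https://example.com/forks/mbedtls",
                                       "modules/crypto/mbedtls")},
    remove=["hal_vendor_b"],
)
print(project_path(downstream, "hal_vendor_b"))  # None
print(project_path(downstream, "mbedtls"))       # modules/crypto/mbedtls
```

Because the overrides live in the downstream's own data, the upstream definition stays untouched, which is the property that is hard to get when paths are recorded inside the main repository.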

I'd be stoked if someone managed to make an actual non crappy replacement for Google repo and submodules but man, I can't help but think yall are underestimating the scope of this endeavor.

As I've written several times before, this really is not about NIH. We (the companies and individuals contributing to west) are trying to solve a problem that we all face when using, distributing and modifying Zephyr, as well as making Zephyr FuSa-certifiable. We looked at the existing options and nothing we found met the requirements we had, so we had no choice. Our preference, since the beginning, was to reuse existing frameworks (exactly like we did with CMake, Kconfig or DTC). But in this case we had no choice.

carlescufi commented 5 years ago

@marc-hb

Thanks for your input

after careful consideration we have decided to provide a single entry point to the west functionality in order to simplify the user experience.

That's quite the shortcut... can you elaborate or point at where you did? For instance did you find any "prior art" merging version control and building under the same front-end and the value such tighter integration brought?

Very good point. Yes, we do have prior art that we were at least partially inspired from:

Go's go command-line tool
Mynewt's newt command-line tool
Mbed's mbed-cli command-line tool
CMake's ability to combine fetching external projects with building

That does not mean that all of the code needs to be in a single place, and we are currently looking at an extension mechanism that would allow to place the implementation of different west commands in separate repositories.

Mmmm... so no tighter integration after all, just longer commands because they all have to be prefixed with "west"? How does that "simplify the user experience"? Sorry but without some elaboration and justification I really don't get it.

The build, flash and debug commands now live in the main zephyr tree. We've made this change (designed by @mbolivar) so that the implementation of those commands is tightly coupled with the main zephyr repository.

I don't think the commands will be longer:

west init
west update
west build
west flash
west debug

We believe that a one-stop shop tool is the easiest path forward in order to provide a simple user experience for inexperienced users and newcomers.

Surfacing and expanding a bit my question buried deep in the middle of PR #11715: Repo is used by a fair number of projects despite its shortcomings. So you are indirectly saying the world could use a multirepo tool better than repo and submodules or at least different. After a couple years of experience using each I could agree. I could also argue that this is because it's a very tough nut to crack.

I completely agree, that's why we are not trying to crack that nut. Instead we are trying to build something that fits the specific requirements of Zephyr. Zephyr is not a Linux distribution like Android is, Zephyr is not a loosely coupled collection of repos, it is a combination of a security-certified kernel with vendor HALs and external libraries designed to produce a single image including both kernel and userspace that is flashed to microcontrollers.

=> Why/how should [west-]multirepo be specific to the Zephyr codebase and its community limited to the Zephyr community?

You rightfully pointed out that repo is hardcoded to Gerrit which is a serious limitation but that's nothing compared to hardcoding to a specific project...

See my response above. We are not trying to write a project-agnostic tool. We are trying to build a tool that solves a set of problems that we are facing with Zephyr, after having found that none of the tools out there fit our use cases. Of course we might be wrong, and this might prove to be a mistake in the long run. If that is the case we will certainly revert course and acknowledge this, but in the meantime we are giving it a try.

tautologyclub commented 5 years ago

@tautologyclub

Thanks for the comments.

To me it sounds like you need a ux friendly wrapper for submodules, not a whole new version control tool. The issues Carles states regarding submodules sounds pretty minor imo.

It is unclear to me how you would work around the following issues with Git submodules in particular:

We want to be able to place external projects outside of the main zephyr tree

We want distros or downstreams to be able to replace, remove or add external projects without having to modify the zephyr tree

We want the build system to be able to query the tool for the existence of external projects and their location

We want to be able to "free track" a remote branch of an external project

I truly do not see how this can be done with Git Submodules. Pinging @mbolivar and @tejlmand in case they might be able to add something to the discussion.

You could either make a new project where the Zephyr kernel itself is a submodule (would be fine imo), or you could interpret "wrapper around submodules" a bit more liberally to implement some additional features -- or rather, have it present some well-defined interface (such as a specific directory structure) that the build system can infer stuff from reliably. Either way it seems to me that submodules would be pretty well suited to handle the bulk of the logic.

marc-hb commented 5 years ago

Yes, we do have prior [integration] art that we were at least partially inspired from:

Thanks for mentioning these here. Again I recommend "promoting" these references from this github review to the actual text of the documentation to pre-empt future "why all-in-one?" questions like mine. I think this sort of integration is really not the most common.

I remembered seeing this one too: http://www.vestasys.org/ It was never popular but it's well documented which could help here? Dunno.

We believe that a one-stop shop tool is the easiest path forward in order to provide a simple user experience for inexperienced users and newcomers.

Probably not what you mean but this and other sentences sound like just reducing the number of top-level commands... If two different front-ends for 1. version control and 2. build are one too many front-ends for some engineers developing some IoT product, then I hope I can avoid this product! I would just avoid vague terms like "one-stop shop" and focus instead on actual examples/benefits/features that only a tighter integration of versioning+building can achieve. Providing the references above is great, summarising some of their integration benefits here would be better.

I completely agree, that's why we are not trying to crack that nut. Instead we are trying to build something that fits the specific requirements of Zephyr.

Fair enough.

Of course we might be wrong, and this might prove to be a mistake in the long run. If that is the case we will certainly revert course and acknowledge this, but in the meantime we are giving it a try.

As long as it doesn't cause much more work I would recommend keeping the multirepo part of west as generic as possible. I should spend more time diving deeper but for now I'm struggling to imagine what could be so specific to Zephyr that no other project could benefit from it too. Many generic and popular tools were born much more specialized. Leaving open options for other multirepo tools will also help, appreciate you trying not to exclude that.

Unrelated submodules PS: actual requirements, design choices and features aside, the user interface for git submodules is unproductive, much worse than git's itself. I've used submodules for long enough to remember their concepts well, yet I can never remember any single submodule command to do anything, I have to look them up every single time and so had most people I worked with. Granted: west-multirepo could possibly hide / abstract these away.

pfalcon commented 5 years ago

to pre-empt future "why all-in-one?" questions like mine. I think this sort of integration is really not the most common.

In this corner of the world it pretty much is. And I'm personally 80% sure that there wouldn't be such an urge to have our own cute tool if Mynewt didn't have one ;-). At least we won by not having it written in Go or some other "emerging cute technology" ;-).

marc-hb commented 5 years ago

Create a new "meta" repository which only contains submodules to other repos. This would be equivalent to a "manifest" repository. This option would satisfy most of the requirements, but it would require an additional commit on the "meta" repository every time anything is committed to any of the repos.

This is the key git submodules design choice/limitation, the one that doesn't have any good workaround. git submodules was apparently designed for the following use case: active development in ONLY ONE git repo, all other git repos being SLOW moving and very strictly controlled dependencies. That it can do. The rest not really. I've seen a large project trying really hard to use git submodules for managing multiple git repos actively developed at the same time and it was an absolute version control disaster, think complex, "home-made" wrapper scripts and hours spent on late nights and week-ends trying to track down ONE submodule mistake by one confused engineer.

but it would require an additional commit on the "meta" repository every time anything is committed to any of the repos.

Yes and the only vaguely sensible workaround is automate commits in such a repo.

Google's repo is more flexible in that respect but funny enough can end up requiring the same type of automated and unusable git logs, example: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+log/refs/heads/master

Good luck searching for non-automated commits there.

tautologyclub commented 5 years ago

@marc-hb The thing is though that the submodules @carlescufi mentions fit that bill pretty nicely - external vendor HALs, external cbor libs, mbedtls, etc. They're supposed to be more or less static in the context. As soon as you try to find a solution for subrepos that don't fit that description, you're flying very close to the sun and will probably, after much time and effort spent, find that you're running into the same design headaches that the guys behind repo/submodules ran into and couldn't find an elegant solution to.

mbolivar commented 5 years ago

Hi @tautologyclub and thanks very much for your comments!

You could either make a new project where the Zephyr kernel itself is a submodule (would be fine imo), or you could interpret "wrapper around submodules" a bit more liberally to implement some additional features -- or rather, have it present some well-defined interface (such as a specific directory structure) that the build system can infer stuff from reliably. Either way it seems to me that submodules would be pretty well suited to handle the bulk of the logic.

The devil is very much in the details here, I'm afraid.

I encourage you to try fleshing out the "have it present some well-defined interface (such as a specific directory structure) that the build system can infer stuff from reliably" idea in detail for Zephyr.

We certainly tried approaches like that (see https://github.com/zephyrproject-rtos/zephyr/pull/7338 for one example from one of Zephyr's core maintainers), and they all fell down in one use case or another.

Some other comments follow.

but it would require an additional commit on the "meta" repository every time anything is committed to any of the repos.

I claim that this can't be done in a sane way when you're integrating repositories from multiple external sources without doing what this comment by @marc-hb proposes...

Yes and the only vaguely sensible workaround is automate commits in such a repo.

... which is exactly what we (my company, foundries.io, which has been helping a bit with west) have been doing with Google repo in a Zephyr project for quite some time (about 2 years).

We tried to push a similar approach upstream in west. It was dead on arrival as it was a dealbreaker for some users. We didn't learn this until quite late in the development cycle; recovering the design was a bit of a last minute roller coaster :).

They're supposed to be more or less static in the context.

("They're" above is referring to "vendor HALs, external cbor libs, mbedtls, etc.".)

I disagree about this "supposed to be". I think it's far more common that users will want to mix and match one or two components but leave the rest mostly the same. I think having a manifest file like repo or west (or a DEPS file like chromium, etc.) makes this easier than alternatives I've seen. And I must say I don't agree with the idea we should just give up if we can't make them static as described here:

As soon as you try to find a solution for subrepos that don't fit that description, you're flying very close to the sun and will probably, after much time and effort spent, find that you're running into the same design headaches that the guys behind repo/submodules ran into and couldn't find an elegant solution to.

I certainly do feel that we've run into many of the same design headaches they have in the past year plus that we have been working on west and trying to gather requirements for and from Zephyr users.

I suppose we'll see if we've missed any critical ones and our wings melt, or not. I am sure that the fun is far from over. I hope you wish us luck!

mbolivar commented 5 years ago

Hi there @marc-hb:

I would just avoid vague terms like "one-stop shop" and focus instead on actual examples/benefits/features that only a tighter integration of versioning+building can achieve. Providing the references above is great, summarising some of their integration benefits here would be better.

Bootloader (and in particular, MCUboot) integration is a killer app for me personally.

I've maintained some out of tree scripts for a while now that make it easier to build and flash Zephyr images that are child-loaded by MCUboot, and I can tell you from experience both using them myself and helping out members of my team that are less deeply invested in the details of the Zephyr build system than I am that it is really nice to have (especially since we are a remote work company across many timezones, simplicity of interface is a big win UX wise). A lot of the differences between boards and flashing mechanisms can be suitably abstracted away if you have a tool that understands not just the build system but also the details of flashing different zephyr boards with different backends.

We (my company) are also working on some additional tooling for doing automated testing of a multi-repo tree across multiple boards and sample applications -- think shippable but real Zephyr hardware. West integration is a nice selling point as we can rely on the above and extend it.

I would get it if that feels like vaporware to you -- and, from what's available upstream, some of it is (though not all of it, as I've been steadily upstreaming the bootloader integration and plan on finishing the job this week once west is finally merged and part of the core workflow). But I invite you to watch the space as there is real code somewhere in the haze :)

mbolivar commented 5 years ago

Hi @pfalcon

And I'm personally 80% sure that there wouldn't be such an urge to have own cute tool if Mynewt didn't have it ;-).

Given your winky emoticon I am not sure whether you meant this as a joke, but I can assure you I am 100% sure that newt has nothing to do with it from where I am sitting. The features provided by the Android build system (the one in AOSP for building entire images, not the IDE ones for building apps) and tools like repo, fastboot, and adb are a much bigger design influence on me.

marc-hb commented 5 years ago

active development in ONLY ONE git repo, all other git repos being SLOW moving and very strictly controlled dependencies

They're supposed to be more or less static in the context.

"supposed" and "more or less" isn't good enough. To prove that git submodules are a good fit you'd have to know how every Zephyr project is organized, including closed-source projects and... future projects. Looking at https://docs.google.com/document/d/1HrrMZ11nULWoAv3mR70VxT6nB_I1qMytpnmxFcVPCpM the intention is to very clearly support more than one git repo actively developed at a time.

As soon as you try to find a solution for subrepos that don't fit that description, you're flying very close to the sun, after much time and effort spent,

The high-level design of west-multirepo seems relatively close to Google's repo (at least much closer to it than to submodules). This does mean a fair amount of work but not rocket science either. So quite far from the sun ;-)

find that you're running into the same design headaches that the guys behind repo/submodules ran into and couldn't find an elegant solution to.

Google's repo may not always be "elegant" but it "does the job" - a massive amount of production work actually. Running into design headaches that have been already solved doesn't seem like a bad idea even if some of those solutions were not "elegant" and optimal. "Evolution not revolution"? Plus the ambition is (unfortunately...) limited to support only Zephyr for now.

mbolivar commented 5 years ago

@marc-hb I'm sorry as I meant to reply to some of your other comments in my earlier response but I forgot and hit send. Rather than edit, I'll just add another comment here:

We believe that a one-stop shop tool is the easiest path forward in order to provide a simple user experience for inexperienced users and newcomers.

I do believe that "one stop shop" is not totally as vague as it seems on the surface :). For example, the ability to do things like, say, this:

$ west --help
usage: west [-h] [-z ZEPHYR_BASE] [-v] [-V] <command> ...

The Zephyr RTOS meta-tool.

optional arguments:
  -h, --help            show this help message and exit
  -z ZEPHYR_BASE, --zephyr-base ZEPHYR_BASE
                        Override the Zephyr base directory. The default is
                        the manifest project with path "zephyr".
  -v, --verbose         Display verbose output. May be given multiple times
                        to increase verbosity.
  -V, --version         print the program version and exit

commands for managing multiple git repositories:
  list:                 print information about projects in the west
                        manifest
  diff:                 "git diff" for one or more projects
  status:               "git status" for one or more projects
  update:               update projects described in west.yml
  selfupdate:           selfupdate the west repository
  forall:               run a command in one or more local projects

commands from project at "zephyr":
  build:                compile a Zephyr application
  flash:                flash and run a binary on a board
  debug:                flash and interactively debug a Zephyr application
  debugserver:          connect to board and launch a debug server
  attach:               interactively debug a board

Run "west <command> -h" for detailed help on each command.

is a win that a variety of tools on the PATH -- no matter how carefully named or documented -- will not be able to match in terms of discoverability and ease of use. I think there is a reason why docker is a single command for dealing with containers -- and I think it's not crazy to have a single command for "dealing with Zephyr", which is also a somewhat isolated computing environment that you manage from a host system.
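To make the "single entry point" idea concrete, here is a minimal Python sketch of one command that dispatches to grouped subcommands. It is not west's actual code: the program name `tool`, the two tables and the helper `command_group` are hypothetical, and only the command names and help strings are taken from the help output quoted above.

```python
import argparse

# Hypothetical command tables; the names and help text are copied from the
# "west --help" output quoted above, but this is not how west is implemented.
MULTIREPO_COMMANDS = {
    "list": "print information about projects in the west manifest",
    "update": "update projects described in west.yml",
}
ZEPHYR_COMMANDS = {
    "build": "compile a Zephyr application",
    "flash": "flash and run a binary on a board",
}


def make_parser():
    """Build one parser that knows about every command in both tables."""
    parser = argparse.ArgumentParser(prog="tool", description="A meta-tool sketch.")
    subparsers = parser.add_subparsers(dest="command", metavar="<command>")
    for table in (MULTIREPO_COMMANDS, ZEPHYR_COMMANDS):
        for name, help_text in table.items():
            subparsers.add_parser(name, help=help_text)
    return parser


def command_group(name):
    """Return which group a command belongs to, or None if it is unknown."""
    if name in MULTIREPO_COMMANDS:
        return "multirepo"
    if name in ZEPHYR_COMMANDS:
        return "zephyr"
    return None
```

With this shape, `make_parser().parse_args(["build"])` yields a namespace whose `command` is `"build"`, and a single `--help` lists every command regardless of which table it came from.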

As long as it doesn't cause much more work I would recommend keeping the multirepo part of west as generic as possible.

Yes. We are trying. I would love this to be generic and usable as a separate tool someday too, believe me, but it's just not practical for now. But we aren't losing sight of this.

mbolivar commented 5 years ago

The high-level design of west-multirepo seems relatively close to Google's repo (at least much closer to it than to submodules)

@marc-hb you are right about this.

We basically started with a reimplementation of the minimal subset of google repo that we figured we could get away with, except:

written in python 3 instead of 2
compatible with windows (repo's internal heavy use of symlinks makes it a no go on that platform, which is a first class citizen for zephyr -- and yes, we know about https://github.com/esrlabs/git-repo)
without some of the crazy repo magic behavior (although opinions on how 'magical' west is are not uniform, in fairness)
no assuming the git remote is handled by gerrit (for things like repo upload)
edit: and YAML instead of XML. I hate XML.

If you watch this (by now very out of date) status update I gave on west to the zephyr TSC, you'll hear me admit that we tried to just use repo, but couldn't, mostly because of these issues, towards the end when I raced to recap the multirepo parts:

https://www.youtube.com/watch?v=P6s0HSZAua8

At the end of the day, we had to change tack and incorporate some submodule-style features because the free-form way repo allows the individual repositories to vary did not meet the requirements of some zephyr users.

mbolivar commented 5 years ago

@carlescufi

That does not mean that all of the code needs to be in a single place, and we are currently looking at an extension mechanism that would allow to place the implementation of different west commands in separate repositories

This is in the issue description and needs an update

pfalcon commented 5 years ago

And I'm personally 80% sure that there wouldn't be such an urge to have own cute tool if Mynewt didn't have it ;-).

Given your winky emoticon I am not sure whether you meant this as a joke, but I can assure you I am 100% sure that newt has nothing to do with it from where I am sitting. The features provided by the Android build system (the one in AOSP for building entire images, not the IDE ones for building apps) and tools like repo, fastboot, and adb are a much bigger design influence on me.

Thanks for the response. Yeah, there's a bit of joke in every joke ;-). So, a case with MyNewt and its "newt" tool must be a coincidence then ;-).

Well, seriously, all-in-one management tools are a well-known pattern in bigger IT ("python setup.py" for all things modules in Python; the management tools of Django and other application frameworks, which cover everything from starting an app to fishing in its database). In the embedded space Zephyr isn't the first either. MyNewt is an obvious affinity suspect, but then there are also mbedOS' yotta build/package management tool, and PlatformIO, which is built on this concept.

carlescufi commented 5 years ago

@carlescufi

That does not mean that all of the code needs to be in a single place, and we are currently looking at an extension mechanism that would allow to place the implementation of different west commands in separate repositories

This is in the issue description and needs an update

Fixed, thanks!

carlescufi commented 5 years ago

Thanks @mbolivar for further commenting on the process that has led us here.

I would like to further clarify what has already been said, especially given the latest feedback by @tautologyclub and @marc-hb: west has been designed around a set of requirements and needs from some of the contributors to the Zephyr project. In particular, and as is described in this issue and in the official west documentation, we have focused on being able to retrieve a subset of repositories without having to modify the upstream tree, on the ability to maintain downstream forks that replace a subset of repositories and on full support for Linux, macOS and Windows. There is a reason we are doing this, and it is exclusively related to trying to adapt to how the embedded world deals with software today. It is of critical importance that, in a project that intends to provide a one-stop-shop solution for embedded development, we make it easy for distributors, silicon vendors and product developers to easily replace bits and pieces with proprietary software or forks of open source projects where required. It is also fundamental that we attract users, and again in the embedded world this means supporting Windows as a first-class citizen and also providing an interface that is as simple (yet powerful) as possible. Finally, and as you may have read about already, Zephyr is also about security and safety. We are in the process of trying to FuSa-certify Zephyr, and that imposes its own set of requirements in how the code is presented and can be split. All of these reasons have led us to the conclusion that we needed a tool. We certainly never wanted to develop a tool, we want to write embedded software and support as many SoCs and technologies as possible. We studied Google repo and Git submodules extensively, and simply came to the conclusion that neither would be able to fulfill our requirements.

Vudentz commented 5 years ago

I'd suggest adding another requirement:

All tests run by CI must have their dependencies, if any, as submodules, so individuals can run those tests locally without having to switch their manifest file.

pabigot commented 5 years ago

we have focused on being able to retrieve a subset of repositories without having to modify the upstream tree, on the ability to maintain downstream forks that replace a subset of repositories and on full support for Linux, macOS and Windows. There is a reason we are doing this, and it is exclusively related to trying to adapt to how the embedded world deals with software today. It is of critical importance that, in a project that intends to provide a one-stop-shop solution for embedded development, we make it easy for distributors, silicon vendors and product developers to easily replace bits and pieces with proprietary software or forks of open source projects where required.

This is the most clear and compelling core requirement and justification related to multi-repo support I've seen. It's also the first time I've seen it stated this way.

I'm not entirely convinced that it couldn't have been satisfied with existing solutions, nor that it would withstand a rigorous validation process, but it does at least provide a basis for assessing potential solutions.

However, it's long past the point of requirements specification: we have what we have, and mighta/coulda/shoulda gets us nowhere. My intent is to give west a shot and, if it proves to have problems, work to resolve them, develop an alternative solution, or move on in some other way.

carlescufi commented 5 years ago

@pabigot

This is the most clear and compelling core requirement and justification related to multi-repo support I've seen. It's also the first time I've seen it stated this way.

Thanks, I will add this to the issue description then.

carlescufi commented 5 years ago

I'd suggest adding another requirement:

All tests run by CI must have their dependencies, if any, as submodules, so individuals can run those tests locally without having to switch their manifest file.

Agreed, will add.

carlescufi commented 5 years ago

@Vudentz @pabigot I added the following requirement (which was omitted from the list by mistake):

Ability to bisect the main upstream zephyr tree carrying along exact revisions of the projects during the bisection. This implies tracking projects using exact SHAs upstream.
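As a rough illustration of what "tracking projects using exact SHAs" implies, the Python sketch below flags manifest entries whose revision is a branch or tag name rather than a full commit hash. The project list, the revisions and the helper `unpinned_projects` are made up for the example, and it assumes 40-digit SHA-1 object names.

```python
import re

# Assumption: revisions are SHA-1 object names, i.e. 40 lowercase hex digits.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_projects(projects):
    """Return the names of projects whose revision is not a full 40-hex SHA."""
    return [p["name"] for p in projects
            if not FULL_SHA.fullmatch(p.get("revision", ""))]


# Hypothetical manifest contents; the revisions are invented for the example.
projects = [
    {"name": "mcuboot", "revision": "0123456789abcdef0123456789abcdef01234567"},
    {"name": "net-tools", "revision": "master"},
]
print(unpinned_projects(projects))
```

A branch name such as `master` points at a different commit over time, so checking out an old zephyr commit during a bisection would not bring back the project revision that was current when that commit was made.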

tautologyclub commented 5 years ago

I disagree about this "supposed to be". I think it's far more common that users will want to mix and match one or two components but leave the rest mostly the same. I think having a manifest file like repo or west (or a DEPS file like chromium, etc.) makes this easier than alternatives I've seen. And I must say I don't agree with the idea we should just give up if we can't make them static as described here:

Well, the key point is "mixing and matching", not "doing active development on". It's active development that becomes annoying when using submodules/repo. Cloning an external HAL and some helper libs that you don't intend to modify can surely not be seen as a nightmare using existing tools...?

I suppose we'll see if we've missed any critical ones and our wings melt, or not. I am sure that the fun is far from over. I hope you wish us luck!

Absolutely, not trying to bring you down - it's certainly odd that no one has managed to find a good solution to this problem given the wide audience and the awkwardness of all existing tools. What I'm trying to argue for is that perhaps your needs could be satisfied WITHOUT reinventing the wheel and instead wrapping existing wheels with some python :P

mbolivar commented 5 years ago

It's active development that becomes annoying when using submodules/repo. Cloning an external HAL and some helper libs that you don't intend to modify can surely not be seen as a nightmare using existing tools...?

Speaking personally as someone whose company chases tip on multiple projects and usually carries out of tree patches in forks of those repositories (that rebase regularly, because those patches make their way upstream often), I do intend to modify, frequently. So while I can see your point for the "doesn't change often" situation, I don't think it covers all the users.

What I'm trying to argue for is that perhaps your needs could be satisfied WITHOUT reinventing the wheel and instead wrapping existing wheels with some python :P

Sure, and I understand that. As you've said, nobody seems to have a really good general solution to this problem, though. We're just scratching our own itch.

To echo @carlescufi, this was definitely not our first choice. If I really believed existing tools could do the job, I would have advocated for them, but I don't believe that's the case. If that turns out to be a mistake, I'm all on board to learn from it and move on. But this is the best way forward that I can see right now.

Thanks again for your feedback.

tautologyclub commented 5 years ago

Well, let me further argue. Here's the arguments against submodules:

[We'd have to add] submodules to the main zephyr repository. This would not meet some of the requirements, in particular the ability to retrieve only a subset of repositories, since the paths and existence of those would be hardcoded.

I can't see how this holds if you allow the plumbing to be submodules but the porcelain to be west. I'll throw up some imaginary commands to reflect upon:

// start from scratch
git clone https://blala/zephyr.git
./west-setup.sh

// create a new branch foobar, with a separate manifest.yaml
west checkout -b foobar

// parse .gitmodules and adds a new entry, also parse/add to manifest.yaml
west add https://github.com/foobar/ ext/lib/foobar 

// perhaps also allow for non-path arguments, when the repo to be added is well-known/supported upstream
west add tinycbor

// or perhaps allow argument to be a manifest file
west add samples/net/wap_mesh/manifest_deps.yaml

// NOW we sync
west sync

To reiterate, I can't see how submodules + thin wrapper doesn't accommodate the requirements.

pfalcon commented 5 years ago

[We'd have to add] submodules to the main zephyr repository.

To start (well, continue), this is a misconception. To use submodules, you don't need to add them to the main Zephyr repo. Anybody can add submodules to their fork/clone. Anybody can update to any revision of a submodule in their branch. Anybody can replace an existing submodule with something else. When they do either of the last two actions, they will, as expected, get a conflict when upstream also updates its submodule definition. To be fair, I dunno how conflict resolution is handled in that case, but I bet that in git 2019, it's done much better than in mytool-v0.NIH-beta. And of course, one doesn't have to check out all submodules. One can choose only those that are needed.

The whole subconscious idea here is that there must be something hard in git. It stems from those times when a developer familiar with cvs, or at most svn, stood at the entrance of git. Eerie sounds and flickering. A year after, nothing's eerie in day to day git, but something eerie must be lurking somewhere in the corner! The myth of frightening git submodules serves that role. The whole myth is based on mixing up "tracking and matching multiple projects is hard" with "git submodules are hard".

That said, all this discussion is rather theoretic now, given that "west" was merged into the mainline.

marc-hb commented 5 years ago

The whole myth is based on mixing up "tracking and matching multiple projects is hard" with "git submodules are hard".

I think you missed my comment about submodules above, the one where I referred to real nights and real week-ends. Plus a few thousand other people sharing their real-world experience on the Internet. All too stupid to use submodules? Most likely! A "myth": certainly not.

For git itself the best summary is this: https://stevebennett.me/2012/02/24/10-things-i-hate-about-git/ It describes perfectly my years of real-world experience as "the local git support desk" for real people across multiple projects, many of them actively (and of course wrongly) not interested in version control. Off-topic sorry.

marc-hb commented 5 years ago

A lot of the differences between boards and flashing mechanisms can be suitably abstracted away if you have a tool that understands not just the build system but also the details of flashing different zephyr boards with different backends.

I've never wondered about the value of this type of tighter integration, sorry for any confusion. I don't know much about it but it seems to make a lot of sense to me. The only "integration" question I still have and that you haven't answered yet is very specific and unfortunately getting lost in other, vaguer discussions. The value that is really not obvious to me (yet?) is just the supposed "integration" of:

the new multirepo tool which hasn't shown any obvious sign of being limited to Zephyr yet;
all the Zephyr-specific rest.

Making version control for Zephyr artificially specific to Zephyr actually makes me wonder about potentially negative value, because it means that implementing things like Continuous Integration, for instance, becomes potentially more Zephyr-specific too, which means more work for a smaller community. For instance:

We (my company) are also working on some additional tooling for doing automated testing of a multi-repo tree across multiple boards and sample applications...

I do believe that "one stop shop" is not totally as vague as it seems on the surface :). For example, the ability to do things like, say, this: [west --help]

Very interesting! The output of west --help is very clearly split between TWO sections: 1. one for multirepo commands 2. the other section for the Zephyr-specific rest. This tends to show that these two sections could be TWO (not "a variety") separate tools for no usability or user-friendliness difference.

I think there is a reason why docker is a single command for dealing with containers

https://docs.docker.com/engine/reference/commandline/ has a fairly large number of commands but they're all (closely) related to managing images and containers and none of them deals with versioning. "docker history" is a log file and checkpointing system images is very far from versioning code edited by humans.

I would just avoid vague terms like "one-stop shop" and focus instead on actual examples/benefits/features that only a tighter integration between versioning + all the rest can achieve. Providing the references above is great, summarising some of their integration benefits here would be better.

In other words and despite all the digital ink above I've seen no clear rationale yet for thinking too small.

pfalcon commented 5 years ago

the one where I referred to real nights and real week-ends

Here on github I see dozens of people having dozens of cases of not being able to rebase or fix a merge conflict correctly. What does it tell us?

the best summary is this: https://stevebennett.me/2012/02/24/10-things-i-hate-about-git/

This? I thought this: https://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/ , where a google employee explains the reasoning of creating ugly tools like "repo", though thru all that a confession rings along the lines of "the best thing we've come up with is one giant monorepo where we commit half of the world, flat". Of course, there're still sleepless nights for support staff - that's unavoidable part of enterprise software development. But at least git submodules can't be blamed.

mbolivar commented 5 years ago

@marc-hb:

the new multirepo tool which hasn't shown any obvious sign of being limited to Zephyr yet; [...] In other words and despite all the digital ink above I've seen no clear rationale yet for thinking too small.

Code freeze for Zephyr v1.14 LTS is this Friday. We barely squeaked by getting west merged into master with a bunch of Zephyr-specific assumptions baked into it. We will be supporting LTS for over a year and west missing this deadline would have been a Big Problem.

I'm all for generalizing the multi-repo pieces of it when we have time, but we just don't right now. And the haters implying we're too dumb to realize we should have been using submodules all along, or are afraid of git or something, may well be right. And there really are a variety of things in west which are zephyr specific.

the default manifest URL for west init is https://github.com/zephyrproject-rtos/zephyr
the default URL for the west that gets cloned into an installation is https://github.com/zephyrproject-rtos/west
the tool looks for a project in the manifest whose path in the installation is "zephyr" and sets the ZEPHYR_BASE environment variable to that repository's absolute path for the duration of the call, so that the CMake invocations performed by west build and other commands will Just Work, keeping pervasive build system assumptions from having to change without requiring users to set ZEPHYR_BASE themselves (which we have learned in user testing is too hard for big classes of people, like Windows users, i.e. the vast majority of embedded developers)
the west package provides a variety of Python modules which individual west extension commands (about which more below) will rely on being there which have all sorts of Zephyr-specific things inside, like details about the build system's usage of the CMake cache.
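The ZEPHYR_BASE behavior described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not west's implementation: the function name `env_with_zephyr_base`, the `install_root` argument and the dict-based project entries are hypothetical.

```python
import os


def env_with_zephyr_base(install_root, projects, environ=None):
    """Return a copy of the environment with ZEPHYR_BASE set, if possible.

    The caller's environment is left untouched, so the variable only exists
    for child processes that are started with the returned mapping.
    """
    env = dict(os.environ if environ is None else environ)
    for project in projects:
        # Assumption: each project entry records its path inside the installation.
        if project["path"] == "zephyr":
            env["ZEPHYR_BASE"] = os.path.abspath(
                os.path.join(install_root, project["path"]))
            break
    return env
```

A command implementation could then pass the returned mapping as `env=` to `subprocess.run` when it invokes CMake, so the user never has to export the variable by hand.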

Could we have cleanly separated these things into their own Zephyr-specific and generic portions, and uploaded the non-generic portions to PyPI in their own clean zephyr_rtos package, which would have imported the generic west portions appropriately? Definitely! And then we would have missed the LTS deadline, which has already been extended by several months compared to our usual release cadence :).

To all the requests for a generic tool, I think the only real response is "yes, we know, we'd like that too, all in good time if it's appropriate".

I hope the above helps clarify why.

The output of west --help is very clearly split between TWO sections: 1. one for multirepo commands 2. the other section for the Zephyr-specific rest. This tends to show that these two sections could be TWO (not "a variety") separate tools for no usability or user-friendliness difference.

There isn't any documentation up for this yet (we're working on it) but the big idea here is that any repository can provide west extension commands and they will show up here. For details, see the comments in our manifest file pykwalify schema describing each project entry:

https://github.com/zephyrproject-rtos/west/blob/master/src/west/manifest-schema.yml#L73

And the related schema for the files which each project which declares extension commands in the manifest must then provide:

https://github.com/zephyrproject-rtos/west/blob/master/src/west/commands/west-commands-schema.yml

So this is in fact arbitrarily many extension commands from arbitrarily many repositories, as determined by the corporate user, SoC vendor, Zephyr downstream distributor, individual hacker, other random upstream project in the manifest (like net-tools today), etc.

They will all show up in a unified way in the west -h output as a result of parsing the manifest file, which can be in any repository -- just pass the repository URL to west init -m <URL> and hey presto you've got a custom Zephyr derivative with your own west commands inside that are all discoverable by any user who is familiar with the upstream tool without having to open your documentation at all. Their implementations can all rely on some common infrastructure in the west package. Now, this is not 1.0 software and we've definitely taken on tech debt trying to get there that will require some refactoring and a bit of interface breakage to clean up eventually, but it is an actual generic mechanism for doing Zephyr development, not an arbitrary confluence of exactly two unrelated things.
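A simplified Python sketch of the idea that commands from arbitrarily many repositories end up in one help listing is given below. The data layout, the function names and the rule that an extension may not shadow a built-in name are all assumptions made for the illustration; the real formats are defined by the schema files linked above.

```python
def collect_commands(builtin, projects):
    """Map each command name to (origin, help text).

    Built-in commands are registered first. For this sketch, a project
    command whose name is already taken is silently skipped.
    """
    table = {name: ("built-in", text) for name, text in builtin.items()}
    for project in projects:
        for cmd in project.get("commands", []):
            if cmd["name"] not in table:
                table[cmd["name"]] = (project["path"], cmd["help"])
    return table


def format_help(table):
    """Render one help section per origin, in first-seen order."""
    sections = {}
    for name, (origin, text) in table.items():
        sections.setdefault(origin, []).append(f"  {name}: {text}")
    lines = []
    for origin, entries in sections.items():
        lines.append(f"commands from {origin}:")
        lines.extend(entries)
    return "\n".join(lines)
```

Adding a repository to the manifest then adds a section to the help text without any change to the tool itself, which is the discoverability argument being made here.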

We've been getting this "it's just two sets of commands" since the beginning, but it's really not, promise.

Hopefully the above starts to make clear why.

marc-hb commented 5 years ago

I'm all for generalizing the multi-repo pieces of it when we have time, but we just don't right now.

Thanks! This doesn't match the initial impression(s) the current documentation gave to me and probably others. There's actually a ton of useful information that you and others just shared in this thread that should really be rolled back into the documentation in some shape or form when you can find the time.

On one hand I feel somewhat guilty we just pressured the west developers to justify and share all this while you were working hard on a still evolving implementation and trying to make a deadline. On the other hand I'm happy all these rationales and information were elaborated and shared here because not just the documentation about "how" but also the "why" is extremely important for features affecting the top-level user interface and workflows that much.

I hope you wish us luck!

You bet!

mbolivar commented 5 years ago

There's actually a ton of useful information that you and others just shared in this thread that should really be rolled back into the documentation in some shape or form when you can find the time.

Absolutely. I am very committed to making that happen for v1.14. Just needed to get the code in first -- zephyr's release management policies make it possible to write documentation after code freeze, so there's a lot of documentation still to be written that unfortunately is just in the heads and meeting minutes of the people that have been working on this.

You bet!

Thanks! I am glad this thread has been useful so far. I will CC you on any documentation PRs. Your feedback has been extremely helpful and I hope you will have time to look at the docs patches too.
