nerves-project / nerves

Craft and deploy bulletproof embedded software in Elixir
http://nerves-project.org
Apache License 2.0

Mix Firmware.burn suddenly stopped working w/o a change in Elixir or OTP #447

Closed s3vus closed 5 years ago

s3vus commented 5 years ago

### Environment

* Elixir 1.8.1 (compiled with Erlang/OTP 21)


* Nerves environment: (`mix nerves.env --info`)

    |nerves_bootstrap| Environment Package List

      No packages found

    |nerves_bootstrap| Loadpaths Start

    Nerves environment
      MIX_TARGET:   host
      MIX_ENV:      dev

      NERVES_SYSTEM is unset
      NERVES_TOOLCHAIN is unset

    |nerves_bootstrap| Environment Variable List
      target:     host
      toolchain:  unset
      system:     unset
      app:        /projects/failure_pile

    |nerves_bootstrap| Loadpaths End

* Additional information about your host, target hardware or environment that
  may help: macOS, using `asdf`

### Current behavior

I mentioned this in the slack channel:
https://elixir-lang.slack.com/archives/C0AB4A879/p1565553797024500

A week ago, I ran `mix nerves.new` and was able to burn firmware. The next week, I could still create a new Nerves project, but `mix firmware` failed, even though I had changed nothing on my machine. It was explained to me that the new project suddenly required newer versions of Elixir/OTP, even though none of its dependencies had a major version change (they were all minor).

Here's the error:

    Nerves environment
      MIX_TARGET:   rpi2
      MIX_ENV:      dev

    ** (Mix) Major version mismatch between host and target Erlang/OTP versions
      Host version: 21
      Target version: 22

    This will likely cause Erlang code compiled for the target to fail in
    unexpected ways. Install an Erlang OTP release that matches the target
    version before continuing.



### Expected behavior
I would expect that if mix/nerves lets me create a new project with my current versions of Elixir and Erlang, it would also let me build and burn the firmware. I was able to work around this problem either by upgrading Elixir and Erlang/OTP or by reusing an old copy of the `mix.lock` file from an older project. But this isn't the best developer experience, and it may hurt productivity when a developer needs to build a project with older versions of Elixir or Erlang/OTP.

I really feel that if mix nerves lets you create a project and build the firmware one week, it should let you do the same the next week, given the same versions of Elixir and Erlang/OTP and no changes on your machine. Also, with the `mix firmware` command suddenly failing, a lot of tutorials and blog articles will break: their instructions are no longer repeatable, which hurts the community and makes things especially complicated for beginners.

IMHO, if dependencies require a new version of Elixir/OTP, then that should be a major version change, not a minor one.

At the very least, if we're not going to allow them to build the firmware, doing `mix nerves.new` should error out so we fail early.  Why let them create a new project with that version of elixir, if they can't do anything with it?

Thanks everyone.

fhunleth commented 5 years ago

Right. This is expected. We don't lock dependencies in `mix nerves.new`, so there's always a chance that the dependency versions a new project resolves to will change.

The Nerves System dependency is what changed for you.

We've struggled significantly with this. Semantic versioning doesn't work for Nerves systems since they include Linux kernels, C libraries, various utilities and more. From a semantic versioning standpoint, nearly every update to a Nerves system we make contains an API-breaking change. It could be something that most Nerves users don't care about, such as an API change in Linux, U-Boot, or OpenCV. We might not even know. For example, the Raspberry Pi Foundation distributes closed source firmware binaries and doesn't version them. I pull them based on git hashes and sometimes I struggle to tell how significant or insignificant their changes are. The list goes on.

What we do for Nerves Systems is this:

  1. If there's a change that breaks integration with Nerves tooling or results in firmware images that can't be upgraded to from previous releases, that's a major change. These break shipping devices. It is expected that significant work is needed to upgrade a fielded device, and it would be unlikely to be able to roll updates back once that is done.
  2. Changes in the version of Buildroot, Erlang, Linux, etc. are minor changes. These updates may involve work to update your application, but the expectation would be that you could deploy firmware to existing devices in the field and roll it back.
  3. Bug fixes bump the patch number.

What you saw was the version of Erlang change. That's minor since it's possible to upgrade and downgrade firmware in the field and the tooling still works.
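To make the effect of a minor bump concrete, here is a minimal sketch using Elixir's standard `Version` module. The module name `RequirementDemo` and the version numbers are illustrative assumptions, not the actual releases involved in this issue. The point is only that a requirement like `~> 1.7` accepts a later minor release (where the Erlang version may have changed), while `~> 1.7.0` accepts patch releases only.

```elixir
# Illustrative only: the version numbers below are made up for the example.
defmodule RequirementDemo do
  # Returns the versions from `candidates` that satisfy `requirement`.
  def allowed(candidates, requirement) do
    Enum.filter(candidates, &Version.match?(&1, requirement))
  end
end

candidates = ["1.7.0", "1.7.2", "1.8.0"]

# "~> 1.7" means >= 1.7.0 and < 2.0.0, so a new minor release is accepted.
IO.inspect(RequirementDemo.allowed(candidates, "~> 1.7"))

# "~> 1.7.0" means >= 1.7.0 and < 1.8.0, so only patch releases are accepted.
IO.inspect(RequirementDemo.allowed(candidates, "~> 1.7.0"))
```

Keeping a project's `mix.lock` has a similar effect for an existing project, which is consistent with the workaround of reusing an older lock file described earlier in the thread.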

As an aside, I think that it's somewhat reasonable to expect a newer version of Erlang to run .beam files compiled by an old version of Erlang. That kind of works, but we have been burned enough times by subtle issues that we unconditionally raise an error. While I can completely appreciate that this is frustrating, our current alternative of allowing it and then trying to figure out what happens afterwards isn't great either.
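The kind of check described here can be sketched in a few lines. This is not the actual Nerves implementation: `OtpCheck` and its functions are hypothetical names, and the target's major version is passed in as a plain integer rather than read from a real Nerves system. It also assumes OTP 17 or later, where `:erlang.system_info(:otp_release)` returns the major release as a charlist such as `'21'`.

```elixir
defmodule OtpCheck do
  # Hypothetical helper for illustration; not part of Nerves.

  # Major OTP release of the running (host) VM as an integer.
  def host_major do
    :erlang.system_info(:otp_release)
    |> List.to_string()
    |> String.to_integer()
  end

  # Same major version on both sides: nothing to report.
  def compare(major, major), do: :ok

  # Any difference in major version is treated as an error,
  # regardless of which side is newer.
  def compare(host, target) do
    {:error,
     "Major version mismatch between host and target Erlang/OTP versions " <>
       "(host: #{host}, target: #{target})"}
  end
end

# Compare the running VM against an assumed target major version of 22.
IO.inspect(OtpCheck.compare(OtpCheck.host_major(), 22))
```

Treating every mismatch as an error, rather than only the case where the host is newer than the target, mirrors the "unconditionally raise" choice described above.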

There are answers here, but the ones I know of (breaking apart Nerves systems so that so many concerns aren't lumped together) are a lot of work. For the time being, having major version bumps represent completely incompatible changes is too valuable to lump smaller changes like Erlang/OTP bumps in with them. I'm not saying that we're closed to alternatives, but there's some inertia with the current scheme.

s3vus commented 5 years ago

So... I totally understand the effort involved and the complexities. I guess I would consider this tech debt, and in a utopian future a change in Erlang or Linux would be a major change, not a minor one.

Would a stopgap be to error out when someone runs `mix nerves.new`, since they won't be able to build anything anyway? Or is it possible that Erlang might work for rpi0 but not rpi2 or something, because the dependencies change based on which device you are building for? In which case, the status quo is annoying but makes sense.

Would it be possible to have a clearer error message for dummies like me?

GregMefford commented 5 years ago

That's correct - each system version can determine the required OTP version, so even within one project, you could potentially have a breaking change just by switching your MIX_TARGET where it will no longer build until you update to a different version of that system.
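As a rough illustration of why the target matters, a project's `mix.exs` typically lists one system dependency per target. The fragment below is a sketch under assumptions: the version requirements are placeholders, not the releases that actually changed the OTP version, and each system package has its own release history.

```elixir
# Fragment of a hypothetical mix.exs; version requirements are placeholders.
defp deps do
  [
    # Only fetched and compiled when MIX_TARGET=rpi0.
    {:nerves_system_rpi0, "~> 1.7.0", runtime: false, targets: :rpi0},
    # Only fetched and compiled when MIX_TARGET=rpi2. Its requirement is
    # independent of the rpi0 one, so the two can resolve to system
    # releases built against different Erlang/OTP versions.
    {:nerves_system_rpi2, "~> 1.7.0", runtime: false, targets: :rpi2}
  ]
end
```

Because each entry resolves on its own, switching `MIX_TARGET` can select a system release whose Erlang/OTP version no longer matches the host.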

I can see your point that an OTP major version causing a major version bump of the system could potentially make sense given that it causes an obvious Mix-level error for the user at compile time when you have the wrong one. Essentially, requiring a different OTP version is breaking API compatibility at the Mix level as opposed to potentially breaking your application itself.

Frank is right though that from a semver perspective, basically all bets are off on just about every release as far as the underlying operating system and packages, and decoupling that somehow is going to be impractical.