[Library] Automate quality assurance testing of libraries

s-celles commented 6 years ago

Hello,

In #6646 I mentioned that the Library Manager should display the license for each library.

I think a more general issue is that Arduino should be able to automate quality assurance of libraries. http://downloads.arduino.cc/libraries/library_index.json could help

The presence of a license file is one point, but assurance that the code still compiles is another important point. A last point is assurance, through unit tests, that the library has some chance of still being usable.

In JuliaLang we have https://pkg.julialang.org/, which lists all libraries (outside of any IDE). Thanks to continuous integration and this quality assurance testing, we know whether a library compiles and whether its tests pass.

https://github.com/ianfixes/arduino_ci provides scripts to ensure that when a commit is made, it doesn't break the library (prevent it from compiling) and that unit tests are still passing.

All this may be insufficient, and we could still have broken eggs, but it's a beginning.

Requiring developers to tag their releases (with semantic versioning) and to set up requirements (with a minimum version number for each dependency) in a configuration file could be another important point (discussed in #5795).

Kind regards

s-celles commented 6 years ago

Maybe some developers with knowledge of C/C++ dependency managers could comment on the last point about requirements.

Code linting could also be checked.

A last but important part of code quality assurance is code coverage, but we are probably still far from that point.

per1234 commented 6 years ago

Here's the only way I can see this working:

When the Library Manager indexer job finds a new tag, it looks for the CI configuration file for any standard CI service (e.g. Travis CI, Circle CI, GitLab CI) and runs a CI build for that tag (or checks an existing one?), then records the result of that build in http://downloads.arduino.cc/libraries/library_index.json.
Filter checkboxes are added to the Arduino IDE's Library Manager GUI:
- Tested: CI build passed.
- Untested: No CI build was done.
- Test Failed: CI build failed.

The creation of the CI configuration file and which tests will be done on the library will be up to the library maintainer. That could include compile testing, unit testing, and linting. It is not feasible for Arduino to generate tests for 3rd party libraries.
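
As a rough sketch of how the indexer could map a recorded CI outcome to the three filter values above: the `ci_result` field, the `CiStatus` type, and the sample entries are all hypothetical, since library_index.json has no such field today.

```python
from enum import Enum
from typing import Optional


class CiStatus(Enum):
    TESTED = "Tested"
    UNTESTED = "Untested"
    TEST_FAILED = "Test Failed"


def classify_release(ci_result: Optional[str]) -> CiStatus:
    """Map a CI outcome recorded for a tag to a Library Manager filter value.

    `ci_result` is a hypothetical field: "passed", "failed", or None when
    no CI build exists for the tag.
    """
    if ci_result is None:
        return CiStatus.UNTESTED
    if ci_result == "passed":
        return CiStatus.TESTED
    return CiStatus.TEST_FAILED


def filter_libraries(index: list, wanted: CiStatus) -> list:
    """Return the names of index entries whose CI status matches the filter."""
    return [
        entry["name"]
        for entry in index
        if classify_release(entry.get("ci_result")) is wanted
    ]


# Hypothetical index entries, for illustration only.
index = [
    {"name": "Queue", "version": "1.0.0", "ci_result": "passed"},
    {"name": "OneWire", "version": "2.3.4"},
    {"name": "Example", "version": "0.1.0", "ci_result": "failed"},
]
print(filter_libraries(index, CiStatus.TESTED))
```

Keeping "no CI build" distinct from "failed" matters here, because a library without CI is not necessarily broken.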

s-celles commented 6 years ago

As written in https://github.com/arduino/Arduino/issues/6646#issuecomment-392063761, detecting license inconsistencies (i.e. a different license in library.properties and the LICENSE file) could be part of the Arduino library quality assurance process.
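
A minimal sketch of such a consistency check, assuming library.properties carried a `license` key (this thread does not confirm that the format has one) and using a few hypothetical marker phrases in place of a real license detector:

```python
from typing import Optional

# Hypothetical marker phrases; a real check would use a proper license detector.
LICENSE_MARKERS = {
    "MIT": "permission is hereby granted, free of charge",
    "GPL-3.0": "gnu general public license",
    "Apache-2.0": "apache license",
}


def parse_properties(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def detect_license(license_text: str) -> Optional[str]:
    """Return the first license whose marker phrase appears in the text."""
    lowered = license_text.lower()
    for name, marker in LICENSE_MARKERS.items():
        if marker in lowered:
            return name
    return None


def license_is_consistent(properties_text: str, license_text: str) -> bool:
    """True only when a license is declared and matches the detected one."""
    declared = parse_properties(properties_text).get("license")
    detected = detect_license(license_text)
    return declared is not None and declared == detected
```

A missing declaration is treated as inconsistent here, which is a design choice rather than a requirement stated in the thread.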

s-celles commented 5 years ago

Pinging @ianfixes about unit tests / continuous integration for Arduino libraries

ianfixes commented 5 years ago

Thanks for bringing me into this discussion, and I'm somewhat flattered by this attention towards my testing library. I still consider arduino_ci to have an alpha-level feature set, but so far the concept has proven to be fairly solid.

- Unit testing works, period. There's room for the mocks to become more elegant, but there's a lot already supported (e.g. faking a hardware modem via serial). You do not need hardware present to run the unit tests, regardless of architecture.
- Memory management issues can be exposed and annotated (via compiler features): https://github.com/adafruit/Adafruit-WS2801-Library/pull/20.
- Library examples can be targeted to specific architectures via the config file, as an alternative to using preprocessor directives to work around compilation issues.
- Extra architectures (any board that can be installed via the arduino CLI) can be declared and configured via the config file.
- Dependent libraries are installed automatically (although for now they must be declared manually in the config file).

Metadata tests (e.g. your comment about the LICENSE file) don't exist now, but I could certainly integrate them into the ruby side of the test runner. That would either be its own standalone script (similar to arduino_ci_remote.rb, the entry point for CI), or baked into that script and controlled via config.

Should we pick an Arduino library to use as a demo?

per1234 commented 5 years ago

@ianfixes do you agree with my previous statement?:

The creation of the CI configuration file and which tests will be done on the library will be up to the library maintainer. That could include compile testing, unit testing, and linting. It is not feasible for Arduino to generate tests for 3rd party libraries.

Or do you think there is a way to do completely automated unit testing via arduino_ci without any configuration on the part of the library author?

As for testing of metadata and library structure, this is possible to do automatically (and in fact I'm already doing that independently). It's actually even simpler with the libraries in the Library Manager index because we know they're in the repository root and thus don't need to deal with the headache of searching for the library in subfolders. Here is the bash script I use to check metadata and library structure: https://github.com/per1234/arduino-ci-script/blob/1.1.0/arduino-ci-script.sh#L1212-L2326

The Library Manager indexer actually already does a lot of verification of library.properties and some of the library structure (must be in repo root, must not contain executables). Any release that doesn't pass its verification is not added to the Library Manager index.
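
The two structural checks named above (library.properties in the repository root, no executables) could look roughly like this. It is not the indexer's code; the `check_structure` name and the set of extensions treated as executables are assumptions for illustration.

```python
from pathlib import PurePosixPath

# Assumed set of extensions treated as executables, for illustration only.
EXECUTABLE_SUFFIXES = {".exe"}


def check_structure(paths: list) -> list:
    """Return a list of problems found in a repository's file listing.

    `paths` are repository-relative paths using forward slashes.
    """
    problems = []
    if "library.properties" not in paths:
        problems.append("library.properties is not in the repository root")
    for path in paths:
        if PurePosixPath(path).suffix.lower() in EXECUTABLE_SUFFIXES:
            problems.append(f"executable file found: {path}")
    return problems


print(check_structure(["library.properties", "src/Queue.h"]))
```

An empty list means the release passes these two checks; anything else would be grounds for not adding it to the index.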

PaulStoffregen commented 5 years ago

It is not feasible for Arduino to generate tests for 3rd party libraries.

Why not? Anyone can select a widely used board, manually open each of the library's examples and click Verify. Why can't a CI script do the same?

And really, why isn't doing substantially more considered to be feasible? Even a rather elaborate test which runs the libraries on real hardware and compares against a known-good result is the sort of thing you might expect to take 1 person-month to initially create, and then maybe 2-3 hours per library to set up. While there are thousands of libraries, only a few hundred are very widely used. I just don't understand why something like this is assumed to be so far out of reach, when it could be done pretty effectively by a few summer interns working with direction and occasional assistance from a seasoned engineer / developer.

ianfixes commented 5 years ago

It is not feasible for Arduino to generate tests for 3rd party libraries.

For unit tests, I agree completely: the library maintainers would need to take full responsibility for writing tests as well as maintaining any given CI solution as a whole. Period.

For compiling the examples provided by the library, I disagree slightly: anyone can load & compile those (via the GUI or CLI), including a server-side process. That said, such an effort would fail since not every library works for every architecture. You would need some configuration file format to sort that out (I can humbly offer my own YAML format for consideration), and I can't say whether it would be worth it to roll that out and then support it.

The same reasoning would apply to adopting a system like NPM modules or RubyGems -- formats that include a metadata field for the git repo URL. Although this could enable querying a given commit to see test status (e.g. on GitHub), this would be impractical. You'd need to correlate the library with a repository, and the library version with a specific commit, and it would rely on a developer base that is not primarily software engineers to spend time maintaining it.

But if you'll allow me to take a step back... I took the spirit of this issue to be "enable automated testing" -- that's a prerequisite, no matter who performs the QA¹. A few things currently block such testing, which I had to work around or simply brute-force my way through as I built my library. I would love to work with you to make a more official solution to these.

I guess my question for you would be: What impact would it have on the Arduino ecosystem if we could give library maintainers more confidence in accepting code contributions (i.e. if they could easily set up pull-request testing on GitHub, to automatically verify those contributions)?

¹ minor nitpick @scls19fr : it's "Quality Assurance", not "Insurance"; not sure if that will affect the searchability of this issue

ianfixes commented 5 years ago

@PaulStoffregen

Even a rather elaborate test which runs the libraries on real hardware and compares against a known-good result is the sort of thing you might expect to take 1 person-month to initially create, and then maybe 2-3 hours per library to set up

See my Arduino CI library: https://github.com/ianfixes/arduino_ci

It mocks all aspects of the hardware for unit testing (including clock), does not require hardware to run (i.e. can be run on Travis or Appveyor CI), and takes roughly 0.5-2 hours to set up on any given library.

That setup and maintenance should fall to the individual library maintainers, not the Arduino team.

s-celles commented 5 years ago

Let's hope @ladyada @sdesalas @bguest (and others) will merge your PRs @ianfixes... because you did very important work on this task... but a collective effort is required to have better libraries for Arduino (i.e. at least compiled at each commit and also tested).

I understand perfectly that the lack of answers on some quite old PRs is very frustrating. Without this collective move toward quality assurance (and not quality insurance, sorry for my mistake) I fear that all this is "work for nothing", which is quite discouraging... especially when it's volunteer work.

However, it is the responsibility of the Arduino community to highlight well-tested libraries.

PS : https://github.com/SMFSW/Queue is another Queue library for Arduino from @SMFSW; maybe he could try your CI script (https://github.com/SMFSW/Queue/issues/4 )

ianfixes commented 5 years ago

Submitted https://github.com/SMFSW/Queue/pull/9/files

per1234 commented 5 years ago

Why not? Anyone can select a widely used board, manually open each of the library's examples and click Verify. Why can't a CI script do the same?

Many libraries/examples are written for a specific board (e.g. Leonardo for HID, Mega for the memory). It's not enough to just know the architecture. How do you distinguish a failed test due to compiling for the wrong board vs. a failed test because there's a bug? The library author would need to provide metadata of some sort to the testing system.

Many libraries have dependencies on other libraries. Many libraries have dependencies on a specific version of another library (ArduinoJson 5 vs. 6 is most prominent right now). The library author would need to provide metadata for library dependencies.

You would need some configuration file format to sort that out

Exactly my point. It seems to me that the only way compile testing is going to work is if the library author takes the initiative to configure it. We already have well established and widely used systems for this (e.g. Travis CI, Circle CI). There's no need to reinvent the wheel. This is why I think the best solution for compilation/unit testing would be to simply use the repository's existing CI build. The library author has full discretion as to how that is set up and which tests are run.

I'm not trying to argue against this proposal. I think it's great. I'm just trying to narrow this down to something that's actually feasible to implement, rather than some head in the clouds "wouldn't that be cool" feature request that sits open in the issue tracker forever.

It is feasible for Arduino to do automated checking of metadata and library structure.

For unit tests, I agree completely: the library maintainers would need to take full responsibility for writing tests as well as maintaining any given CI solution as a whole. Period.

I assumed that would be the case but thought I should verify that with you, the Arduino unit test expert. I already knew that it was not possible to do completely automated compile testing without any configuration from the library author.

I took the spirit of this issue to be "enable automated testing" -- that's a prerequisite, no matter who performs the QA¹.

By that do you mean your arduino_ci project, or some changes to this repository (or more likely the Library Manager indexer script)?

I would love to work with you to make a more official solution to these.

I am very grateful for your work on arduino_ci and I would hope to be able to contribute to the project in some way. To be honest, I've never gotten to the point of even running a unit test with arduino_ci. It's definitely high on my "to-do" list. For now I've been watching your repository in hopes I'll become more familiar with the work via osmosis.

I guess my question for you would be: What impact would it have on the Arduino ecosystem if we could give library maintainers more confidence in accepting code contributions (i.e. if they could easily set up pull-request testing on GitHub, to automatically verify those contributions)?

Of course anything done in that direction is an extremely valuable contribution to the Arduino project. However, let's try to keep this discussion focused on what can be done on Arduino's end, rather than getting into general QA.

ladyada commented 5 years ago

hihi @scls19fr - we manage our own CI script here https://github.com/adafruit/travis-ci-arduino it works well, runs a wide variety of boards, has caching thanks to some contributions and also runs our doxygen generation. we don't know ruby so we'll probably stick to the script we've been using for a year :)

s-celles commented 5 years ago

@ladyada I think https://github.com/adafruit/travis-ci-arduino only ensures that the code can still be compiled... not that it's working as expected (no unit tests). Maybe that's enough for most of your libraries, but it's not sufficient testing for many libraries.

@per1234 is right about library dependencies... so I'm pretty sure that the question of using a dependency manager should be opened.

As we don't want to reinvent the wheel we should probably consider using one instead of building one.

Googling, I've found:

Reading https://www.reddit.com/r/cpp/comments/8t0ufu/what_is_a_good_package_manager_for_c/ can help, but there are probably people in the Arduino community with experience using such tools (I don't consider myself experienced enough with them).

I just don't really know why we couldn't simply have a YAML file with dependency names and version number restrictions (using semver versioning).
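
For illustration, checking an installed version against a restriction such as ">=5.0.0,<6.0.0" takes only a few lines. This sketch assumes plain MAJOR.MINOR.PATCH versions with no pre-release tags; the `requirements` mapping stands in for whatever a YAML file would load to, and none of it is an existing Arduino format.

```python
import re


def parse_version(text: str) -> tuple:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release tags)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return tuple(int(part) for part in match.groups())


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against comma-separated clauses like '>=5.0.0,<6.0.0'."""
    current = parse_version(version)
    for clause in constraint.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*(\S+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        operator, bound = match.group(1), parse_version(match.group(2))
        ok = {
            ">=": current >= bound,
            "<=": current <= bound,
            "==": current == bound,
            ">": current > bound,
            "<": current < bound,
        }[operator]
        if not ok:
            return False
    return True


def unmet(requirements: dict, installed: dict) -> list:
    """Return names of dependencies that are missing or out of range."""
    return sorted(
        name
        for name, constraint in requirements.items()
        if name not in installed or not satisfies(installed[name], constraint)
    )


# Hypothetical data: a library that needs ArduinoJson 5.x, with 6.x installed.
requirements = {"ArduinoJson": ">=5.0.0,<6.0.0"}
installed = {"ArduinoJson": "6.10.1"}
print(unmet(requirements, installed))
```

The ArduinoJson 5 vs. 6 case mentioned earlier in the thread is exactly the kind of mismatch a range like this would catch.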

ladyada commented 5 years ago

@scls19fr hi that is correct, we catch non-compiling code across a wide range of build platforms - it is not possible to do full unit tests on our libraries because they all depend on hardware. most arduino libraries do. to do unit tests would require multiple arduino boards sitting somewhere with hardware attached. that's not feasible.

s-celles commented 5 years ago

Mocking can help.

Since the code depends on hardware, we can't expect 100% code coverage from unit tests (and we currently don't have tools to measure code coverage for Arduino libraries).

We can only expect some functions to be tested (but not all functions of a lib).

It could only be better than the current situation.

per1234 commented 5 years ago

I'm pretty sure that the question of using a dependency manager should be opened.

It already has:

I just don't really know why we couldn't have simply a YAML file with dependencies names and version number restriction (using semver versioning).

Because it's a ton of work and it's reinventing the wheel. What you're talking about is already done in .travis.yml for hundreds of Arduino libraries. That has the huge extra benefit of also providing CI.

Mocking can help.

@scls19fr you're getting this thread off topic. Let's restrict this discussion to what can be done on Arduino's end, rather than getting into general QA. If you want to have that discussion, we can do it on the Arduino forum or on Adafruit's repo.

s-celles commented 5 years ago

Pinging @cmaglie as he has been working on dependency management in https://github.com/arduino/Arduino/pull/6004

ianfixes commented 5 years ago

it is not possible to do full unit tests on our libraries because they all depend on hardware

@ladyada I'd amend that to "was not possible". I'd be glad to move the discussion on that to this Adafruit FONA PR, where the unit tests I wrote are already running (via Travis). For sketches though, I agree with you.

let's try to keep this discussion focused on what can be done on Arduino's end, rather than getting into general QA

Agree completely, the scope and @-mentioning here seem a bit broad to me. Hopefully I can focus this down a lot by just describing my current approach to automation. I'll bold the concrete things that the Arduino project could do to make this process more robust. And you've got to know in advance how sorry I am for how long this will be.

Enabling automated compilation tests (on sketches)

Before you can even start, you have to figure out what examples exist and what architectures the user might want to compile them on. A structured configuration file format is needed to handle this; I chose YAML for mine (since Travis CI uses that) and laid out a basic structure for it. The structure must be able to specify platforms (board & package, board manager URLs, etc) that can be downloaded and installed via the IDE.
Whatever boards & packages are needed for testing must be downloaded. It would be nice if I could easily query the IDE for which ones are already installed; currently, I must blindly attempt installation and then parse the output of any failures so that I can guess whether they were due to some actual problem or failed because the board was already installed. That's frustrating. JSON output for this would be awesome; it's structured, parseable, and if more fields get added to the output in the future, it won't break anything that reads it.
You need to install any dependent libraries manually -- there's no way to grab them recursively like a ruby gem or NPM module. That's probably for the best (at the moment) because the nightmare scenario here is having to fix a bug in one of the dependencies while you're adding a feature in your own library. Auto-downloading libraries by name only would eliminate the ability to work on a beta version of a dependent library.
You need to iterate over combinations of examples and desired target architectures, which generally means at least one switch of the board that is in use. Switching boards takes several seconds, so it would be nice if I could check which board was currently in-use before attempting to switch to it. Again, JSON output for the state of the IDE would be awesome. If the delay involved in switching boards could be eliminated, I'd rather specify the board and example in one single command.
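
To make the two wishes above concrete, here is a sketch of what a test runner could do if the installed platforms were available as JSON, and of ordering compile jobs so that each board is selected only once. The JSON shape is hypothetical (it is not actual output of the IDE or of any CLI), and the function names are made up for this example.

```python
import json
from itertools import groupby

# Hypothetical report of installed platforms; the "id" key is an assumption.
installed_json = '[{"id": "arduino:avr"}, {"id": "arduino:samd"}]'


def missing_platforms(required: list, installed_report: str) -> list:
    """Return required platforms that the JSON report does not list."""
    installed = {entry["id"] for entry in json.loads(installed_report)}
    return [platform for platform in required if platform not in installed]


def plan_builds(jobs: list) -> list:
    """Group (board, example) jobs by board so each board is selected once."""
    plan = []
    ordered = sorted(jobs, key=lambda job: job[0])  # stable: keeps example order
    for board, group in groupby(ordered, key=lambda job: job[0]):
        plan.append((board, [example for _, example in group]))
    return plan


# Example inputs; board and example names are illustrative.
jobs = [
    ("arduino:avr:uno", "Blink"),
    ("arduino:avr:leonardo", "Keyboard"),
    ("arduino:avr:uno", "Fade"),
]
print(missing_platforms(["arduino:avr", "esp8266:esp8266"], installed_json))
for board, examples in plan_builds(jobs):
    print(board, examples)
```

With structured output there is nothing to guess from error text: a platform is either in the report or it needs installing.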

Enabling automated unit tests (on libraries)

Preambles:

I take as a given that the unit test must run on a CI machine; this means the IDE's gcc-avr compiler isn't going to work.

As I said above, I don't think unit tests on sketches are happening anytime soon. You'd need a fully-deterministic, fully-scriptable way to emulate each Arduino board and a way to coordinate your compiled code with the real-world inputs you want to fake. And even then, it's unclear to me how things like interrupt handling would interact with the concept of a linear sequence of unit tests. (If you've got that in the works, you have my great respect and admiration.)

So I set out to do the next best thing... which is also difficult. Spoiler alert, we know in advance that we will have to mock the hardware.

Again, I have to figure out what platforms the user might want to test the library with. Since we won't have hardware, this is more about being able to define the proper constants via compiler flags. That means I need to be able to look up which constants must be defined for each target architecture (things like what registers are available), so that any #ifdef directives act as expected.
I won't have access to anything that the microcontroller does, so I need an implementation of all the built-in AVR functions like the ones for math and strings (and streams, etc etc). They need to be drop-in replacements such that we fake out the compiler. This is a huge pain because you can't just import the standard library without breaking a ton of stuff like, well, math and strings. Ultimately I just wrote my own implementations or adapted academic examples.
I won't have access to any of the constants that Arduino itself provides (e.g. HIGH, LOW, and a whole bunch of registers). I need a large amount of what's in the arduino libc headers directory that ships with Arduino. But I had to hand-edit a bunch of it to overcome some compiler errors. That felt a bit soul-crushing and I dread to think of how (as a 3rd-party) I'd keep that up to date.
I won't have access to any of the hardware-based functions that Arduino itself provides (e.g. digitalWrite(), tone(), millis()). That's not the end of the world, because all built-in functions need to be made into mocks. Oh, and those mocks need to be able to queue up incoming data and remember outgoing data just in case your library does a lot of reads or writes all at once. I used a construct called GODMODE to do this, and it's still missing a few functions that I'm lazy about adding.
You need to install any dependent libraries. Just like above. But it gets worse.
You need to figure out what to tell the compiler to actually compile. I do a brute-force of directories to include and make a very clumsy attempt to specify all the files that I think need to be compiled. It would be better if I could just ask the IDE what files it would compile and in what order they should be specified (dependent libraries and all) when I need to compile the unit tests
Oh yeah, you need a unit testing framework. I adopted most of what mmurdoch/arduinounit has; his work is an excellent use of macros. That needs to get pulled into the compilation along with all your hardware mocks and AVR replacements.
And of course, you need to write the unit tests themselves. This means you need some system or convention that says where the unit tests can be found. I assume mine will live in a directory called test/ inside the library, and I've suggested in a few places that the Arduino 1.5 library spec be amended to include a whitelist of a tests directory.
Because each group of unit tests effectively has to exist as a compiled executable, an overarching test runner is needed to handle all this iteration and installation. I used ruby to write mine, but it's 100% inspired by Adafruit's travis-ci-arduino.
g++ doesn't come with the IDE, so I attempt to use a g++ that's available on the target system (gcc on Linux / Cygwin, OSX's clang). This leads to some version hell, so I'd love it if the IDE could take responsibility for finding me a compiler and feeding it the right flags for the local g++ candidate.
The library itself needs to be copied to the Arduino project directory, or symlinked, if it's not there already. That limitation is annoying but not the end of the world... it's just quirky.
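
The "queue up incoming data and remember outgoing data" idea from the mock item above can be illustrated in a language-neutral way. This Python sketch is not the GODMODE API; `MockPins` and `count_pulses` are made-up names, and returning LOW when nothing is queued is an assumed default.

```python
from collections import deque


class MockPins:
    """Test double for pin I/O: reads come from a queue, writes are recorded."""

    def __init__(self):
        self._pending_reads = {}  # pin -> deque of values to return
        self.write_log = []       # (pin, value) tuples in call order

    def queue_reads(self, pin, values):
        """Schedule the values that successive reads of `pin` will return."""
        self._pending_reads.setdefault(pin, deque()).extend(values)

    def digital_read(self, pin):
        queue = self._pending_reads.get(pin)
        if not queue:
            return 0  # assumed default: LOW when nothing is queued
        return queue.popleft()

    def digital_write(self, pin, value):
        self.write_log.append((pin, value))


def count_pulses(pins, input_pin, samples, led_pin):
    """Example code under test: count LOW->HIGH edges, mirror input to an LED."""
    edges, previous = 0, 0
    for _ in range(samples):
        level = pins.digital_read(input_pin)
        if level == 1 and previous == 0:
            edges += 1
        pins.digital_write(led_pin, level)
        previous = level
    return edges


pins = MockPins()
pins.queue_reads(2, [0, 1, 1, 0, 1])
print(count_pulses(pins, 2, 5, 13), pins.write_log)
```

Because the whole input sequence is queued before the code under test runs, a library that does many reads in one call can still be exercised deterministically.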

PaulStoffregen commented 5 years ago

Can someone explain to me why using real hardware is considered so infeasible, but all this stuff is considered feasible?

I mean, this is Arduino we're hypothetically talking about. A capable CI server machine, 2 or 3 of every Arduino product, a selection of popular shields, a pile of USB hubs with power-switch-per-port, and maybe even test gear like USB logic analyzers just isn't that expensive, maybe a couple thousand dollars?

ianfixes commented 5 years ago

@PaulStoffregen it sounds like you're describing a system where the CI machine would upload a sketch onto a board, and then provide you some scripting capability to send inputs to the I/O pins of the board and verify some expected outputs.

The problem with that type of system is that it can't do unit tests; it can only provide an "end-to-end" or "integration" test capability. Libraries like Queue (mentioned above) don't produce any effects on the I/O pins, they just provide some helpful functions to the developer. (You could go down the path of mapping internal state of a library to some expression on pins, but not all boards have the same pins and you'd be unable to test libraries that use all the pins. So that's a dead end.)

End-to-end testing does interest me (especially via some kind of simulator, which would be more scalable), but this seems like the wrong thread to discuss it (i.e. I don't get the sense that it could be provided by adding code to this repo in particular).

per1234 commented 5 years ago

I think PaulStoffregen has a very interesting idea. The way I see it working is for Arduino to provide this as a "hardware-in-the-loop" testing service. They maintain a room full of every official board, with some way of monitoring output and providing input on every pin. An online API is provided to run the tests and check for the expected results. PIO Remote is somewhat related, but they're not providing hardware, just the infrastructure to interface with hardware remotely. This would require a significant investment on Arduino's part for initial setup plus ongoing maintenance (and I think PaulStoffregen is significantly underestimating it). However, there is a case to be made that, even providing the service for free, it would end up paying off in the end by increasing the quality of the community-written code that is such an important asset to Arduino. Of course they can use the system for their own testing as well.

As with ianfixes' arduino_ci, Adafruit's travis-ci-arduino, or my arduino-ci-script, I see an HIL testing service as being only indirectly related to the feature request made in this issue. It's something library authors would incorporate into their own CI builds. If they wanted to invest even more into the project, Arduino could sponsor someone to submit PRs to the most popular libraries, adding these tests to their CI builds. That could help influence more widespread use of the service.

Can someone explain to me why using real hardware is considered so infeasible

The only thing I consider infeasible is for HIL, unit, or compilation tests to be configured completely automatically. I've already explained why.

PaulStoffregen commented 5 years ago

Ok then, duly noted on the unit tests. Not sure how I would apply that to the couple dozen very old Arduino libraries I maintain (after they were abandoned by their original authors), plus the ones I've authored... but I'll be watching and looking for examples as this develops.

Also planning to start using @ladyada's CI script soon....

adafruit-adabot commented 5 years ago

yay! @PaulStoffregen if you can add teensy as an option, and do a PR, that would allow us to test our libs against teensyduino build, could catch lil compilation errors

ianfixes commented 5 years ago

@PaulStoffregen I'd be interested in taking a crack at one of the old libraries you mention. It would be a good way to double-check the features of my CI library. Which one do you think would give you the most trouble?

PaulStoffregen commented 5 years ago

OneWire would be my first choice. https://github.com/PaulStoffregen/OneWire

s-celles commented 5 years ago

I noticed that arduino-cli is written in Go. I wonder whether acceptance of an integrated Arduino CI tool with unit tests would be higher if it didn't depend on a scripting language such as Ruby.

ianfixes commented 5 years ago

There's no magic to my use of ruby, aside from the fact that it enjoys good cross-platform support and does not require executing arbitrary shell commands that are read from a remote source (e.g. source <(curl -SLs https://raw.githubusercontent.com/mostly_harmless.sh))

That type of installation process makes me very uncomfortable; at worst, it's a security risk; at best, there is no guarantee that two successive runs will use the same version of the script (which is an impediment to reproducible results).
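
One way to address both concerns is to pin a script to a known SHA-256 digest and refuse to run anything else. A minimal sketch, with made-up script content and a `verify_script` helper invented for illustration:

```python
import hashlib
import hmac


def verify_script(content: bytes, expected_sha256: str) -> bool:
    """Return True only if the content hashes to the pinned digest."""
    actual = hashlib.sha256(content).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, expected_sha256.lower())


# Illustrative content; in practice the digest is recorded once, after review,
# and stored alongside the CI configuration.
script = b"echo 'installing'\n"
pinned = hashlib.sha256(script).hexdigest()

print(verify_script(script, pinned))                 # unchanged script
print(verify_script(script + b"# edited\n", pinned)) # modified script
```

A pinned digest makes two successive runs use byte-identical input, which is the reproducibility property that a versioned package provides by other means.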

Aside from providing semver-based packaging and dependency management (which comes with using a ruby gem), all that the ruby code does is handle the interactions with shell commands to run the arduino executable, the compilers, and the unit test binaries (and provide nice-looking output). Any object-oriented language will do.

ianfixes commented 5 years ago

OneWire would be my first choice. https://github.com/PaulStoffregen/OneWire

@PaulStoffregen this work is done (well, 2 months ago now), I'm curious to hear your thoughts https://github.com/PaulStoffregen/OneWire/pull/67

per1234 commented 3 years ago

There is, in my opinion, another small step forward in this effort today.

As I mentioned above, the Arduino Library Manager system already does some basic checks on the library metadata, structure, and contents and rejects any releases that don't meet the requirements. These requirements are focused on the absolute essentials for a library to be compatible with the Library Manager system and the Arduino development software.

There are opportunities for additional universally applicable checks for things that are not absolutely essential at the moment, but that can provide a better experience for users (best practices) and less chance of breakage in the future (specification compliance). Arduino Lint provides rules spanning this entire spectrum. As of today, Arduino Lint is used as the validation system for Arduino Library Manager submissions and indexing releases. This tool provides the same hard requirements as always, which will cause a library to fail validation and be rejected. But it also runs the suite of additional rules on the libraries, treating violations as a warning. These warnings are presented to the library maintainer:

- During the submission process as comments from the bot
- During the indexing process as logs published on a dedicated webpage for each library
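
The split between hard requirements and warnings can be sketched as follows. This is not Arduino Lint's implementation; the `Finding` type, the `evaluate` function, and the rule names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str      # hypothetical rule identifier
    severity: str  # "error" (hard requirement) or "warning" (best practice)
    message: str


def evaluate(findings: list) -> dict:
    """Reject on any error; collect warnings for the maintainer either way."""
    errors = [f for f in findings if f.severity == "error"]
    warnings = [f for f in findings if f.severity == "warning"]
    return {"accepted": not errors, "errors": errors, "warnings": warnings}


# Illustrative findings: two best-practice warnings and no hard failures.
findings = [
    Finding("missing-readme", "warning", "No readme found"),
    Finding("missing-license", "warning", "No license file found"),
]
result = evaluate(findings)
print(result["accepted"], len(result["warnings"]))
```

A release with only warnings is still accepted, so the warnings inform the maintainer without blocking the release.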

My hope is that these warnings will serve to encourage conscientious library maintainers to work beyond the bare minimum requirements and toward best practices.

Even though we are limited in what checks can be done in a completely automated fashion universally, even small improvements across thousands of libraries can be very significant.

ianfixes commented 1 year ago

I'd be interested to hear another round of comments on this now that it's 2023 and a lot of advancements have been made in the arduino-cli backend.

arduino / Arduino

[Library] Automate quality assurance testing of libraries #7567