[software/gpsd] Port improvements from software/co2

abesto commented 2 years ago

A lot of linters (and code changes to make the linters happy)
check.sh to run the linters and report all failures
GH Actions workflow to run check.sh
Poetry for dependency management
Shrink Docker image using multi-stage build (648MB -> 479MB)
- Switched the co2 image to use vim-tiny instead of vim while we're at it, which should save another ~20MB there
.gitignore

A note on the cryptography Python package: this is a transitive dependency of balena-sdk. Recent versions don't support the ENV var override used originally in this Dockerfile.template that avoided building Rust code (this Rust code is used by cryptography in code reimplementing non-cryptographic functions of libssl in a memory-safe manner). Unfortunately cryptography doesn't publish wheels for ARM, which means we kinda need to build from source. This PR adds the build-time dependencies for that (but only to the build container), and does the build of cryptography from source. This takes long, but I don't think there's any alternative (that won't break / won't become vulnerable as an increasingly outdated version of cryptography is used). On the bright side, the Docker cache of Balena (as well as local devenvs) will be hit as long as pyproject.toml and poetry.lock remain unchanged, so we should not have to wait for the build many times.

Also, minor code changes:

Use logging (for timestamps and better exception reporting)
subprocess.run / subprocess.check_call for cleaner subprocess handling

Testing status:

Docker build passes locally
gpsd.py fails locally saying it can't find a GPS device (because I don't have a GPS device)
I did not test the happy path (for lack of hardware), tread with caution

keenanjohnson commented 2 years ago

Thanks for these changes @abesto! I'll deploy to some hardware now to test. I'm planning on building two sensors (a raspberry pi and beaglebone sensor) this week as a proper test fleet.

djgood commented 2 years ago

What's the motivation for ditching the f-strings? Do we really care about supporting older Python versions?

Also, it's fairly straightforward (to me!) but helping out new folks by adding a quick section to the README.md somewhere for running the linter script might be helpful. Faster feedback than seeing it fail in the PR. Along the same lines, some quick notes on how to modify the Poetry dependencies might also be useful for future maintainers.

One more thing, can we keep a single check.sh for both files? Maybe make a new software/util directory and store it there? Easier to maintain one script than one for each subproject in my opinion.

abesto commented 2 years ago

What's the motivation for ditching the f-strings?

That we don't serialize things to strings unless the given loglevel is enabled. FWIW, pylint complains about f-strings inside logging functions for this reason.

README.md

Good call! I'll amend the PR tomorrow.

One more thing, can we keep a single check.sh for both files?

That sounds sensible, though I'm worried that this introduces coupling that may be broken later. What if one of the modules gets unit tests, but not the other? Two instances seems not too much. Of course they can always be separated later if needed...

keenanjohnson commented 2 years ago

Thanks for the help reviewing @djgood ! @abesto sorry I was a bit ill the past few days, but I'll take a look at this after your changes tomorrow.

djgood commented 2 years ago

That we don't serialize things to strings unless the given loglevel is enabled. FWIW, pylint complains about f-strings inside logging functions for this reason.

I hadn't considered the performance implications, I know f-strings are pretty quick but it's true that the code is wasted if the logging level isn't enabled. Interesting Github issue on the subject: https://github.com/PyCQA/pylint/issues/2395. Basically showing only a few percent difference in performance for interpolating strings in unused log levels. And there's an easy way to disable the f-string warning.

Personally, I think the f-string format is generally more readable and since we aren't logging too often (most of these logs run on startup of the code) I don't think the performance increase is worth it. But I'll definitely admit it's an opinion based thing so maybe we can leave the decision up to @keenanjohnson.

That sounds sensible, though I'm worried that this introduces coupling that may be broken later. What if one of the modules gets unit tests, but not the other?

That's a good point. To get around that problem check.sh could check if the tests/ (for example) directory exists. I think the coupling might actual be handy so that each subproject is structured in a similar way. This is another one of those somewhat opinionated things that I will also leave up to @keenanjohnson.

Otherwise, I forgot to say: Nice work! Thanks for your contributions. Code lint checks will be super helpful for keeping our code pretty!

abesto commented 2 years ago

f-strings

Yeah it's one of those things that's not really worth thinking about, except maybe to be consistent (unless there's logging in hot code paths, which this isn't). I tend to default to not using them in logging functions, but no strong opinion either way.

check.sh

Agree there's value in enforcing uniform structure across subprojects. I don't know that this is the right tool for it, but again, no strong opinion here.

Otherwise, I forgot to say: Nice work

<3 :)

Code lint checks will be super helpful for keeping our code pretty!

And hopefully, more correct as well! (Typesheds for those libraries would be nice, but I'm not about to go down that particular rabbit hole).

keenanjohnson commented 2 years ago

This all looks good to me! @djgood do you feel good here?

Thanks @abesto for all the work!

djgood commented 2 years ago

@keenanjohnson Yup good with me!

keenanjohnson commented 2 years ago

Hey @abesto! Running the linter manually on the main branch post-merge produced a few failures: https://github.com/Ribbit-Network/ribbit-network-frog-sensor/runs/4875490534?check_suite_focus=true

keenanjohnson commented 2 years ago

Seems like an issue with mypy, any idea on the best practice for how to fix that?

abesto commented 2 years ago

Huh, weird that co2 would start failing. Also, the error seems weird: I checked that we're referencing the right gpsd module here. The functions do seem to exist: https://github.com/MartijnBraam/gpsd-py3/blob/0.3.0/gpsd/__init__.py#L246. Lemme try to reproduce this locally.

abesto commented 2 years ago

Lol, OK so what's happening is: I added an __init__.py to our gpsd folder to tell Python tools that We're a Proper Python Project. Unfortunately this means that when co2 tries to look up the gpsd module, then instead of the one installed from the gpsd-py3 package, it finds software/gpsd as it walks up the directory tree. So indeed mypy is correct, and co2 is currently broken when run from the source tree including gpsd because of this (which shouldn't be an issue in production since we ship them to separate containers).

Ideally we'd rename software/gpsd to something more descriptive and avoid the name collision that way, but for a quick fix I think it should probably be safe to remove __init__.py from gpsd (and possibly co2 as well for consistency?)

keenanjohnson commented 2 years ago

Interesting and thanks for the catch. I think it's fine to rename the folder to just GPS for now.

Resolved in https://github.com/Ribbit-Network/ribbit-network-frog-sensor/pull/100

Ribbit-Network / ribbit-network-frog-hardware

[software/gpsd] Port improvements from software/co2 #99