NebraLtd / hm-diag

Helium Miner Diagnostics
https://nebra.io/hnt
MIT License

Frequency check to check for device tag as well as FREQ env var #444

Open shawaj opened 1 year ago

shawaj commented 1 year ago

Having spoken to balena, they have informed us that there is no easy way to add device variables in bulk - either to the preloaded image or via the dashboard.

They have suggested we use tags instead:

Hi,

Thanks for the precision. May I ask if you could consider using tags instead of environment variables for this tracking? If you enable the io.balena.features.supervisor-api label in your container, you can access the list of tags from your device using the Supervisor API (for instance, with curl):

curl "$BALENA_SUPERVISOR_ADDRESS/v2/device/tags?apikey=$BALENA_SUPERVISOR_API_KEY"

The main advantage here is that tags are easy to manage in bulk from the dashboard and do not restart the device when updated. You will also be able to filter by tag name or value from the dashboard which seems more appropriate for your use-case.

From what I know, there is no easy way to generically add tags or environment variables to a preloaded device image (i.e. an image you can write to multiple devices at once). You can, however, pre-register each device and generate a specifically configured image for each of them. With this process you can tag the device during registration. E.g.:

# configuration
FLEET=YOUR_FLEET_NAME
UUID=$(openssl rand -hex 16)
VERSION=2.107.6
IMAGE=your_base_os_image.img
TAG=YourTrackingTag

# Register the device (variables use "$VAR"; "$(VAR)" would run VAR as a command in bash)
balena device register "$FLEET" --uuid "$UUID"
# Set its tag
balena tag set "$TAG" --device "$UUID"

# Generate the configuration file
balena config generate --version "$VERSION" --device "$UUID" --output config.json

# Inject the configuration file into the image
balena os configure "$IMAGE" --config config.json --device "$UUID"

## Your image should be ready to be written to the device

You can still add a set -e if you run this script in bash for extra safety; it will make the whole script fail if any error occurs in the process (e.g. if the generated UUID already exists).

If tags are not an option, the same solution can be used to set Environment variables on device at provisioning.

Let us know if these options are suitable for your needs,

Thanks,

Aurélien

I then replied with:

Hi again,

Currently we just pull the FREQ environment variable - https://github.com/NebraLtd/hm-diag/blob/a467c93840d9deabccf1bb5f3451219eca0fd448/hw_diag/utilities/shell.py#L14

And then we use this key in various places, for example here to show the frequency on a web page: https://github.com/NebraLtd/hm-diag/blob/48a8276313e0c2ce4027e47dd6aff268a71dd2fc/hw_diag/templates/diagnostics_page_light_miner.html#L95

Using your method of:

curl "$BALENA_SUPERVISOR_ADDRESS/v2/device/tags?apikey=$BALENA_SUPERVISOR_API_KEY"

What would this output? And what would be the best way to translate this to our code?

@NebraLtd/developers what are your thoughts on this?

An alternative / additional check could be reading the FREQ key out of nebra.json file? @kashifpk does this exist now?
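
For reference, a minimal sketch of how the tag lookup could sit alongside the existing FREQ environment variable check. It assumes the Supervisor API answers with a JSON object containing a `tags` list of `{"id", "name", "value"}` entries, and that the frequency is stored under a tag named `FREQ`; both are assumptions to verify against a real device, and the function names are hypothetical rather than anything that exists in hm-diag today.

```python
import json
import os
import urllib.request


def freq_from_tags(payload, tag_name="FREQ"):
    """Return the value of tag_name from a tags payload, or None if absent."""
    for tag in payload.get("tags", []):
        if tag.get("name") == tag_name:
            return str(tag.get("value"))
    return None


def fetch_device_tags(timeout=5):
    """Ask the Supervisor API for this device's tags.

    Requires the io.balena.features.supervisor-api label so that both
    environment variables below are injected into the container.
    """
    address = os.environ["BALENA_SUPERVISOR_ADDRESS"]
    api_key = os.environ["BALENA_SUPERVISOR_API_KEY"]
    url = f"{address}/v2/device/tags?apikey={api_key}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def get_frequency():
    """Prefer the FREQ env var; fall back to the device tag if it is unset."""
    freq = os.environ.get("FREQ")
    if freq:
        return freq
    try:
        return freq_from_tags(fetch_device_tags())
    except (KeyError, OSError, ValueError):
        # Supervisor API not enabled or unreachable, or an unexpected response.
        return None
```

Keeping the payload parsing in `freq_from_tags` separate from the HTTP call means the lookup can be unit tested without a device.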

shawaj commented 1 year ago

@KevinWassermann94 this one is in relation to merging 470, 868 and 915 units into a single fleet

KevinWassermann94 commented 1 year ago

@shawaj I'll discuss this with the team tomorrow @kashifpk @MuratUrsavas @pritamghanghas Please look into this for grooming tomorrow

shawaj commented 1 year ago

FYI I've already bulk tagged all the fleets with tags of their frequency.

We could also run a script to add the environment variables individually to each device on a device variable level.

Or maybe do both as a "belt and braces" type of approach

KevinWassermann94 commented 1 year ago

@MuratUrsavas Will add more details in a moment, but we figured tags are not an option for us. They require pre-registering devices, which would completely break our manufacturing process, as it would create individual images for every single unit, which is unacceptable.

We have a potential solution that would add the env var to the docker compose file. Murat will add more details

shawaj commented 1 year ago

Interested to hear the docker compose idea. But don't see how that will work when the devices are all in a single fleet.

The other thing could be to just start them in manufacturing in a frequency based fleet and then tag them and move them to the main fleet automatically.

The other thought I had was using the nebra.json and just pulling the frequency into the environment by reading that using a helper method in hm-pyhelper?

KevinWassermann94 commented 1 year ago

Iirc we plan to use an env var in the docker compose. We only need the frequency at manufacturing and don't need it to be persistent. In theory we could even use the dashboard to call the Balena API to set the env var on device level

If we had them in separate fleets during manufacturing, those would probably need to be OpenFleets as well. That wouldn't really help us, as in a standard fleet they would be counted for billing afaik

We are unsure about the state of nebra.json and there were some complications in actually injecting the frequency into it. It's definitely merged, but there were some complexities

shawaj commented 1 year ago

It would be useful to have the frequency persist if possible. But presumably with the docker-compose way, it would persist in our dashboard for our own devices?

If this is the case, I think that's sufficient. As you say, we can always tag in Balena as well if we want to, using that data

MuratUrsavas commented 1 year ago

For a moment I thought that Docker Compose was a good idea, but the main problem is the updates. For manufacturing, we can create an individual file per frequency variant and have each device carry it. But after putting them in a single fleet and updating it, what will the docker compose carry? It will eventually update all the devices with the same info, hence that's not an option.

I'm leaning towards a simple file, like /var/variant_freq. It's like the /var/pktfwd/region file and will be persistent. We'll be injecting it into production images (without disturbing any of our production workflows) and that's it. All of our docker containers will read that file and create the FREQ env var. No other change will be necessary to our software stack, because all of them are looking for a FREQ environment variable.

How about that @shawaj @KevinWassermann94?

MuratUrsavas commented 1 year ago

Of course we have to remove all of the Balena fleet and device references, because I don't know which one has the priority. Probably would be the docker one, but it's better to get rid of them all at once.

MuratUrsavas commented 1 year ago

And I've found a way to do that. :+1:

KevinWassermann94 commented 1 year ago

Sounds good to me!

shawaj commented 1 year ago

So we will just inject a different file per frequency?

MuratUrsavas commented 1 year ago

Yep, as the production images already carry that information. We'll have a file, like /var/prod/freq and it will contain just the frequency like 868 or 915.

shawaj commented 1 year ago

So essentially what we were trying to do with nebra.json, just a little bit simpler?

Nice 👌

MuratUrsavas commented 1 year ago

Yep. Not even trying to put some fancy curly braces :smile:

kashifpk commented 1 year ago

@MuratUrsavas - I'd recommend that we leave the option open to use this file for more than just frequency in the future. Perhaps we call it something like nebra.env? So in future it might have more env variables?

@KevinWassermann94 @shawaj - Frequency also gets stored with the device at dashboard during registration. So even if we lose it from the device we have a way to read it from dashboard and set as device variable if we want. Perhaps upon registration we can create a celery task to set the frequency using balena API as a device variable?

MuratUrsavas commented 1 year ago

I'm seeing a pattern here :smile:

Let's make that an ini file. => Hey why not a YAML? => Let's make it JSON as it's better. => nebra.json rises again from its ashes :D

MuratUrsavas commented 1 year ago

I'd recommend keeping it simple, and making it more complex only if that's ever needed.

KevinWassermann94 commented 1 year ago

@kashifpk I agree, we might as well set an env var in Balena during registration in the manufacturing process

MuratUrsavas commented 1 year ago

And I've found a way to keep it simple, yet flexible, as @kashifpk wants.

Will create a var/nebra/env folder and put files in it. In this case the file will be FREQ and will contain a value like 868.

Yea, you got it right. More files, more environment variables. The file name is the ENV variable, the contents are the value. This way we can do simple operations even in bash.
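
A rough sketch of how a container might consume that one-file-per-variable layout, assuming each file holds a single value with optional trailing whitespace. The function name `load_env_dir` is hypothetical and the directory path is passed in by the caller rather than fixed here.

```python
import os
from pathlib import Path


def load_env_dir(directory, environ=None):
    """Set one variable per file: the file name is the key, the contents the value."""
    environ = os.environ if environ is None else environ
    loaded = {}
    root = Path(directory)
    if not root.is_dir():
        # Nothing was injected into this image; leave the environment untouched.
        return loaded
    for entry in sorted(root.iterdir()):
        if entry.is_file():
            loaded[entry.name] = entry.read_text().strip()
    environ.update(loaded)
    return loaded
```

Passing a plain dict as `environ` makes it possible to inspect the result without modifying the real process environment.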

kashifpk commented 1 year ago

@MuratUrsavas - if we can keep it a single file with multiple env variables (only one called FREQ for now) that would be much more compatible with existing libraries.

File contents for: /var/nebra/env

FREQ=868

as these can be readily read by most languages. For hm-diag, Python's dotenv can just read this and set the values as env variables.
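
As an illustration, here is a minimal standard-library sketch of reading such a KEY=value file. The python-dotenv package could do the same job; the names `parse_env_text` and `load_env_file` and the `override` flag are hypothetical, and the quoting rules handled here are deliberately simpler than what a shell or dotenv accepts.

```python
import os


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes so FREQ="868" and FREQ=868 agree.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path="/var/nebra/env", environ=None, override=False):
    """Copy variables from the file into environ and return what was parsed.

    With override=False, a variable that is already set (for example by
    Balena) wins over the value stored in the file.
    """
    environ = os.environ if environ is None else environ
    try:
        with open(path, encoding="utf-8") as handle:
            values = parse_env_text(handle.read())
    except FileNotFoundError:
        return {}
    for key, value in values.items():
        if override or key not in environ:
            environ[key] = value
    return values
```

The `override` flag is there because the thread leaves open which source should take priority; whichever way that is decided, the choice stays in one place.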

MuratUrsavas commented 1 year ago

I was thinking in plain bash, without needing any kind of language, including Python. If that need arises later, we can use var/nebra/json and do fancy stuff. What do you think?

MuratUrsavas commented 1 year ago

Even integrate current nebra.json into that file.

kashifpk commented 1 year ago

The key=value format works for plain bash too :-)

source /var/nebra/env

MuratUrsavas commented 1 year ago

Yep, I'm convinced :white_flag: :smile: Will do that so.

shawaj commented 1 year ago

@MuratUrsavas @kashifpk nice. This is really handy and allows us to persist the env variable in both Balena and dashboard.

Perfect 🥰

shawaj commented 1 year ago

@MuratUrsavas @kashifpk for existing devices, if I merge into a single fleet, will this cause any issues right now? Or should we wait until the celery task thing is implemented?

They already have tags in Balena but not device environment variables

MuratUrsavas commented 1 year ago

@shawaj Please don't move devices into a single fleet right now. The problem is not FREQ, it's VARIANT. Without it, the devices will start to fail. So we need two things to make this move happen:

  1. The built-in ENV variable coming from the production image has to be implemented (this issue)
  2. A way to create a per-device ENV variable on Balena if that info doesn't exist (a follow-up issue)

The first one will make all devices work with the correct setup in a single fleet. The second one will make sure a device still works after a purge.

shawaj commented 1 year ago

@MuratUrsavas sorry for confusion on my side.

I didn't mean a single fleet. I meant:

So only thing removed would be the frequency

MuratUrsavas commented 1 year ago

@shawaj No worries :smile:

It's OK then, but do we really need it? Because every fleet move has a device loss potential, and the rate is not negligible. Maybe we should do that after implementing those two things?

Otherwise, it's OK. I'm not expecting a functionality loss.

shawaj commented 1 year ago

It was more for ease of support for the open fleets. As then it's not frequency specific and they can just download a new SD card image as and when they need

kashifpk commented 1 year ago

> @MuratUrsavas @kashifpk for existing devices, if I merge into a single fleet, will this cause any issues right now? Or should we wait until the celery task thing is implemented?

We should wait at the very least for a task to set device environment variables based on the frequency we have in the database. We can then run that task after moving to the new fleet. So if there are issues during moving to the new fleet we can at least set the env variables afterwards.
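
The core of such a task might look like the sketch below. The data shapes and the `set_device_var` callback are hypothetical stand-ins for the dashboard database and whichever Balena API client ends up being used; no real SDK call is shown here.

```python
def plan_freq_updates(db_frequencies, device_vars):
    """Return {uuid: freq} for devices whose FREQ variable is missing or stale.

    db_frequencies: {uuid: frequency} as stored at registration (assumed shape).
    device_vars: {uuid: {name: value}} of current device variables (assumed shape).
    """
    updates = {}
    for uuid, freq in db_frequencies.items():
        current = device_vars.get(uuid, {}).get("FREQ")
        if freq is not None and str(freq) != current:
            updates[uuid] = str(freq)
    return updates


def apply_updates(updates, set_device_var):
    """Call set_device_var(uuid, name, value) for every planned update."""
    for uuid, freq in sorted(updates.items()):
        set_device_var(uuid, "FREQ", freq)
    return len(updates)
```

Splitting the planning from the applying means the task can be dry-run after a fleet move to see which devices would be touched before anything is written.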

shawaj commented 1 year ago

Good idea @kashifpk makes sense to me

MuratUrsavas commented 1 year ago

I have good news. I was able to put some information in the data partition of a new and never-booted image (like our production images) and track it through the first run and partition expansion. The data stays intact and outlives an OS update (tested). So we're good to put anything we want there.

Just need to find a good place to mount it as a named volume in our containers.

I'm also planning to put nebra.json in here, as it will get lost with an OS update in its current location.

shawaj commented 1 year ago

@MuratUrsavas so this is done by creating a new volume? or it uses some pre-existing balena volume?

Sounds great though!

MuratUrsavas commented 1 year ago

@shawaj We'll be creating a new named volume folder in the exact location of our other named volumes. That'll be automatically accessible by our containers.

It would be a little bit easier with a bind mount, but Balena doesn't allow bind mounts. That's why we'll be doing that with named volumes, even though it adds a little bit of complexity.

I was going to finish this task today, but Balena-Staging started acting weird and now refuses to build new updates. Need to wait for the start of the week.

shawaj commented 1 year ago

> @shawaj We'll be creating a new named volume folder in the exact location of our other named volumes. That'll be automatically accessible by our containers.
>
> It would be a little bit easier with a bind mount, but Balena doesn't allow bind mounts. That's why we'll be doing that with named volumes, even though it adds a little bit of complexity.
>
> I was going to finish this task today, but Balena-Staging started acting weird and now refuses to build new updates. Need to wait for the start of the week.

@MuratUrsavas what I meant was how do we create it in the production images?

Also, you can probably just test with one of the test fleets in the main Balena dashboard instead of on their staging dashboard? As staging is really only needed for non-released boards like Radxa Zero and Rock CM3

MuratUrsavas commented 1 year ago

> what I meant was how do we create it in the production images?

@shawaj It will be injected at the same time as config.txt but of course in another partition. That'll be in the hotspot-production-images repo.

> Also, you can probably just test with one of the test fleets in the main Balena dashboard instead of on their staging dashboard?

Yea, I know that. But it was easier for me to test both Light Hotspots and this task. Guilty as charged :smile:

shawaj commented 1 year ago

Haha fair enough 🤪

And yeah that's great on the file injection - nice work. So we will essentially take the Balena base image and preload it as we do now but we will add an extra data partition to that image where we will put these files?

(Also just FYI https://github.com/NebraLtd/hotspot-production-images/issues/57)

MuratUrsavas commented 1 year ago

> So we will essentially take the Balena base image and preload it as we do now but we will add an extra data partition to that image where we will put these files?

@shawaj Most of it is correct, we just don't add a new partition. It's already there, but we were not mounting it. It's basically empty and the file structure gets created on the first run. So we'll be creating the same folder structure before the first run. That was the missing information I was after. Couldn't get much info from Balena but I found my way by poking some patterns into the file system :smile:

shawaj commented 1 year ago

@KevinWassermann94 @MuratUrsavas is this implemented now? Can we close this?