TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0

End device onboarding flow #4847

Closed johanstokking closed 2 years ago

johanstokking commented 2 years ago

Summary

New end device onboarding flow

Replaces #3770. Blocked by #4840, #4841 and #4845.

Why do we need this?

To make it even easier to onboard new end devices by scanning QR codes.

We also need to remove device creation on an external Join Server and integrate device claiming into the onboarding process.

For most end users, device creation and claiming are conceptually the same. We should put this in one nice device onboarding experience.

What is already there? What do you see now?

  1. Device creation via the Device Repository and manually
  2. Device claiming by importing a manifest
  3. QR code scanning app (TTSE only)

What is missing? What do you want to see?

Onboarding flow with QR code scanning, claiming, manual creation and retrieving info from the Device Repository integrated.

How do you propose to implement this?

For onboarding:

We have the following properties in an end device onboarding state:

Allow the user to choose between scanning a QR code, choosing from the Device Repository and manual creation:

  1. Ask the user whether they want to scan a QR code. If yes, go to step 2; if not, go to step 3.
  2. Scan QR code: this contains at least the JoinEUI and DevEUI, potentially also the numeric Vendor and Profile ID and the Claim Authentication Code. Call the QRCodeParser.Parse() rpc (https://github.com/TheThingsNetwork/lorawan-stack/issues/4845)
    1. Put the JoinEUI and DevEUI in the onboarding state
    2. If the QR code contains a claim authentication code, put this in the onboarding state
    3. If the QR code contains a non-zero Vendor ID and Profile ID, look up the LoRaWAN device profile (#4842) and put this in the onboarding state. Otherwise, the LoRaWAN device profile is empty, but the activation mode can be preset to OTAA (ABP devices won't have QR codes)
    4. Go to step 3
  3. Choose between selecting from the Device Repository or manual creation
    1. Choose from Device Repository. This gives us the Brand and Model ID, Hardware and Firmware Version and Region
      1. Get the LoRaWAN device profile from the Device Repository and put this in the onboarding state
    2. Manual creation: this basically proceeds to step 4 with an empty LoRaWAN device profile
  4. Show all known information: activation mode, JoinEUI, DevEUI and fill out LoRaWAN device profile
    1. If there is no information (the user came here through Manual creation), the user has to select the activation mode still
    2. If the activation mode is OTAA
      1. The JoinEUI can be prefilled from step 2, entered manually or the user can take the default JoinEUI (#4840) (no more 00!)
      2. As soon as any JoinEUI is filled in, contact DCS to see if claiming is supported for that JoinEUI
        1. If claiming is supported, prefill the claim authentication code from step 2. If step 2 was skipped, the claim authentication code is empty and can be entered by the user. In either case, do not ask for the root keys
        2. If claiming is not supported, do ask for the root keys
        2. If claiming is not supported, do ask for the root keys
    3. Show the rest of the form like we do now; LoRaWAN versions, frequency plan, ABP settings etc etc
  5. Create does the following:
    1. Create on IS: this is important to make sure the end device identifiers are unique
    2. Create on NS and AS: mostly for validation
    3. Is there a claim authentication code in the onboarding state?
      1. If yes, claim the end device on DCS. Do not set the join_server_address
      2. If not, create the end device on the cluster-local JS with the root keys, do set the join_server_address
    4. Any failure leads to a rollback
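
For illustration, here is a minimal sketch of the create step with rollback. All names (OnboardingState, Backend, createEndDevice, the clusterJoinServerAddress parameter) are assumptions made up for this example and are not the actual SDK or API; the backend calls are injected so the ordering and rollback can be tried out with fakes.

```typescript
// Hypothetical names throughout: OnboardingState, Backend and createEndDevice
// are illustrative only and not part of The Things Stack API.

type Component = 'IS' | 'NS' | 'AS' | 'JS' | 'DCS'

interface OnboardingState {
  deviceId: string
  joinEui: string
  devEui: string
  claimAuthenticationCode?: string
  joinServerAddress?: string
}

// For DCS, create() stands for claiming and remove() for unclaiming.
interface Backend {
  create(component: Component, state: OnboardingState): Promise<void>
  remove(component: Component, state: OnboardingState): Promise<void>
}

// A non-empty claim authentication code in the state means we claim on DCS.
function isClaiming(state: OnboardingState): boolean {
  return Boolean(state.claimAuthenticationCode)
}

async function createEndDevice(
  state: OnboardingState,
  backend: Backend,
  clusterJoinServerAddress: string,
): Promise<Component[]> {
  const claiming = isClaiming(state)

  // join_server_address is only set when the device is created on the cluster-local JS.
  const device: OnboardingState = { ...state }
  if (claiming) {
    delete device.joinServerAddress
  } else {
    device.joinServerAddress = clusterJoinServerAddress
  }

  // IS first (identifier uniqueness), then NS and AS, then either DCS or JS.
  const order: Component[] = ['IS', 'NS', 'AS', claiming ? 'DCS' : 'JS']
  const created: Component[] = []
  try {
    for (const component of order) {
      await backend.create(component, device)
      created.push(component)
    }
  } catch (error) {
    // Any failure leads to a rollback: undo what was created, in reverse order.
    for (const component of [...created].reverse()) {
      await backend.remove(component, device)
    }
    throw error
  }
  return created
}
```

The sketch rethrows the original error after rolling back, so the caller can still show why creation failed. What should happen when a rollback call itself fails is left open here.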

For offboarding:

Check if the join_server_address is set:

  1. If set, delete from JS
  2. If unset, unclaim the end device from DCS
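
The offboarding check can be sketched as a small pure function. The type and function names here are hypothetical and only mirror the two cases listed above.

```typescript
// Hypothetical shape of a stored end device; only the field relevant here is modelled.
interface StoredEndDevice {
  deviceId: string
  joinServerAddress?: string
}

type OffboardingAction = 'delete_from_js' | 'unclaim_from_dcs'

// If join_server_address is set, the device lives on the cluster-local JS;
// otherwise it was claimed and has to be unclaimed on DCS.
function offboardingAction(device: StoredEndDevice): OffboardingAction {
  return device.joinServerAddress ? 'delete_from_js' : 'unclaim_from_dcs'
}
```

An empty string is treated the same as an unset address in this sketch, which is an assumption.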

How do you propose to test this?

Let's test the flows first. I'm not sure what the best way of doing that is; using mockups?

These are key scenarios we need to support:

  1. Generic Node with QR code. Contains claim authentication code. JoinEUI, DevEUI and brand are known. The JoinEUI supports claiming. The user only needs to select Generic Node as device model, the versions, region and frequency plan
  2. Generic Node without QR code, onboarding by entering JoinEUI and DevEUI. Select brand, model, versions and region from Device Repository. This should detect that claiming is supported and should ask for a claim authentication code.
  3. Any device from the Device Repository that does not support claiming; this should ask for root keys and only allow creating the device in the cluster-local JS
  4. Manual creation of OTAA, ABP and multicast devices should still work as expected
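
To make these scenarios checkable without mockups, the decision of which join credentials the form asks for could be isolated in a pure function. This is only a sketch: the names (joinCredentialFields, the prompt values) are made up for illustration, and the claimingSupported flag stands in for whatever the DCS lookup returns.

```typescript
type ActivationMode = 'OTAA' | 'ABP' | 'MULTICAST'
type JoinCredentialPrompt = 'claim_authentication_code' | 'root_keys' | 'none'

interface JoinCredentialFields {
  prompt: JoinCredentialPrompt
  // Value to prefill in the claim authentication code field, empty if unknown.
  prefilledClaimAuthenticationCode: string
}

function joinCredentialFields(
  mode: ActivationMode,
  claimingSupported: boolean,
  qrClaimAuthenticationCode?: string,
): JoinCredentialFields {
  // ABP and multicast devices do not join, so neither credential is requested.
  if (mode !== 'OTAA') {
    return { prompt: 'none', prefilledClaimAuthenticationCode: '' }
  }
  // Claiming supported: ask for the claim authentication code, never for root keys.
  if (claimingSupported) {
    return {
      prompt: 'claim_authentication_code',
      prefilledClaimAuthenticationCode: qrClaimAuthenticationCode ?? '',
    }
  }
  // Claiming not supported: ask for the root keys.
  return { prompt: 'root_keys', prefilledClaimAuthenticationCode: '' }
}
```

Scenario 1 then corresponds to joinCredentialFields('OTAA', true, code), scenario 2 to joinCredentialFields('OTAA', true), scenario 3 to joinCredentialFields('OTAA', false), and scenario 4 covers all three activation modes.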

Can you do this yourself and submit a Pull Request?

Can review

KrishnaIyer commented 2 years ago

Now that https://github.com/TheThingsNetwork/lorawan-stack/pull/5324 is merged, here's a short summary of the backend.

Claiming/Unclaiming (Primary flow)

Getting Identifiers from a QR Code

kschiffer commented 2 years ago

So following our meeting just now, we figured out that it is not actually possible to determine the device model, versions and region from the QR code scan since it will only give us the device profile info and brand ID, which can be valid for multiple combinations.

So there are two things to do here:

  1. Look into extending the information that can be obtained from the QR code to include some kind of device repository identifier @johanstokking 
  2. Change the UX to still have users select the relevant model, versions, and frequency profile when scanning a QR code. For this, the possible combinations would ideally be narrowed to the ones that match the fetched profile. In order to implement that, a new RPC in the DR service is required which would return such combinations based on the specified profile. @johanstokking @KrishnaIyer 

I will work on modifying the wireframes accordingly.

johanstokking commented 2 years ago

For background: currently, the QR code contains vendor ID and profile ID, and there's gonna be a codec ID. That might be useful, but that does not provide the version identifiers, which are useful for stats and display.

So ideally, the QR code tells us not only the vendor ID, but the model ID, hardware and firmware version and band. We could still use a single identifier for that, but not "profile ID" and "codec ID". However, that identifier would replace the need for a profile ID and vendor ID.

Until we have that, don't bother with this. We should not attempt reverse lookups. It gets too complicated, especially considering that we support referring to profiles of other vendors.

kschiffer commented 2 years ago

Alright then. I've finalized the wireframes so that we can now plan implementation.

See the clickdummy/wireframe

Please have a look and confirm.

Is this still blocked on anything else? Otherwise we can remove the blocked label as well.

johanstokking commented 2 years ago

This looks complete to me.

KrishnaIyer commented 2 years ago

ACK. Looks good to me as well. This isn't blocked so I'll remove that label.

kschiffer commented 2 years ago

Planning

Here's a rough planning summary of how we are going to implement this. I will keep this post updated with progressive insight.

Structure

Ok, so as discussed in our meeting last Wednesday, we will split implementation up using the following scaffold:

<EndDeviceOnboardingForm>
  <EndDeviceTypeFormSection>
    <DeviceTypeRepositoryFormSection />
    <DeviceTypeManualFormSection />
  </EndDeviceTypeFormSection>
  <EndDeviceProvisioningFormSection>
    <EndDeviceRegistrationFormSection />
    <EndDeviceClaimingFormSection />
  </EndDeviceProvisioningFormSection>
</EndDeviceOnboardingForm>

Here's the scaffold applied to the wireframe:

[image: the scaffold applied to the wireframe]

Implementation details

We will use React's context API to store global form data within <EndDeviceOnboardingForm />, which all sub-components will then subscribe to. This way we avoid passing down a lot of props to the child components and instead use <EndDeviceOnboardingForm /> as a single source of truth about global form state. We will use two contexts here:

  1. Formik Context (can be obtained by useFormikContext()-hook)
  2. A custom context, created using React's context API, which will contain global form configuration (e.g. whether EUI generation is allowed, whether server components are disabled, etc.)

The form is highly dynamic, but there are certain self-contained sections of the form, outlined above, each of which should be able to handle its own concerns while reacting to certain form field values stored in the global context.

E.g. <EndDeviceTypeFormSection /> will handle the Input method field and render either <DeviceTypeRepositoryFormSection /> or <DeviceTypeManualFormSection /> based on the user's selection. It will do so by reading the current form values from the context via useFormikContext(), which allows us to do the conditional rendering.
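
As a rough, React-free sketch of that selection logic: the field name inputMethod and its two values are assumptions for illustration, not the actual form schema. In the real component, the values object would come from useFormikContext().values and the result would decide which JSX element to render.

```typescript
// Hypothetical form values; the real schema may use a different field name and values.
type InputMethod = 'device_repository' | 'manual'

interface OnboardingFormValues {
  inputMethod: InputMethod
}

type TypeSection = 'DeviceTypeRepositoryFormSection' | 'DeviceTypeManualFormSection'

// Picks which sub-section the device type section should render.
function selectTypeSection(values: OnboardingFormValues): TypeSection {
  return values.inputMethod === 'device_repository'
    ? 'DeviceTypeRepositoryFormSection'
    : 'DeviceTypeManualFormSection'
}
```

Keeping the decision in a plain function like this would make it testable without rendering the form, at the cost of one extra indirection.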

Reuse of existing code/components

Generally, the new form contains many existing elements and we should reuse as much as possible; this includes validation schemas, general utilities, JSX markup, etc. We should, however, make sure to address some of the issues we have in the current code:

Other things to note

Where to go from here

I will create a feature branch with a scaffold of the implementation and then we can delegate tasks and work on this in parallel.