covidwatchorg / tcn-node

Node.js Module that Wraps TCN Protocol Rust Implementation
Apache License 2.0
4 stars 1 forks source link

Add types / improve API #7

Open miridius opened 4 years ago

miridius commented 4 years ago

Currently tcn-node (v0.3) exposes only 1 useful function, validateReport, which is specified as type any => any.

I have added full typescript support in a new branch and am working on specifying the types for this function, it would be good to know what is convenient/expected by outside consumers. It would also be good to know what other useful methods we might add to our API.

So far I have this:

export interface Report {
  rvk: number[];
  tck_bytes: number[];
  j_1: number;
  j_2: number;
  memo_type: "CoEpiV1" | "CovidWatchV1" | "Reserved";
  memo_data: number[];
}

export interface Signature {
  R_bytes: number[];
  s_bytes: number[];
}

export interface SignedReport {
  report: Report;
  sig: Signature;
}

And validateReport therefore has type (signedReport: SignedReport) => boolean

But, all of those number[] typed fields represent byte arrays, so it is arguably better to represent them as Uint8Arrays. Base64 encoded or BAIS encoded strings could also work - but maybe better to have that as a separate function or expose some conversion/report building methods for convenience.

Also, the naming convention and structure could for sure be improved, currently the fields are very non descriptive. I'm open to suggestions for how we should call them. Another useful step would be to combine the signature into a single field instead of splitting on R_bytes and s_bytes - the splitting is likely an implementation detail not needing to be exposed to end users - though I am not confident on that topic.

What do you think? @madhavajay @inmyth

inmyth commented 4 years ago

I think the byte array fields are already good. In the backend the payload is stored to Firebase as byte arrays anyway so it still had to do the conversion.
Should we pass signature as one single array ? I guess yes.

One thing I'd like to mention is MemoType enum. Right now the request contains only its code. To get the enum string I think we're going to need custom method that covers all the cases in the code and in the documentation:

// code
CoEpiV1 = 0,    /// The CovidWatch test data format, version 1 (TBD)
CovidWatchV1 = 1,  /// Reserved for future use.
Reserved = 0xff,
// doc, which has all the above plus these
0x2: ito report v1;
0x3-0xfe: reserved for allocations to applications on request;

The method would be something like this:

def getMemoType(code: int) : String = {
  switch(code) {
    case 0: return MemoType.CoEpiV1.toString();
    case 2: return "itoreportv1"; // I guess anything until it's no longer WIP
    ...
    default: return "Invalid";
  }
}
miridius commented 4 years ago

If you rather just pass the numerical code that's no problem. What we can try to do is make it a union type where you can send either the code or the enum string, but I'm not sure how hard that is to implement. Taking the idea further, we could maybe do something like this:

// byte array or base64 encoded string
export type BinaryData = Uint8Array | number[] | string;

export type MemoType = "CoEpiV1" | "CovidWatchV1" | "Reserved";

export interface SignedReport {
  reportVerificationKey: BinaryData; // exactly 32 bytes
  temporaryContactKey: BinaryData; // exactly 32 bytes
  tcnStartIndex: number;
  tcnEndIndex: number;
  memoType: number | MemoType; // either numerical code or enum value
  memoData: BinaryData; // 0-255 bytes
  signature: BinaryData; // exactly 64 bytes
}

What might also make sense is having a Report class where we can add methods like verify() to the report objects. So you could write new Report({ ... }).verify(). Then we could have methods for getting the fields in different format e.g. byte array or base64 encoded

miridius commented 4 years ago

Actually a number | string union type for MemoType is silly, when we can just use an enum:

export enum MemoType {
  CoEpiV1 = 0,
  CovidWatchV1 = 1,
  Reserved = 0xff,
}
madhavajay commented 4 years ago

Although the spec allows for more values right so probably better to just let anything in the range. Can a typescript enum have an unknown case with no mapped int?, So it can pattern match to any of those 3 valid in spec values, unknown for within 2-254?

inmyth commented 4 years ago

I think Report represents data that is passed around in the application so it probably doesn't need signature. We only need signature for verification so once it's done we just throw it away. Also there's possibility that some new fields are added to Report in the future. Having all its fields expanded in SignedReport would make it difficult to maintain.

So yes my concern was if Rust enum type can be converted to JS/TS but exporting the enum as-is would be the best option. Sure at the very worst the client side still has to do switch-case but the underlying rules still come from TCN. It would be even better if the TCN code is updated first to accommodate the full range. It's not a big change but I'm wondering if there's some consideration to not have it yet.

miridius commented 4 years ago

Point taken about keeping the signature out of the Report object, let's keep them separate. We can still provide a SignedReport interface as well to combine them when desired.

For the memo type enum, making it open ended sounds good. To achieve that I suggest we use a union type MemoType | number unless there's a better way.

I figured the best way to design the API is by writing documentation/examples for it first, so I mocked up the main use cases I could think of. We could provide both a functional style API as well as OO style without much extra work, but my concerns are a) over-complicating the docs, b) fragmenting the user base, and c) making user-land code less predictable and therefore less readable. It's probably better to just pick a canonical approach and stick to it, even if it won't be everyone's favourite choice it's better than having conflicts.

My personal preference is a functional style so that's what I present here:

Usage (proposed)

Creating your own report and checking its correctness

import * as tcn from "tcn-node";

// Generate a report authorization key.  This key represents the capability
// to publish a report about a collection of derived temporary contact numbers.
const reportAuthKey = tcn.createReportAuthorizationKey(randomSeed);

// Use the temporary contact key ratchet mechanism to compute a list
// of the first 100 temporary contact numbers.
let tcnIterator = tcn.newTcnIterator(reportAuthKey);
const tcns = Array.from({ length: 100 }, () => tcnIterator.next().value);

// Prepare a report about a subset of the temporary contact numbers.
const tcnStart = 20; // index of the first TCN to disclose
const tcnEnd = 90; // index of the last TCN to check (TODO: is this inclusive?)
const signedReport = tcn.createReport(
  reportAuthKey,
  MemoType.CoEpiV1,
  Buffer.from("memo data"),
  tcnStart,
  tcnEnd
);

// Check that the signature is valid
if (!isSignatureValid(signedReport) throw new Error("signature not valid!");

// recompute the disclosed TCNs and check that they match the originals
// tcnStart and tcnEnd are 1-indexed so we need to subtract 1 from each
let recomputedTcns = tcn.recomputeTcns(signedReport.report);
if (recomputedTcns !== tcns.slice(tcnStart - 1, tcnEnd - 1))
  throw new Error("TCNs do not match!");

Verifying a provided report / recomputing TCNs

import { isSignatureValid, recomputeTcns, verify } from "tcn-node";

// A report is any data structure containing at least the fields in the Report interface
const report = {
  reportVerificationKey: Buffer.from(
    "c9yuWatecES6EEypCi/4mATjrhDt2KEIjfPX6/uWTwE=",
    "base64"
  ),
  temporaryContactKey: Buffer.from(
    "A1uiq8JqVEmuYAo4Cff1xfFV1X2r77mtFhfe+kfHhwQ=",
    "base64"
  ),
  tcnStartIndex: 20,
  tcnEndIndex: 90,
  memoType: MemoType.CoEpiV1, // specifying the numerical code (e.g. 1) also works
  memoData: stringToBytes("symptom data"),
};

// A signature is an array of 64 bytes
const signature = Buffer.from(
  "SEGn3wEbae77ZT94lOdGFKlC7WC0uh2xc8NrTDoRi78Fh5Tgw7QP38+Ngopujpa6JsGh2niOi0Ro9zMzweNZAA==",
  "base64"
);

// cryptographic verification of the report + signature
if (isSignatureValid(report, signature)) {
  // Recompute the disclosed TCNs
  const tcns = recomputeTcns(report);
} else {
  throw new Error("signature not valid!");
}

// Calling recomputeTcns with a signature will verify it first. This is equivalent to the code above:
const tcns = recomputeTcns(report, signature);

// A SignedReport interface also exists which combines the report with the signature.
// Functions which normally accept a report and a signature will also accept a signedReport:
const signedReport = { report, signature };
const tcns = recomputeTcns(signedReport); // (throws an Error if signature is not valid)

// If you already know that the report is valid you can skip the signature verification
const tcns = recomputeTcns(report);

Type Reference

enum MemoType {
  CoEpiV1 = 0,
  CovidWatchV1 = 1,
  Reserved = 0xff,
}

interface Report {
  // Note: Buffer is a subclass of Uint8Array so it would also be accepted
  reportVerificationKey: Uint8Array; // exactly 32 bytes
  temporaryContactKey: Uint8Array; // exactly 32 bytes
  tcnStartIndex: number; // 16 bit unsigned int
  tcnEndIndex: number; // 16 bit unsigned int
  memoType: MemoType | number; // 8 bit unsigned int
  memoData: Uint8Array; // 0-255 bytes
}

interface SignedReport {
  report: Report,
  signature: Uint8Array, // exactly 64 bytes
}
miridius commented 4 years ago

@inmyth having a look at your code in the cloud functions project you are using the Buffer type to store binary data, which I think is a good idea. Buffer is a subtype of Uint8Array so the above typedefs would still work. Since Buffer.from() can be used to convert from strings we probably don't need the utility functions for conversion. I'll edit my comment above to reflect this

inmyth commented 4 years ago

Thanks but Buffer was something I copied from madhava's code. You both clearly know more about Node than me. I think the API draft already looks comprehensive. The essentials and examples are already there. Within the scope of core lib this has covered pretty much everything.