microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License
164.98k stars 29.53k forks source link

RFC: (Engineering) Introduce A Reflectable Deserialization Helper #133723

Open hediet opened 3 years ago

hediet commented 3 years ago

Current State

We have several places where we validate/deserialize JSON. Validation can fail with error messages, defaults need to be inserted and a JSON schema is also usually provided.

Here are some examples of how we do it currently:

Problem

The current style of doing JSON deserialization/validation is very verbose and inconsistent. There is no guarantee that the JSON-Schema matches the validation (though sometimes this is done on purpose).

Proposed Solution

I suggest to explore introducing a generic, descriptive deserialization helper that allows to specify the typescript type, the validation logic, the default value and human readable descriptions at a single place. The JSON schema can be derived from that.

The outcome should be an internal helper (not an external library) that is similar to io-ts or my own library hediet/semantic-json (example).

In case of editor.minimap.side, the code should look like this:

const editorMinimap = objectDeserializer({
    properties: {
        ...
        side: property(enumDeserializer(['left', 'right']), {
            defaultValue: 'left',
            description: nls.localize('minimap.side', "Controls the side where to render the minimap.")
        })
        ...
    }
});

const editorSettings = objectDeserializer({
    properties: {
        ...
        minimap: editorMinimap
        ...
    }
});

assert.deepStrictEqual(flatten(objectDeserializer({ properties: { editor: editorSettings } })).getJSONSchema(), {
    ...
    "editor.minimap.side": {
        type: 'string',
        enum: ['left', 'right'],
        default: 'left',
        description: "Controls the side where to render the minimap."
        ...
    }
    ...
});

People interested in this might be: @alexdima @connor4312 @aeschli

I'm open to explore this idea further if there is interest in this.

connor4312 commented 3 years ago

I like this idea. I think @hediet/semantic-json, which I used quite a bit in my ssh work earlier this iteration, fits the need pretty well--could we not use it as an external rather than internal library? There's obviously some things that need to be expanded, but this seems like something that can be reused.

io-ts seems much bulkier, I don't really like its API at first glance.

hediet commented 3 years ago

could we not use it as an external rather than internal library? There's obviously some things that need to be expanded, but this seems like something that can be reused.

I would recommend copying over the design of @hediet/semantic-json and adapting it to the specific needs of VS Code. I'm not feeling very comfortable using personal libraries at work ;)

I like that VS Code does not use many libraries and that we can easily change utilities.

io-ts seems much bulkier, I don't really like its API at first glance.

That's why I started with semantic-json. io-ts types are also not reflectable at runtime, which is required for json-schema generation.

hediet commented 3 years ago

@dbaeumer pointed out something important: How does the type that TypeScript generates in d.ts file look like?

There are two approaches.

Approach 1: Define Deserializer, Infer TypeScript Type

(as used by hediet/semantic-json and io-ts)


export const minimapSettingsDeserializer1 = objectDeserializer1({
    properties: {
        /**
         * The side.
        */
        side: prop(enumDeserializer('right', 'left'), { defaultValue: 'left' })
    }
});
export type MinimapSettings1 = typeof minimapSettingsDeserializer1.T;

Generated types:

export declare const minimapSettingsDeserializer1: Deserializer<{
    side: string;
}>;
export declare type MinimapSettings1 = typeof minimapSettingsDeserializer1.T;

Approach 2: Explicitly Define The TypeScript Type, Enforce Matching Deserializer

export interface MinimapSettings2 {
    /**
     * The side.
    */
    side: 'right' | 'left';
}

export const minimapSettingsDeserializer2 = objectDeserializer2<MinimapSettings2>({
    properties: {
        /**
         * The side.
        */
        side: prop(enumDeserializer('right', 'left'), { defaultValue: 'left' })
    }
});

Generated types:

export interface MinimapSettings2 {
    /**
     * The side.
    */
    side: 'right' | 'left';
}
export declare const minimapSettingsDeserializer2: Deserializer<MinimapSettings2>;

TypeScript Playground link

I think due to missing JS Docs (which we need for editor options), only approach 2 is viable.

dbaeumer commented 3 years ago

I use approach 2 as well when generating validators for LSIF. See https://github.com/microsoft/lsif-node/blob/fd6a872b17da1fc05029271445c76aae70e35fc6/protocol/src/protocol.ts#L171-L171

alexdima commented 3 years ago

In general, I think we should explore code generation. All our settings could be described in one or more JSON files (possibly using JSON schema or JSON schema++) and then type definitions (with nice JSDoc comments) or efficient type validation code could be generated. But this needs to be discussed in the team and agreed team-wide.

From the editor options point of view, I am very interested in seeing a clear proposal that addresses the following:

connor4312 commented 3 years ago

If we go with a code generation approach, which would solve the .d.ts problem, I would vote to simply use JSON schema. I've actually done this when building service APIs before and like it a lot -- use the JSON schema as the point of truth and way to validate input, and derive types from it. Plus, there's lots of existing tooling around JSON schemas.

Also, while this design focuses on settings, something adjacent to this and possibly solvable with the same toolchain is extension host communication. Currently communication to and from the extension host is done mostly though JSON serialization with small bits of special handling. As I referenced in https://github.com/microsoft/vscode/pull/133200, the abstractions we currently have here are leaky and there can be benefits, even without straying from JSON, to knowing the shape of serialized data ahead of time. If we use a system that provides runtime information for the protocol types, there's opportunity for improvement.