opral / inlang-message-sdk

0 stars 0 forks source link

lint rule API v2 #104

Open samuelstroschein opened 5 days ago

samuelstroschein commented 5 days ago

Context

@martin.lysk1 and @loris.sigrist have been working on a new API for lint rules that is a breaking change and will foreseeably get a breaking change again. Plus, the API will be slow (maybe fast enough for MESDK-137 though!)

Why does the current and future lint API lead to recurring breaking changes?

The APIs are designed around a data structure. Any change in the data structure leads to a breaking change in the lint API.

For example, the current API lints on the messages primitive. The new proposal lints on message bundles. The result is that the entire linting system breaks (and is thrown away…). The data structure leaks into everything a lint rule does. From the id, to execution functions, to scheduling.

// Pseudocode that illustrates that the data strucuture is hardcoded
// into lint rules and breaks now by switching to a different primitive
type LintRule = {
- type: "messageLintRule"
+ type: "messageBundleLintRule",
   run: (args: {
- message: Message
+ node: MessageBundle
  settings: ProjectSettings & ExternalSettings
  report: (args: {
- messageId: Message["id"]
+ bundleId: MessageBundle["id"]
   languageTag: LanguageTag
   body: MessageLintReport["body"]
  }) => void
 }) => MaybePromise
}

Why are the APIs slow?

Disclaimer: It might be fast enough to fulfill MESDK-137.

Passing the top level node (MessageBundles) to a lint rule entails that a lint rule:

  1. Has to re-execute if anything in a message bundle changed. Even if the lint rule only concerns itself with a variant. Ideally, a lint rule only re-executes if the variant of interest changes.
  2. The inlang SDK needs to execute all lint rules and use post-processing filtering (similar to ESLint) instead of pre-process filtering.
if (messageBundleChanged){
  // lint rule always re-executes if a bundle changed, 
  // regardless of the lint rule only concerning itself with 
  // variants (as an example)
  const reports = await lintRule.run(bundle)
  // because lints happen on the root node, the linter
  // must execute the entire tree and filter afterwards
  // (that is expensive!)
  const filtered = reports.filter(postlintFiltering)
}

Potential solution

Have one lint rule API with different functions for different data primitives.

Aka rename run to lintBundle and remove the hardcoded data primitive type from the id bundleLintRule.{namespace}.{key} -> lintRule.{namespace}.{key}

What we need today:

type LintRule = {
  id: string
} & BundleLintRule

type BundleLintRule = {
  lintBundle: ({ bundle, settings, report }) => any
}

const mylintRule: LintRule = {
  id: "lintRule.missingTranslation",
  lintBundle: ({ bundle, settings }) => {
    report()
  }
}

We can extend the lint rule function in the future (if the need arises!)

type LintRule = {
  id: string
-} & BundleLintRule 
+} & BundleLintRule | MessageLintRule | VariantLintRule | CodeLintRule | DesignLintRule

+ type MessageLintRule = {
  lintMessage: ({ message, settings, report }) => any
}

+ type VariantLintRule = {
  lintVariant: ({ variant, settings, report }) => any
}

+ type CodeLintRule = {
  lintCode: ({ sourceFile, settings, report }) => any
}
// a variant lint rule only get's re-executed if a variant changes
const mylintRule2: LintRule = {
  id: "lintRule.inlang.ensureLowerCapsBrandName",
  // 💡 only triggers on update of a variant, not of a bundle or message
  lintVariant: ({ variant, settings }) => {
    if (variant)
    report()
  }
}
// a code lint rule 
const mylintRule2: LintRule = {
  id: "lintRule.inlang.ensureLowerCapsBrandName",
  // 💡 we can add functions as we need
  lintCode: ({ sourceFile, settings }) => {
    if (hasHardcodedMessage(sourceFile)){
      report()
    }
  }
}

Further notes:

Considered alternatives

Expose the query (get) APIs.

const exampleLintRule = {
  id: "blabla", 
  description: "Missing translations"
  run: ({ query, settings } => {
    // query, mix, and match whatever is needed to perform lints
    const messages = await query.message.get("*")
    // 💥 performance footgun if lint rules query all messages
    for (const message of messages){
      // 
      report(message.bundleId, "missing translation")
    }
    // ...
  }) 
}
samuelstroschein commented 5 days ago

@loris.sigrist @martin.lysk1

martin-lysk commented 5 days ago

Great writeup - We will adopt consider this structure.
This describes what we briefly discussed with - we need lints for different levels (project, bundles, messages, variants) or even whole different entities (figma nodes)

One note:

I would keep one lint rule, one function to avoid situations like "i want to disable the code lint of lint rule X but keep the bundle lint"

If a message should always contain one run method (and i agree 100%) we could keep the structure and give each lint a type enum that discloses the expected parameter of the the function:


enum LintRuleType { 
  Bundle = "Bundle", 
  Message = "Message", 
  Variant = "Variant", 
  Code = "Code", 
  Design = "Design" 
}

// Base LintRule type with discriminated union for each rule type
type LintRule =
  | { id: string, type: LintRuleType.Bundle, run: (bundle: any, settings: any, report: any) => any }
  | { id: string, type: LintRuleType.Message, run: (message: any, settings: any, report: any) => any }
  | { id: string, type: LintRuleType.Variant, run: (variant: any, settings: any, report: any) => any }
  | { id: string, type: LintRuleType.Code, run: (sourceFile: any, settings: any, report: any) => any }
  | { id: string, type: LintRuleType.Design, run: (design: any, settings: any, report: any) => any }

// Example implementation of a lint rule
const exampleLintRule: LintRule = {
  id: "example-rule",
  type: LintRuleType.Code,
  run: (sourceFile, settings, report) => {
    // Implement the linting logic for code
    console.log("Linting code:", sourceFile);
  }
}

Another thought here - we could even use the id pattern to discriminate the lint type - but my type gymnastics end here @loris.sigrist
Any downsides i miss?

samuelstroschein commented 5 days ago

@martin.lysk1 your proposal is identical to mine. we both use union types except that you discriminate via type and overload via run.

thoughts:

propsal:

get rid of the enum :D please confirm if you agree

type LintRule =
  | { id: string, type: "bundle", run: (bundle: any, settings: any, report: any) => any }
  | { id: string, type: "message", run: (message: any, settings: any, report: any) => any }
  | { id: string, type: "variant", run: (variant: any, settings: any, report: any) => any }
  | { id: string, type: "code", run: (sourceFile: any, settings: any, report: any) => any }
  | { id: string, type: "design", run: (design: any, settings: any, report: any) => any }
samuelstroschein commented 2 days ago

Call with @martin.lysk1 and @loris.sigrist agreements


1. which node is a lint report attached to?
-> lint reports only provide bundleId|messageId|variantId

  1. locale as lint report property?
    -> NO, use key
  1. how to create, update, delete
    -> Updating bundles yes. Create and delete in v2.

  2. machine translate
    -> just use inlang/rpc directly in the missing translation lint rule until machine translate plugin api exists


@martin.lysk1 @loris.sigrist please use 👍 to acknowledge that you read this and agree