google / ksp

Kotlin Symbol Processing API
https://github.com/google/ksp
Apache License 2.0

Support source set-specific code generation in Multiplatform projects #965

Open OliverO2 opened 2 years ago

OliverO2 commented 2 years ago

As shown in #963, this repository's multiplatform example generates identical code for all source sets, which results in "Redeclaration" compiler errors once dependencies are set up correctly. The example itself can be fixed by generating code for the commonMain source set only, but that approach does not work for projects that need different code generated for different source sets.

OliverO2/kotlin-multiplatform-ksp contains an example project where this problem has been solved via source set detection:

https://github.com/OliverO2/kotlin-multiplatform-ksp/blob/214eef9e529ca3dae8a6e38123790b4795302969/symbol-processor/src/jvmMain/kotlin/QualifiableProcessor.kt#L37-L38

https://github.com/OliverO2/kotlin-multiplatform-ksp/blob/214eef9e529ca3dae8a6e38123790b4795302969/symbol-processor/src/jvmMain/kotlin/QualifiableProcessor.kt#L55-L56

As there seems to be no reliable API providing the required information on input and output source sets, the solution is unstable.

In addition, the output source set is only detected via the codeGenerator's generated file path, which makes it unsuitable for collecting information before deciding on code generation.
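
To illustrate what path-based detection of this kind involves, here is a minimal sketch. It is not the code from the linked repository. It assumes that generated sources end up under a directory of the form `.../generated/ksp/<target>/<sourceSet>/...`, which is an observed layout rather than a documented contract, and the function name `sourceSetFromGeneratedPath` is made up for this example.

```kotlin
/**
 * Guesses the output source set from the path of a generated file.
 *
 * Assumption (not a documented contract): the path contains
 * ".../generated/ksp/<target>/<sourceSet>/...".
 * Returns null if the path does not match that layout.
 */
fun sourceSetFromGeneratedPath(path: String): String? {
    // Normalize Windows separators and drop empty segments.
    val segments = path.replace('\\', '/').split('/').filter { it.isNotEmpty() }
    // Locate the "generated/ksp" directory pair.
    val kspIndex = segments.indices.firstOrNull { i ->
        segments[i] == "ksp" && i > 0 && segments[i - 1] == "generated"
    } ?: return null
    // Layout after "ksp": <target>, then <sourceSet>.
    return segments.getOrNull(kspIndex + 2)
}
```

Since the function needs the path of a file that has already been generated, it cannot answer the question before the processor decides whether to generate anything, and it breaks silently if the directory layout changes.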

Would it be feasible for a future KSP release to provide information on input and output source sets via the SymbolProcessorEnvironment or the Resolver?

OliverO2 commented 2 years ago

Some thoughts.

Use cases for source set-aware KSP processing

  1. Have a processor generate code from symbols in its current (output) source set only.
    • Condition: process symbols from files where input source set == output source set
  2. Have a processor generate expect/actual code, triggered by annotations in a shared source set.
    • Conditions for generating expect code:
      • process annotations from files where input source set == output source set
    • Conditions for generating actual code:
      • process annotations from anywhere (typically a single shared source set)
      • the output source set is a leaf (target) source set
      • the type of code depends on the name or type (JVM, JS, ...) of the output source set
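
The conditions listed above can be written down as small decision functions. This is only a sketch of the logic: `Generation`, `OutputSourceSet`, `decideUseCase1` and `decideUseCase2` are hypothetical names, and KSP does not currently hand a processor the input or output source set in this form.

```kotlin
/** What a processor would emit for one annotated symbol (illustrative names). */
enum class Generation { NONE, PLAIN, EXPECT, ACTUAL }

/** Hypothetical description of the compilation a processor currently runs in. */
data class OutputSourceSet(val name: String, val isLeaf: Boolean)

/** Use case 1: only symbols declared in the output source set trigger generation. */
fun decideUseCase1(inputSourceSet: String, output: OutputSourceSet): Generation =
    if (inputSourceSet == output.name) Generation.PLAIN else Generation.NONE

/**
 * Use case 2: expect code is generated next to the annotation,
 * actual code is generated in every leaf (target) source set.
 * The equality check comes first, as in the conditions above.
 */
fun decideUseCase2(inputSourceSet: String, output: OutputSourceSet): Generation = when {
    inputSourceSet == output.name -> Generation.EXPECT
    output.isLeaf -> Generation.ACTUAL
    else -> Generation.NONE
}
```

The sketch follows the listed conditions literally. It does not treat the case where the annotation sits directly in a leaf source set, and it leaves out the choice of platform-specific code for the actual declarations.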

Causes and consequences of KSP processors lacking source set information

Information required for the above use cases

  1. is the current file/symbol part of the output source set?
  2. what is the name or type of the output source set, and is it a leaf (target) source set?

Suggestion to solve the issue

Depending on the outcome of #1021, I'd be willing to come up with a PR to implement the above. What do you think?

neetopia commented 2 years ago

Sorry, I am currently swamped with other tasks. I will get back to you (and follow up on your PR), probably next week.

OliverO2 commented 2 years ago

@neetopia That's fine with me. As I'm also working on other tasks, I've just tried to wrap up everything I had in mind for KSP and discover related issues, so that once you are ready, work can continue without losing too much context.

ting-yuan commented 2 years ago

If the goal is to make the outputs observable to downstreams, one approach could be to exclude (human-written) parent sources from getNewFiles/getAllFiles/getSymbolsWithAnnotation/etc. while keeping them in resolution scope, as if the parents had already been processed in some previous rounds. That way:

  1. No explicit API to tell where the sources are from is needed.
  2. Each (human written) file is processed exactly once by default.

KSP still needs to find out the source sets for the inputs, but that can be kept in implementation details.
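
A toy model may make the proposal above easier to picture. It assumes nothing about KSP internals: `SourceFile` and `CompilationView` are stand-ins invented for this sketch, and all files in it are human-written.

```kotlin
/** Simplified stand-in for a human-written source file; not a KSP type. */
data class SourceFile(val name: String, val sourceSet: String)

/**
 * Models one compilation under the proposal: every file stays resolvable,
 * but only files of the output source set are handed to the processor.
 */
class CompilationView(
    private val outputSourceSet: String,
    private val allFiles: List<SourceFile>,
) {
    /** Everything declarations can be resolved against, parents included. */
    val resolutionScope: List<SourceFile> get() = allFiles

    /** What a call such as getAllFiles would return under the proposal. */
    fun filesToProcess(): List<SourceFile> =
        allFiles.filter { it.sourceSet == outputSourceSet }
}
```

With a `commonMain` file and a `jvmMain` file, the view for `jvmMain` hands only the `jvmMain` file to the processor while both remain resolvable, so each file is processed in exactly one compilation.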

Compared to the current model, the above approach and the proposed explicit API share one issue: intermediate source sets may lose some context. For example, JVM-specific functions won't be available when processing common source sets. Probably not a blocker, though.

OliverO2 commented 2 years ago

Excluding (human written) parent sources from getNewFiles/getAllFiles/getSymbolsWithAnnotation/etc. would immediately solve use case 1 (generate code from symbols in its current (output) source set only). It would not cover use case 2 (generate expect/actual code, triggered by annotations in a shared source set). The latter would require full source set-awareness in the processor and complete access to parent source set symbols.

If a processor is to generate (non-trivial) source set-specific code, it must have some idea of the source set's targets and know which (library) APIs are available. Such information could be conveyed via source set-specific configuration (arg option) or determined from the (hardwired) source set name in the processor itself. For completeness, we'd have to keep in mind the edge case of #1037, where there might not be a separate compilation for some intermediate source sets.
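
The two options mentioned above (an arg option or the hardwired source set name) could be combined roughly as follows. This is a sketch under assumptions: the option key `platform.<sourceSetName>` is invented for this example, `PlatformKind` is not a KSP type, and the map would in practice be the processor options that KSP passes in via `SymbolProcessorEnvironment.options`.

```kotlin
/** Platform kinds a processor might distinguish (illustrative, not a KSP type). */
enum class PlatformKind { COMMON, JVM, JS, NATIVE }

/**
 * Resolves the platform kind for a source set.
 * 1. Prefer an explicit option "platform.<sourceSetName>" (hypothetical key).
 * 2. Otherwise fall back to the hardwired source set name prefix.
 * Returns null if neither yields a known platform kind.
 */
fun platformKindFor(sourceSetName: String, options: Map<String, String>): PlatformKind? {
    val configured = options["platform.$sourceSetName"]
    if (configured != null) {
        // An explicit but unknown value is reported as null rather than guessed.
        return PlatformKind.values().firstOrNull { it.name.equals(configured, ignoreCase = true) }
    }
    return when {
        sourceSetName.startsWith("common") -> PlatformKind.COMMON
        sourceSetName.startsWith("jvm") -> PlatformKind.JVM
        sourceSetName.startsWith("js") -> PlatformKind.JS
        sourceSetName.startsWith("native") -> PlatformKind.NATIVE
        else -> null
    }
}
```

An explicit option lets a project with custom source set names (for example a hypothetical `desktopMain`) state its platform, while conventional names keep working without configuration.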

bcmedeiros commented 1 year ago

Posted a workaround in https://github.com/google/ksp/issues/963#issuecomment-1704569105 in case all you need is to generate code for the commonMain source set.