arrow-kt / arrow-meta

Functional companion to Kotlin's Compiler
https://meta.arrow-kt.io
Apache License 2.0
395 stars 42 forks source link

Improve IDE (as you type) synth resolution running the Quote system #96

Open raulraja opened 4 years ago

raulraja commented 4 years ago

Synthetic resolution based on computeCache only happens when the files are compiled and in .class form in the local module output.

https://github.com/arrow-kt/arrow-meta/blob/cee5d0eebe12f89fdc414eb7490150f1135df5c2/idea-plugin/src/main/kotlin/arrow/meta/ide/phases/resolve/MetaSyntheticPackageFragmentProvider.kt#L182

This is inconvenient because users get redlines and no auto-completion help until they compile.

We would like to be able to run Meta there automatically as the user is typing on the editor buffer to offer the same SynthDescriptors we are offering now but before it compiles the full project.

Additionally, @jereksel pointed out an extension in the IDE we are not using and apparently apparently may help us solve some of these problems. We need to investigate it and see how we can use it. org.jetbrains.kotlin.idea.core.extension.KotlinIndicesHelperExtension

I have the feeling that this extension may help us inject automatically functions and declarations which are callable descriptors to contribute to their scope as the user types. Running the quote system on each keystroke may not be expensive if we do something like:

  1. Grab the KtFile we are on and feed to a quote
  2. The quote then already traverses the PSI tree and matches it
  3. The quote internally already generates a new KtFile
  4. We use the resolve API to partially type check the new KtFile as it is done now in the computeCache function linked above and add it to the computed cache before compiling. This generates the synth descriptors which are fed to the PackageFragmentProvider and SynthResolverExtension automatically since these are invoked as the user types and the user should see the new synth descriptors visible in the editor.
jansorg commented 4 years ago

@raulraja, I wasn't sure who's best to answer this.

Here are questions for a few problems I encountered:

  1. Starting the quote system on the current file only is not enough. The current buffer may reference Meta-data which is contained in another file or even in another module of the same project. The complete solution is probably to run all *.kt files of all modules where Meta is active through the quote system. This is what we had when we used the .class files generated by the compiler. Does this sound reasonable?
  2. We have to be extra careful here to
    • avoid memory leaks. This could happen easily because we're caching PSI elements.
    • be fast. My thinking is that the current buffer should be run through the quote system whenever it changes. All the other *.kt files should be only run through the quote system at startup and whenever they change. We could also cache this on disk, if it turns out to be slow.
  3. The fragment provider is currently returning our meta for all modules and dependencies, including jars and libraries. We probably should limit this to those modules, which contain meta-generated code. As far as I understand this means, that:
    1. only "real" modules may have meta-data, this is modules of the current project which have source files. Libraries may contain data generated by meta. But these should resolve without our fragment provider and synthetic resolve extensions because IntelliJ is relying on the .class files here.
    2. What would be the best way to detect if a module could contain data generated by Meta? Is this a module with a gradle build file? How could one detect if Meta is active in the gradle file? What about gradle build files written in kotlin?
raulraja commented 4 years ago

@jansorg I'll reply to each question in order below:

Starting the quote system on the current file only is not enough. The current buffer may reference Meta-data which is contained in another file or even in another module of the same project. The complete solution is probably to run all *.kt files of all modules where Meta is active through the quote system. This is what we had when we used the .class files generated by the compiler. Does this sound reasonable?

  1. Yes, that is reasonable, you can recompute each time it goes into it the DiagNosticSupressor with a diagnostic factory of org.jetbrains.kotlin.diagnostics.Errors#UNRESOLVED_REFERENCE. this indicates the regular Kotlin type checking did not find the symbol but it allows you to retrigger a cache rebuild since that is invoked after the Analysis and Resolution phase right before highlighting.

We have to be extra careful here to avoid memory leaks. This could happen easily because we're caching PSI elements. be fast. My thinking is that the current buffer should be run through the quote system whenever it changes. All the other *.kt files should be only run through the quote system at startup and whenever they change. We could also cache this on disk, if it turns out to be slow.

  1. Since the quote system runs each file at a time a full pass first and then a per file on diagnosis may be a good approach. Some quote transformations will perform real ktfile generation like kapt does, this is landing on master soon by @bloderxd and then, in that case, a quote may affect multiple files. Each time analysis happens the ModuleDescriptor instance changes and it's not safe to cache descriptors that point to a different module instance even when both pointed to the same module. So you may have to run quotes after analysis in someplace like the diagnostic suppressor anyway if we start seeing exceptions like the ones I'm seeing with the Proof system.

The fragment provider is currently returning our meta for all modules and dependencies, including jars and libraries. We probably should limit this to those modules, which contain meta-generated code. As far as I understand this means, that: only "real" modules may have meta-data, this is modules of the current project which have source files. Libraries may contain data generated by meta. But these should resolve without our fragment provider and synthetic resolve extensions because IntelliJ is relying on the .class files here.

  1. The fragment provider provides descriptors not just of the modules in the project but those representing libraries and third-party dependencies, you can observe this by asking it to give you all the sub-packages of FqName.ROOT. This is needed to scan classes and other artifacts that are coming from third-party libraries and to conform the synthetic descriptors found in the local projects with the types and descriptors of those third-party linked libs. You can have ad-hoc fragment providers. I have also a caching strategy for the proof system that I'm developing in a separate branch rr-type-conversions that exclusively uses the cache. The current flow I'm following is:
  2. I let the highlighter do its job naturally
  3. If it goes into diagnostics for unfound references I rebuild the proof cache
  4. Everywhere else in the system reads from a public property that hides the cache.
  5. User keeps typing and go to step 1. Reevaluating the proof cache currently takes around 15ms for over 40 proofs which are each one synth descriptors for functions that need to be analyzed.

What would be the best way to detect if a module could contain data generated by Meta? Is this a module with a gradle build file? How could one detect if Meta is active in the gradle file? What about gradle build files written in kotlin?

A module activated by Meta would be any module in which in its Compiler Settings accessible in the org.jetbrains.kotlin.extensions.CollectAdditionalSourcesExtension phase include the arrow meta plugin. You can parse it out of gradle or other places too depending on the setup.

Hope this helps and let me know if you have any more questions :)

jansorg commented 4 years ago

@raulraja Thanks!

  1. I don't fully understand the note about DiagnosticSupressor. My thinking was that I refresh the transformations and descriptors after changes to the editor (e.g. after typing a character) and after vfs changes to other files (e.g. save and refresh). A refresh when unresolved references are found should work, too. It probably happens a bit later. Or did I miss something here?

Another question: currently, the Lens (?) transformation transforms data class Person(val name: String, val age: Int) into something like this:

data class Person public constructor (val name: String, val age:  Int) {

  companion object {
    @arrow.synthetic val name: arrow.optics.Lens<Person, String> = arrow.optics.Lens(
  get = { person -> person.name },
  set = { person, name -> person.copy(name = name) }
)
@arrow.synthetic val age: arrow.optics.Lens<Person, Int> = arrow.optics.Lens(
  get = { person -> person.age },
  set = { person, age -> person.copy(age = age) }
)
    @arrow.synthetic val iso: arrow.optics.Iso<Person, Pair<String, Int>> = arrow.optics.Iso(
  get = { (name, age) -> Pair(name, age) },
  reverseGet = { (name, age) -> Person(name, age) }
)
  }
}

The data class data class Person is redefined with a new companion object. But it's not marked as @arrow.synthetic (in the code in my branch, if that's relevant). Because it's not @synthetic it's not used by the fragment provider and the synthetic resolve extension. And the result is that the properties age and name are not when used for the companion object.

My assumption is that adding @synthetic on the data class would result in a duplicate definition (once in the original file and once in the transformed file). Would there be another way to generate the code or have you already encountered this scenario?

raulraja commented 4 years ago

@jansorg just discussed with Simon. You can disable the lens plugin for the time being. We may address lenses with just type proof extensions which are like type classes to project the syntax and not even use quotes at all

raulraja commented 4 years ago

The diagnostic suppressor kicks in every time an unresolved reference is found. This is conceptually the time where if baseline Kotlin failed to parse we want to provide synth descriptors. Having said that you can bootstrap the cache reload wirh wherever it makes sense for the current lifecycle you are dealing with.

jansorg commented 4 years ago
  1. Since the quote system runs each file at a time a full pass first and then a per file on diagnosis may be a good approach. Some quote transformations will perform real ktfile generation like kapt does, this is landing on master soon by @bloderxd and then, in that case, a quote may affect multiple files. Each time analysis happens the ModuleDescriptor instance changes and it's not safe to cache descriptors that point to a different module instance even when both pointed to the same module. So you may have to run quotes after analysis in someplace like the diagnostic suppressor anyway if we start seeing exceptions like the ones I'm seeing with the Proof system.

@raulraja cc @bloderxd Is there an example of a Quote which affects more than one file?

For my understanding: could quote transformations defined in file A.kt influence the transformation of a file B.kt? If yes, would it be possible to find out about the dependency without executing the actual transformation, i.e. to know that an update after a change of B.kt will require a transformation of both A.kt and B.kt?

I'm figuring out the best way to handle the transformation. Hard references on PSI elements is considered bad practice and easily creates memory leaks. IntelliJ's way to handle that is using CachedValue, but that a few strong requirements, e.g. to provide a list of dependencies on other PsiElements. Using this mechanism might be possible when I know more about the way the quote transformations work (i.e. questions above).

Regarding the ModuleDescriptor and caching: were you referring to the descriptor passed to MetaSyntheticPackageFragmentProvider here? Here the descriptor seems to be re-used after changes, at least when I checked in the debugger, i.e. the same instance of ModuleDescriptor is used again for the same IntelliJ modules. Knowing when you noticed a change of the ModuleDescriptor would be great.

The fragment provider provides descriptors not just of the modules in the project but those representing libraries and third-party dependencies, you can observe this by asking it to give you all the sub-packages of FqName.ROOT. This is needed to scan classes and other artifacts that are coming from third-party libraries and to conform the synthetic descriptors found in the local projects with the types and descriptors of those third-party linked libs.

Could you give an example of a Quote which scans/uses files not present in the module where its used? That would help to create a test-case for this. (Right now the arrow fragment provider is also returning a DescriptorCachePackageFragmentProvider for libraries like kotlin-stdlib, the JDK, and others. I wasn't sure if we need to do that. Might be good to only provide where necessary.)

raulraja commented 4 years ago

@jansorg A quote transformation on A.kt will not affect transformations of B.ktas the quote system as esoteric as it can go at the moment is to not mutate the tree in place and instead codegen new ktfiles but those are unrelated to the A and B file issue. Of course, a user could easily create a transformation that does but it's perhaps not an issue we should confront now to try and keep it simpler to start with.

Regarding the module descriptor, I found a place in which a new module is getting ready to be analyzed and before and after analysis. All of this can be achieved by replacing the KotlinCodeAnalyzer as I'm doing in my branch. You can see in the same file custom versions of the BindingTrace, BindingContext and other internals in which the module descriptor is passed around. If you enable full logging on all methods you can see these getting called continuously. https://github.com/arrow-kt/arrow-meta/blob/f3e99ed602e15883528588bc5b4e4755fcd1dbff/idea-plugin/src/main/kotlin/arrow/meta/ide/phases/resolve/proofs/MetaCodeAnalyzerInitializer.kt

jansorg commented 4 years ago

@raulraja Thank you for the details and the pointer to the KotlinCodeAnalyzer. That seems helpful.

For now, I'd like to get the transformations implemented properly. After that I plan to add more tests and to make sure that resolving is working for all code generated by Meta. The code analyzer and other classes you pointed me at will be useful for that.

I'm sorry, but I still have questions about the transformations. My question wasn't precise enough.

I understood that a transformation of A.kt doesn't do transformations in place. It results in a new PsiFile _meta_A.kt (by changeSource in Quote.kt). The same applies for B.kt. PsiElements taken from these _meta_.kt PsiFiles are later used to implement the resolving.

To implement caching in the way which is advised by JetBrains I need to be sure that:

  1. a transformation of A.kt into _meta_A.k is only accessing Psi elements from A.kt and not from anywhere else, i.e. that it's not using B.kt or any other data.
  2. a transformation of A.kt is only resulting in _meta_A.kt and is not modifying or updating anything else. Your note above seems to indicate that this is not the case. Is that correct?

    Some quote transformations will perform real ktfile generation like kapt does, this is landing on master soon by @bloderxd and then, in that case, a quote may affect multiple files.

Can you tell if (1) and (2) are guaranteed?

raulraja commented 4 years ago

FTR discussed over slack an approach to cache and analyze files that may not be intrusive to the user as they type and proceed after this feature with optimizations to improve the internal hot spot on quotes related to the creation of ktfiles on each transformation

On Fri, Dec 13, 2019, 5:47 PM J. Ansorg notifications@github.com wrote:

@raulraja https://github.com/raulraja Thank you for the details and the pointer to the KotlinCodeAnalyzer. That seems helpful.

For now, I'd like to get the transformations implemented properly. After that I plan to add more tests and to make sure that resolving is working for all code generated by Meta. The code analyzer and other classes you pointed me at will be useful for that.

I'm sorry, but I still have questions about the transformations. My question wasn't precise enough.

I understood that a transformation of A.kt doesn't do transformations in place. It results in a new PsiFile _meta_A.kt (by changeSource in Quote.kt). The same applies for B.kt. PsiElements taken from these meta.kt PsiFiles are later used to implement the resolving.

To implement caching in the way which is advised by JetBrains I need to be sure that:

  1. a transformation of A.kt into _meta_A.k is only accessing Psi elements from A.kt and not from anywhere else, i.e. that it's not using B.kt or any other data.
  2. a transformation of A.kt is only resulting in _meta_A.kt and is not modifying or updating anything else. Your note above seems to indicate that this is not the case. Is that correct?

Some quote transformations will perform real ktfile generation like kapt does, this is landing on master soon by @bloderxd https://github.com/bloderxd and then, in that case, a quote may affect multiple files.

Can you tell if (1) and (2) are guaranteed?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/arrow-kt/arrow-meta/issues/96?email_source=notifications&email_token=AADPQXFFHHUQQV2DMSEDKJLQYO4DVA5CNFSM4JKWXU4KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEG2QJ7Q#issuecomment-565511422, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADPQXHRCJEMBK2KTAGG6ULQYO4DVANCNFSM4JKWXU4A .