google / haskell-indexer

Emits code crossreference data for Haskell sources.

Index post-processed sources #58

Open robinp opened 7 years ago

robinp commented 7 years ago

Brought up by @mpickering. A few things to sort out for that question:

+@creachadair: does for example the Kythe C++ indexer emit virtual fragments for un-CPP-d code? Do you have any takeaways from earlier attempts on this topic?

creachadair commented 7 years ago

does for example the Kythe C++ indexer emit virtual fragments for un-CPP-d code? Do you have any takeaways from earlier attempts on this topic?

I'm not entirely sure I understand what you mean by "virtual fragments". But to your other questions: we don't currently store the fully preprocessed versions of files. The indexer does hook some of the Clang preprocessor actions to capture (e.g.) macro definitions, #include lines, and so on, and to keep track of the state we need to disambiguate variant expansions of the same file under variations of #define settings, inclusion order, and so forth. But the only source text we capture is that of the original files.

From a captured compilation unit, you could of course set up clang and actually capture the CPP output, but we haven't done that so far. The only obvious justification would be to try to reify macro expansions, but that's problematic because macro expansion is not layered, so you can't practically keep track of nested expansions (which are common). Perhaps more pernicious than that, a C macro expansion isn't even required to emit a syntactically-complete form, so the relationship between the notional "visible" syntax of the file (where a macro expansion looks like a variable or a function call) and the underlying C AST is very fiddly.

We've talked about trying to do something more concrete with macros, but so far there hasn't been a productive UI query to work from.

robinp commented 7 years ago

Mind dump: the main use cases I can see for indexing post-processed sources:

1) Compiler generated splices. The compiler can auto-derive instance implementations, whose sources are not visible by default. But these instances (Show, Data, ...) are not very interesting.

2) TH-generated splices. The Template Haskell expansions could be stored somewhere and cross-referenced.

3) CPP macro expansions. Like above, but for expanded C-preprocessed macros.

For both 2) and 3) what happens now is that the bindings are put inside the span of the TH/macro invocation in some unpredictable (and unclickable) manner.

If we could stash the expansions somewhere (even as individual fragments), maybe we could connect the invocation anchor with the fragment anchors in a way that makes sense for the UI navigation (no idea exactly how).
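To make the idea a bit more concrete, here is a minimal sketch of what stashing a fragment and linking it back to its invocation might look like. Everything in it is an assumption for illustration: the types `Span`, `Expansion` and `Edge`, the `fragmentPath` naming scheme, and the `"expanded-from"` label are all made up, and none of them are part of haskell-indexer or the Kythe schema.

```haskell
-- Hypothetical sketch: none of these types exist in haskell-indexer or Kythe.

-- | A half-open character range [spanStart, spanEnd) inside a file.
data Span = Span
  { spanFile  :: FilePath
  , spanStart :: Int
  , spanEnd   :: Int
  } deriving (Eq, Show)

-- | The three sources of generated code listed above.
data ExpansionKind = DerivedInstance | ThSplice | CppMacro
  deriving (Eq, Show)

-- | One stashed expansion fragment.
data Expansion = Expansion
  { expKind       :: ExpansionKind
  , expInvocation :: Span                 -- invocation site in the real source
  , expText       :: String               -- pretty-printed generated code
  , expBindings   :: [(String, Int, Int)] -- (name, start, end) offsets into expText
  } deriving (Eq, Show)

-- | A labelled link between two anchors (the label is a made-up string).
data Edge = Edge
  { edgeSource :: Span
  , edgeLabel  :: String
  , edgeTarget :: Span
  } deriving (Eq, Show)

-- | Made-up naming scheme for the virtual file that holds the fragment.
fragmentPath :: Expansion -> FilePath
fragmentPath e = spanFile inv ++ "#expansion@" ++ show (spanStart inv)
  where
    inv = expInvocation e

-- | Link every binding anchor in the fragment back to the invocation anchor.
linkExpansion :: Expansion -> [Edge]
linkExpansion e =
  [ Edge (Span (fragmentPath e) s t) "expanded-from" (expInvocation e)
  | (_name, s, t) <- expBindings e
  ]

-- | A toy TH splice at characters 10..25 of A.hs that generates @foo = 1@.
example :: Expansion
example = Expansion
  { expKind       = ThSplice
  , expInvocation = Span "A.hs" 10 25
  , expText       = "foo = 1"
  , expBindings   = [("foo", 0, 3)]
  }

main :: IO ()
main = mapM_ print (linkExpansion example)
```

With something like this, the bindings would get clickable anchors inside the fragment rather than being squeezed into the invocation span. How a UI should present the link from the invocation to the fragment is still the open part.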

mpickering commented 7 years ago

For 2/3, some of the infrastructure I use for core-kythe would be useful, as you would need spans or pretty-printed output.