facebook / starlark-rust

A Rust implementation of the Starlark language
Apache License 2.0
672 stars 53 forks source link

Go to definition should be able to handle `ctx.attr(s).foo` #93

Open MaartenStaa opened 10 months ago

MaartenStaa commented 10 months ago

Currently, go to definition for baz in a dotted foo.bar.baz finds a (top-level) struct definition for foo with attribute bar. The implementor of LspContext should be able to provide hints to the LSP about e.g. ctx.attr.foo, which has something of a special meaning in Buck(2) and Bazel, but not necessarily in Starlark itself. This issue is mainly to discuss what the API might look like, as briefly discussed in https://github.com/facebookexperimental/starlark-rust/pull/92#issuecomment-1700739048.

As a first idea, I came up with this:

// on LspContext
fn get_dotted_identifier_resolve_options<'a>(
    &self,
    current_file: &LspUrl,
    within_function: Option<&'a str>,
    segments: &[&'a str],
) -> anyhow::Result<Vec<ResolveOptions<'a>>>;

// outside LspContext
pub struct ResolveOptions<'a> {
    scope: ResolveScope,
    matcher: ResolveMatcher<'a>,
}

pub enum ResolveScope {
    OnlyTopLevel,
    Current,
}

pub enum ResolveMatcher<'a> {
    OneOf(Vec<ResolveMatcher<'a>>),
    Ident(&'a str),
    Assignment {
        lhs: Option<Box<ResolveMatcher<'a>>>,
        rhs: Option<Box<ResolveMatcher<'a>>>,
    },
    FunctionCall {
        function_name: Option<Box<ResolveMatcher<'a>>>,
        parameters: Option<Box<ParameterMatcher<'a>>>,
    },
    ObjectLiteral {
        key: Option<&'a str>,
        value: Option<Box<ResolveMatcher<'a>>>,
    },
}

enum ParameterMatcher<'a> {
    PositionalParameter {
        index: Option<usize>,
        value: Option<ResolveMatcher<'a>>,
    },
    NamedParameter {
        name: Option<&'a str>,
        value: Option<ResolveMatcher<'a>>,
    },
    // Maybe also something for *args and **kwargs
}

This would allow the LspContext to check for the ctx.attr pattern, and instruct the LSP to look for the appropriate AST nodes. For Bazel, that might look something like this:

fn get_dotted_identifier_resolve_options<'a>(
    &self,
    _current_file: &LspUrl,
    within_function: Option<&'a str>,
    segments: &[&'a str],
) -> anyhow::Result<Vec<ResolveOptions<'a>>> {
    let mut result = vec![];
    if within_function.is_none()
        || segments.len() < 3
        || segments[0] != "ctx"
        || segments[1] != "attr"
    {
        return Ok(result);
    }

    for function_name in ["rule", "repository_rule"] {
        result.push(ResolveOptions {
            scope: ResolveScope::OnlyTopLevel,
            matcher: ResolveMatcher::FunctionCall {
                function_name: Some(Box::new(ResolveMatcher::Ident(function_name))),
                parameters: Some(vec![
                    ParameterMatcher::NamedParameter {
                        name: Some("implementation"),
                        value: Some(ResolveMatcher::Ident(within_function.unwrap())),
                    },
                    ParameterMatcher::NamedParameter {
                        name: Some("attrs"),
                        value: Some(ResolveMatcher::ObjectLiteral {
                            key: Some(segments[2]),
                            value: None,
                        }),
                    },
                ]),
            },
        })
    }

    Ok(result)
}

The current LSP implementation to look for the struct definition would be written as follows (though I think it shouldn't be up to the LspContext to have to define this—it would only define additional lookup patterns).

ResolveOptions {
    // The currently impl is actually `ResolveScope::OnlyTopLevel`, but
    // it really should be `::Current`.
    scope: ResolveScope::Current,
    matcher: ResolveMatcher::Assignment {
        lhs: Some(Box::new(ResolveMatcher::Ident(segments[0]))),
        rhs: Some(Box::new(ResolveMatcher::FunctionCall {
            function_name: Some(Box::new(ResolveMatcher::Ident("struct"))),
            parameters: Some(vec![ParameterMatcher::NamedParameter {
                name: Some(segments[1]),
                value: None,
            }]),
        })),
    },
}

Another option would be to make the AST public (#77), and simply provide callbacks that take in the AST node and some information about the identifier being looked up, and simply returns true or false, or perhaps the matching AST node (so you can match on a function call, but jump to some specific argument). Actually, I suppose the proposal above needs some mechanic as well to do the same. Any thoughts?

ndmitchell commented 10 months ago

I don't have any great ideas here - it seems a really hard problem. The definition is miles away and in a very different format. As such it seems like a hook that by default does nothing, but can be customized arbitrarily, seems like the sensible way of going. But I wonder - how valuable is this? For the amount of work required, it should probably be very valuable.

cjhopman commented 10 months ago

Does the LSP not have type information? I'd expect this to be based on (for buck2) seeing an AnalysisContext type so that we could pass that around in different ways without losing ide functionality. It still needs special handling for the attrs type since the fields on it are context dependent.

I'd think that would be single dotted things each time though, i.e. we'd use information related to the AnalysisContext type to understand that ctx.attrs is an AnalysisAttrs (idk what type it actually is) and then understand ctx.attrs.foo as just the .foo on that type.

I don't see how we really do it once we are passing around the ctx to other functions. Maybe we need the type to take arguments about the attrs that it has, but then maybe we'd want a more structured api around attr definitions... but you could imagine something like:

def some_cxx_utility(ctx: AnalysisContext[CxxToolchainAttrs], ...):
   cxx_toolchain = ctx.attrs._cxx_toolchain
   ...

Now, we should provide stronger guidance that we shouldn't be passing around ctx like that at all (and I'd argue we should split out the attrs entirely and then maybe the whole issue is moot).

ndmitchell commented 6 months ago

The LSP was written after the type checker, so there isn't any type information used in the LSP. We should definitely fix that one day.