obi1kenobi / trustfall

A query engine for any combination of data sources. Query your files and APIs as if they were databases!
Apache License 2.0
2.38k stars 69 forks source link

Converting complex structs into `FieldValue`s? #560

Closed elenakrittik closed 7 months ago

elenakrittik commented 7 months ago

Hello! I am trying to follow the HackerNews example, but currently struggling with implementing BasicAdapter.resolve_property for my adapter. I have a struct ASTFile (which represents source code contained in a file), which has a rust field ASTFile.body: Vec<ASTStatement>, where ASTStatement is an (admittedly giant) enum representing any statement in that source code's programming language:

#[derive(Debug, Clone, enum_as_inner::EnumAsInner)]
pub enum ASTStatement<'a> {
    Annotation(ASTAnnotation<'a>),
    Assert(ASTValue<'a>),
    /// (identifier, kind, value)
    Assignment(&'a str, ASTAssignmentKind, ASTValue<'a>),
    Break,
    Breakpoint,
    Class(ASTClass<'a>),
    ClassName(&'a str),
    Continue,
    If(ASTValue<'a>, CodeBlock<'a>),
    Elif(ASTValue<'a>, CodeBlock<'a>),
    Else(CodeBlock<'a>),
    Enum(ASTEnum<'a>),
    Extends(&'a str),
    /// (identifier, type_hint, container, body)
    For(&'a str, Option<ASTValue<'a>>, ASTValue<'a>, CodeBlock<'a>),
    Func(ASTFunction<'a>),
    Pass,
    Return(ASTValue<'a>),
    /// (name, args)
    Signal(ASTSignal<'a>),
    /// (expression_being_matched_on, vec_of_patterns)
    Match(ASTValue<'a>, Vec<ASTMatchPattern<'a>>),
    While(ASTValue<'a>, CodeBlock<'a>),
    Variable(ASTVariable<'a>),
    Value(ASTValue<'a>),
}

My vertex has only one starting edge - the file itself:

#[derive(Debug, Clone, TrustfallEnumVertex)]
pub enum Vertex<'a> {
    File(Rc<ASTFile<'a>>),
}

But when i try to implement resolve_property and return ASTFile's .body as follows:

    fn resolve_property<V: trustfall::provider::AsVertex<Self::Vertex> + 'a>(
        &self,
        contexts: trustfall::provider::ContextIterator<'a, V>,
        type_name: &str,
        property_name: &str,
    ) -> trustfall::provider::ContextOutcomeIterator<'a, V, trustfall::FieldValue> {
        match (type_name, property_name) {
            ("File", "body") => resolve_property_with(contexts, field_property!(as_file, body)),
            _ => unreachable!(),
        }
    }

I get this error:

the trait bound `FieldValue: From<ASTStatement<'_>>` is not satisfied
the following other types implement trait `From<T>`:
  <FieldValue as From<bool>>
  <FieldValue as From<i8>>
  <FieldValue as From<i16>>
  <FieldValue as From<i32>>
  <FieldValue as From<i64>>
  <FieldValue as From<u8>>
  <FieldValue as From<u16>>
  <FieldValue as From<u32>>
and 11 others
required for `ASTStatement<'_>` to implement `Into<FieldValue>`
required for `FieldValue` to implement `From<Vec<ASTStatement<'_>>>`
1 redundant requirement hidden
required for `Vec<ASTStatement<'_>>` to implement `Into<FieldValue>`

FieldValue only has variants for primitive types, so i can't implement Into<FieldValue> either (as it is a composite type).

Here i see a usage of byUser on a Comment (which to me looks like what i need to resolve this issue), but in the code i don't see a match arm for it, only byUsername. Magic.

Any tips? If my AST structure is problematic, i'll be happy to refactor it. Here is the PR if you want to test some things yourself. Thanks in advance.

obi1kenobi commented 7 months ago

Ah, sorry you ran into this — this is a symptom of our poor onboarding. I'm making a note to address this in our docs.

byUser is an edge, not a property, so it's defined here in resolve_neighbors.

Properties are only scalar or list-of-scalar (or list-of-list...) values, all of which are expressible as FieldValue. Anything more complex than that should be modelled as an edge, unless you'd prefer to encode it in some way (e.g. JSON).

Any tips? If my AST structure is problematic, I'll be happy to refactor it.

Two things I'd recommend.

First, I'd recommend going about adopting Trustfall in the opposite direction: schema and queries first, not code-first. Often by thinking about the set of queries you'd like to accomplish, you'll find cleaner ways to model the schema. Keep in mind that Trustfall's declarative query language isn't as expressive as Rust, so a great schema is the key to making your queries ergonomic, elegant, and sometimes even possible at all. I strongly recommend against directly porting a Rust representation 1-to-1 into a Trustfall schema, because you'll quickly run into ergonomics and expressiveness limitations.

Second, I'd recommend modeling the high-level "business logic" view of the data, not the AST or implementation details of it. So: classes have functions, classes may extend other classes, functions take arguments and have return types, etc. Not: classes contain a series of tokens, some of which may indicate they extend another class, others that may define functions, which in turn have some tokens that describe what might be arguments or return types, etc. etc. In the former, writing a Trustfall query like "class that extends X and has function Y" is trivial. In the latter, it's a nightmare.

The ability to model your data independently of the underlying representation is the most powerful tool Trustfall provides. This was the key thing that made cargo-semver-checks viable as a tool — there the underlying representation changes ~once a month, but our schema and queries do not. Choosing a great schema is the difference between feeling like Trustfall is awesome and feeling like it's an overhyped pile of junk 😅

It takes a lot of practice to become able to design a large schema all up front and at the right abstraction level so it feels ergonomic and empowering. Consider starting small with a few queries over a subset of your data (e.g. classes, extends relationships, functions, and function arguments + return type) and then extending from there.

elenakrittik commented 7 months ago

Hi! Been a little busy, and wanted to acknowledge that i saw your comment, you're truly helpful, and whenever i get back to this i'll bombard you with more questions report my progress :) I think it's not really poor onboarding, it's just me looking way too fast through the docs/code. Will close this as this is a "me" problem, and will use discussions for general questions in the future!

obi1kenobi commented 7 months ago

No worries, feel free to open either issues or discussion topics — I'm always happy to help! Thanks again for trying out Trustfall. It's really helpful to know what things seem confusing, and I've been taking copious notes on what topics to tackle in the docs as a result of questions like this.