twig engine skeleton - part II

colin-kiegel commented 8 years ago

Here is some skeleton code - as a suggestion.

I stripped out my parser, lexer, loader and runtime implementations and I refactured what's left according to our previous discussion #10.

The facade is currently called Engine
The Environment is currently called Setup
The unicorn module is currently called engine

So what's left is the initialization logic of Engine - i.e. this works:

use twig::{Engine, Setup};
let twig = Engine::new(Setup::default()).unwrap();
// ..

And this works too (although extension::Debug is only a dummy):

use twig::{Setup, Engine};
use twig::extension::Debug;

let mut setup = Setup::default()
     .set_strict_variables(true)
     .add_extension(Debug::new()).unwrap();
let engine = Engine::new(setup).unwrap();

And it contains some error-code definitions.

colin-kiegel commented 8 years ago

Please tell me, what you think. ;-)

colin-kiegel commented 8 years ago

PS: As a suggestion, I moved the tokens inside of the parser like the lexer, i.e. twig::engine::parser::tokens because I think they are a contract between parser and lexer - outside of the parser module they are only relevant to extensions defining token parsers. It was a little test - we can relocate to twig::engine::tokens. Or alternatively try to re-export some extension-stuff in extension::api.

colin-kiegel commented 8 years ago

PPS: Since this Pull Request is still rather large - I recommend a local checkout to get a feeling for it. ;-)

colin-kiegel commented 8 years ago

Shall I split it into some smaller chunks?

Nercury commented 8 years ago

OK, let's continue. Some interesting things in life go in a way a bit :)

I would like to finish up the lexer before starting with parser. However, I already was studying your parser before and I think we should not try to do it like PHP does in this particular case.

Instead of having a GenericNode with attributes, I suggest to construct static node hierarchy. If some node hierarchy item needs to be extensible, it is still very doable, but this extension point can simply be explicit trait object.

colin-kiegel commented 8 years ago

np - lucky for you. ;)

Sure, doing it step by step sounds like a good plan.

I removed attribute-related methods from Node trait. It now reads

pub trait Node : Debug {
    fn tag(&self) -> &str;
    fn position(&self) -> &Position;
    fn children(&self) -> &Vec<Box<Node>>;
    fn children_mut(&mut self) -> &mut Vec<Box<Node>>;
}

I removed GenericNode, since it would be too early for those kind of details.

I think I will focus on the two remaining 'holes' in my lexer next: interpolation and string state are both unimplemented(). I'll try to fix that the next days and also have a look if I can reuse some of your code for that. Anyway in this pull request both lexer and parser are basically not-existent.

colin-kiegel commented 8 years ago

PS: I also had a bad feeling how twigphp used the attribute array for complicated magic. I share your feeling that we should try to make this logic more explicit.

Nercury commented 8 years ago

I think I will focus on the two remaining 'holes' in my lexer next: interpolation and string state are both unimplemented()

Hey, that's implemented in my code :). Let me focus on merging my lexer into yours, and it will become magically implemented :). And I will learn your code better.

About magic and attributes - well, I found it is very much possible to get rid of it.

Take a look at Expr node here: nodes/node/expr.rs

And here the parsing for Expr, matching PHP expression parser implementation: nodes/parser/expr.rs

It kind of proves that it is very much possible :)

The problem I see in Twig code, is that it tries to be very abstract, but still relies on instanceof calls for type checking. The problem with that, however, is that once it has these hardcoded instanceof checks in core, the abstraction is leaked, and there is no point in it anymore.

You might ask, what if extension wants to add a new kind of node in this hierarchy? I suggest that in this case we add another node type Expr::Custom. So far I had no need for it, although I admit that my parser is far from complete.

Nercury commented 8 years ago

By the way, this is going to clash horribly with the lexer skeleton :/

Also, I want to leave errors ungeneralized, especially bubble up the SyntaxError to top as the most important error you can receive from twig.

I would like the hierarchy in single crate boundaries to be done in very boring and obvious way. Enums.

Of course, we will inevitably have top master TwigError that we will have to expose to external users, that's where we will implement the std::error::Error trait.

Also the trace with locations in rust source code is cool, but I very much doubt the usefulness of that. Of course, that's my opinion.

Here we go, said all the things that were piling up :)

colin-kiegel commented 8 years ago

I understand the bubbling-up-thing. I will try to rework the error-Display-representation to (at least) mimic this behaviour. Still I find the location trace more than useful. I am not the kind of guy that likes stepping forth and back with a debugger and setting breakpoints. Instead I want fast and rich debug information. I know it's not for free - its like an investment and will slow down initial implementation a bit - but the return will be way faster debugging. I think it will enable us to understand and fix some errors solely based on Debug::fmt on errors instead of cumbersome local debugging. I would still like to hold on to that, if we can achieve a nice representation for users too.

Nercury commented 8 years ago

I get that. I was doing the same using logs.

How about we implement tracing as a layer on any error, so there is no need for trait objects. For example, turn Error<T> into struct like this:

/// Runtime error with stack trace.
#[derive(Clone, Debug)]
pub struct Traced<T: Clone, Debug> {
    pub error: T,
    /// Trace entries which may have location and/or another error. Maybe another struct.
    pub trace: Vec<(Location, Option<T>)>,
}

/// Implement display where T: Display
/// Maybe implement From<Traced<OtherT>> where T: From<OtherT>

The difference here would be that all entries in trace would probably have to be converted into T, which would be publicly visible "main" TwigError.

colin-kiegel commented 8 years ago

Ok, here are some changes:

refactor Extension + ExtensionRegistry (tell don't ask)
move to api::{lexer, node, parser, token}
move engine::Setup to setup::Setup
some cleanup (like removing some error codes, option::Charset, etc.)

You might ask, what if extension wants to add a new kind of node in this hierarchy? I suggest that in this case we add another node type Expr::Custom. So far I had no need for it, although I admit that my parser is far from complete.

In my mind extensions define most node types by themselves, so we need to define what we expect those nodes to implement. Even if and for``nodes could just be defined by thecore` extension. We could at least try, if that works.

Nercury commented 8 years ago

so we need to define what we expect those nodes to implement.

I expect them to implement:

Abstract extension, which configures what is available for lexer and parser;
Abstract token parser, which transforms token stream into node tree.
Function evaluations, so they can be dispatched to extensions at runtime (for functions, filters, operators).

The node tree will be evaluated, somehow. If the base collection of nodes is enough for extension to achieve what's needed, we don't need abstract nodes here.

The issue arises because some extension might want to add something to node tree that can not be evaluated by base collection of nodes. In this case the evaluation will need to be provided by extension anyways, so I would argue that in such case an extension can tell explicitly what the evaluation is in the form of CustomNode.

I imagine two routes for this:

CustomNode<Box<Evaluate>>. Here, the extension provides a way to evaluate this node.
CustomNode { id: u32 }. Here, the extension must provide evaluation for id the same way it provides token parsers for tokens by token name.

I think the second way is more flexible, because the extension may implement multiple ways to evaluate node for different backends.

I hope I managed to show here that abstraction can be done in different ways and putting it in the center (Node) is not necessary.

Nercury commented 8 years ago

Continued in #20.

rust-web / twig

twig engine skeleton - part II #12