cloudspannerecosystem / memefish

memefish is the foundation to analyze Spanner SQL
https://cloudspannerecosystem.dev/memefish/
MIT License
76 stars 19 forks source link

Add the support of the sequence functions #63

Closed git-hulk closed 1 year ago

git-hulk commented 1 year ago

For more information can refer: https://cloud.google.com/spanner/docs/reference/standard-sql/sequence_functions

makenowjust commented 1 year ago

IMO, we should treat SEQUENCE expressions as argument modifiers like INTERVAL.

https://github.com/cloudspannerecosystem/memefish/blob/1b120de904be1345085ee862d3d061476dc2db2f/ast/ast.go#L881-L896

However, adding new fileds to Arg seems bad to me. We should split Arg to a new interface Arg and the following three Arg types are needed:

@git-hulk What do you think?

git-hulk commented 1 year ago

IMO, we should treat SEQUENCE expressions as argument modifiers like INTERVAL.

https://github.com/cloudspannerecosystem/memefish/blob/1b120de904be1345085ee862d3d061476dc2db2f/ast/ast.go#L881-L896

However, adding new fileds to Arg seems bad to me. We should split Arg to a new interface Arg and the following three Arg types are needed:

  • IntervalArg
  • SequenceArg
  • ExprArg

@git-hulk What do you think?

Good point, SequenceArg is exactly better than SequenceExpr. I will fix it soon.

makenowjust commented 1 year ago

@git-hulk You assume all sequence functions take only one argument. It is true so far, but will it be true in the future? I think this assumption is poorly extensible and we should not accept such an assumption.

Could you check my above comment again? I guess you have not understood this yet. Feel free to ask me if you have any questions/comments!

git-hulk commented 1 year ago

@git-hulk You assume all sequence functions take only one argument. It is true so far, but will it be true in the future? I think this assumption is poorly extensible and we should not accept such an assumption.

Could you check my above comment again? I guess you have not understood this yet. Feel free to ask me if you have any questions/comments!

Yes, I misunderstood your comment. What you mean is to treat the Sequence function as a normal function, and then support parsing the sequence argument in parseArg, is that right?

makenowjust commented 1 year ago

@git-hulk Yes, that's right!

git-hulk commented 1 year ago

@git-hulk Yes, that's right!

Done, a lot of sorry for misunderstanding your first comment.

makenowjust commented 1 year ago

@git-hulk Okay. We encountered the next problem: mismatching between the AST definitions and the parser structure.


For now, parseArg can return one of three kinds of arguments: SEQUENCE arguments, INTERVAL arguments, or usual arguments. However, the Arg type only supports INTERVAL and usual arguments, but SEQUENCE arguments are represented by Arg of SequenceArg expression. In other words, parseArg is now parseArgOrArgOfSequenceArg. It sounds really bad to me.

The first solution to this problem is extending Arg to support SEQUENCE argument like:

type Arg struct {
    // pos = Interval || Sequence || Expr.pos
    // end = (IntervalUnit ?? Expr).end

    Interval token.Pos // position of "INTERVAL" keyword
    Sequence token.Pos // position of "SEQUENCE" keyword

    Expr         Expr
    IntervalUnit *Ident // optional
}

However, it seems wrong to me because this Arg can have two states at the same time. That is, when both Interval and Sequence of Arg are valid, it is not possible to determine which one should be treated correctly.


So, let's make Arg an interface and separate the roles of Arg (usual, INTERVAL, and SEQUENCE arguments).

type Arg interface {
    Node
    isArg()
}

func (ExprArg)     isArg()
func (IntervalArg) isArg()
func (SequenceArg) isArg()

type ExprArg struct {
    // pos = Expr.pos
    // end = Expr.end

    Expr Expr
}

type IntervalArg struct {
    // pos = Interval
    // end = Unit.end

    Interval token.Pos // position of "INTERVAL" keyword

    Expr Expr
    Unit *Ident
}

type SequenceArg struct {
    // pos = Sequence
    // end = Expr.end

    Sequence token.Pos // position of "SEQUENCE" keyword

    Expr Expr
}

Then, the parseArg should be rewritten:

func (p *Parser) parseArg() ast.Arg {
    if i := p.tryParseIntervalArg(); i != nil {
        return i
    }
    if s := p.tryParseSequenceArg(); s != nil {
        return s
    }
    return p.parseExprArg()
}

Finally, by implementing tryParseIntervalArg, tryParseSequenceArg, and parseExprArg, the AST definitions and the parser structure are matched.


Sorry for the long comment. This is my intention for the first comment. You could also feel free to ask me.

git-hulk commented 1 year ago

I indeed thought about the solution when writing the last command, but it needs to refactor the struct arg into an interface so gave up this without asking. Thanks for your detail comment.

git-hulk commented 1 year ago

@makenowjust Fixed it according to your comments, but one thing to be noticed is that will change the call expression's arguments structure.

git-hulk commented 1 year ago

Done, can help to take a look while you're free.

makenowjust commented 1 year ago

@git-hulk Thank you, and sorry for letting you rewrite it many times.

git-hulk commented 1 year ago

@git-hulk Thank you, and sorry for letting you rewrite it many times.

My bad for missing so many places, thanks for pointing out.