Open JayWhite2357 opened 1 month ago
@iajoiner may be able to provide a bit of guidance on this issue, however, there is no concrete plan for this. It would be wise to discuss this plan with @iajoiner and myself to ensure that things are heading in the right direction before building in earnest.
/bounty $10000
/attempt #235
with your implementation plan. Note: we will only assign an issue if you include an implementation plan with a time estimate. Additionally, to be assigned an issue, you must have previously contributed to the project. You can still work on an issue and submit a PR without being assigned./claim #235
in the PR body to claim the bountyThank you for contributing to spaceandtimelabs/sxt-proof-of-sql!
Add a bounty ā¢ Share on socials
Attempt | Started (GMT+0) | Solution |
---|---|---|
š“ @varshith257 | Oct 9, 2024, 1:50:03 PM | WIP |
š“ @animeshd9 | Oct 9, 2024, 8:21:53 PM | WIP |
š¢ @TomBebb | Oct 15, 2024, 10:39:12 PM | WIP |
š¢ @deependujha | Oct 16, 2024, 12:17:06 PM | WIP |
/attempt #235
Algora profile | Completed bounties | Tech | Active attempts | Options |
---|---|---|---|---|
@varshith257 | 15 bounties from 7 projects | Go, Scala, TypeScript & more |
ļ¹239 |
Cancel attempt |
@JayWhite2357 I will connect on discord for the same
/attempt #235
Hi @JayWhite2357 and @iajoiner,I'd like to proceed with the transition from the LALRPOP-based parser to sqlparser for the proof-of-sql-parser crate. I have gone through current implementation and working of proof-of-sql parser. Here's the approach I have in mind:
[ ] I will add the sqlparser crate to handle the parsing of SQL queries.
[ ] Then I move ahead to the replace the LALRPOP Parser
sql.lalrpop
will be replaced by a new parsing function that utilizes sqlparser to parse[ ] ~Next I step to map the sqlparser AST to existing intermediate AST (intermediate_ast::SelectStatement and other structures) to minimize changes to the rest of the crate.This will ensure the current logic built on top of the AST stays intact~ Next use the sqlparser::ast structures entirely, phasing out the current intermediate AST.
[ ] I will thoroughly test the changes incorporate at every stage to ensure all current functionality is maintained.
[ ] I think we can also adapt the existing ParseError
and ParseResult
types to handle errors from sqlparser
Before moving forward with this plan in my mind, I wanted to check in and ensure this approach aligns with the migration goals. If there are any specific concerns or alternative suggestions, Iād be happy to adjust the plan.
I am looking forward to your feedback!
I'm looking into it a bit, and I feel like sqlparser::ast::Query
is thing that would be analogous to intermediate_ast::SelectStatement
.
My initial intent was for the sqlparser AST to replace the intermediate AST. This would mean that QueryExpr::try_new
would accept a sqlparser::ast::Query
(or similar) instead of a intermediate_ast::SelectStatement
. In this situation, most (if not all) of the code changes would be in the proof_of_sql::sql::parse
module.
Only replacing lalrpop with sqlparser, but not replacing intermediate AST is an interesting idea that I hadn't thought of. Perhaps it makes sense as a stepping stone, but I feel like it can't be the end goal here.
@iajoiner might have some feedback on this.
Thanks @JayWhite2357 for your view on this. @iajoiner Any insights on this
Thanks @JayWhite2357 for your view on this. @iajoiner Any insights on this
I chatted with him. He's on the same page. The goal here should be to remove the intermediate AST altogether.
Got it! I just started with basic SELECT parsing logic to see how sqlparser works with an example.
@JayWhite2357 I have joined Discord. If we have a thread in a related channel on discord, we can easily track all progress and discuss more about it in moving forward.
@varshith257 I can chat with you on Discord. What's your handle there?
@varshith257 I can chat with you on Discord. What's your handle there?
@iajoiner I'd: vamshi_257
Cool! Just sent you a message there.
/attempt #235
/attempt #235
@JayWhite2357 Is @iajoiner is on holiday? I am also thinking of connecting with you on discord :
Here's my ID: : vamshi_257
@varshith257 we're all a bit swamped at the moment. I connected with you on discord as well.
The main effort here should be adding a function here: https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/fd3a339fa97ae0e91f4a0c9b65c46924fcc07f7e/crates/proof-of-sql/src/sql/parse/query_expr.rs#L45 The function should look like this:
pub fn try_new_from_sqlparser(
ast: sqlparser::ast::Query,
default_schema: Identifier,
schema_accessor: &dyn SchemaAccessor,
) -> ConversionResult<Self>;
This should be able to be added without changing any existing code. Once it is in place and well tested, we can rename it to try_new
and drop the existing try_new
method. After that, the proof-of-sql-parser
crate will be unused and we can drop it along with related code.
There are probably some complexities here that I'm missing. It may not be as straightforward as this. (For example, since QueryExpr
depends on Identifier
, which is a proof-of-sql-parser
type, we will need to move or replace it before dropping proof-of-sql-parser
completely.)
@varshith257 @TomBebb @deependujha Some possible replacements
proof_of_sql_parser::Identifier
-> sqlparser::ast::Ident
proof_of_sql_parser::ResourceId
-> Vec<Ident>
(we can move ResourceId
into proof-of-sql
and redefine it as [Ident; 2]
if necessary)proof_of_sql_parser::intermediate_ast::BinaryOperator
-> sqlparser::ast::BinaryOperator
proof_of_sql_parser::intermediate_ast::UnaryOperator
-> sqlparser::ast::UnaryOperator
proof_of_sql_parser::intermediate_ast::Expression
-> sqlparser::ast::Expr
proof_of_sql_parser::SelectStatement
-> sqlparser::ast::Query
A few more ideas:
proof_of_sql_parser::utility
needs to be modified to use sqlparser
structs instead.Timestamp(Option<u64>, TimezoneInfo)
. (@Dustin-Ray any insights?)@varshith257 @TomBebb @deependujha Let's get this done ASAP and we will be more than willing to help unblock you guys if necessary. Please feel free to shoot me a message either here or on Discord (username: chloemargaret
) if you guys have any questions.
Sure @iajoiner. Lets aim to complete this ASAP within a few weeks
Just discussed this with @iajoiner some more.
The plan should be to focus on adding the pub fn try_new_from_sqlparser
function.
Once this is added, we can make the appropriate replacements.
This has the benefit of not being a breaking change and not requiring large amounts of refactorization at any point. Instead the swap can be done after the new sqlparser
functionality is well-tested.
This should be able to be added without changing any existing code: specifically, the proof-of-sql-parser
crate should not need to be modified until after the try_new_from_sqlparser
function is completed.
@JayWhite2357 @iajoiner, I have DM you on Discord about a blocker. Have a look
@JayWhite2357 @animeshd9 @deependujha @TomBebb @varshith257 Proposed steps
sqlparser.rs
to the proof-of-sql-parser
crate and adapt a SelectStatement
to sqlparser::ast::Query
. #338 The subtask has been taken by me.proof-of-sql-parser
constructs in proof-of-sql
crate with corresponding sqlparser
constructs. Your work in the previous step should make conversion very easy. Note that sqlparser
construct are mostly strict supersets of their corresponding proof-of-sql-parser
constructs you need to add error handling for what PoSQL doesn't currently support. Here are some suggestions:
Vec<Ident>
/ sqlparser::ast::ObjectName #352proof_of_sql_parser::posql_time
for sqlparser
#351 QueryExpr::try_new
:
pub fn try_new(
ast: sqlparser::ast::Query,
default_schema: sqlparser::ast::Ident,
schema_accessor: &dyn SchemaAccessor,
) -> ConversionResult<Self>;
There should be no construct of proof-of-sql-parser
still used in the proof-of-sql
crate any more.
proof-of-sql-parser
crate and simplify the repo to be a single crate: the proof-of-sql
crate. #356@animeshd9 @TomBebb @varshith257 Since we need proof-of-sql-parser
-> sqlparser
adaptation to make sure we don't add additional use cases of proof-of-sql-parser
constructs in proof-of-sql
crate (in fact #334 already adds additional use cases which complicates this issue) I decided to do the first step myself.
The rest of the subtasks will be added as separated issues linked back here with separate bounties. We will set up deadlines so that you guys know when each task has to be completed before the bounty period ends and we start to work on the task internally. We will usually give a heads up at least a week prior to the deadline for each subtask.
Background and Motivation
Currently, we have an in-house parser that is built on the
lalrpop
parser-generator. This has been good while the supported syntax has been simple. However, as the supported syntax has grown, we need a more comprehensive parser.The
sqlparser
crate is the parser used by DataFusion, which is part of the Arrow ecosystem. It is a feature-rich parser that ultimately will require less code maintenance. It isno_std
compatible, so there should be no issues integrating it.Changes Required
proof-of-sql-parser
usage within theproof-of-sql
crate withsqlparser
usage.proof-of-sql
crate.