sqlparser / gsp_demo

General SQL Parser is a Java/.NET library. It provides a rich set of APIs to parse, decode, analyze and rewrite SQL scripts. Supports more than 10 major database platforms. This repo provides demos and testcases to help people make better use of the General SQL Parser
http://www.sqlparser.com
23 stars 12 forks source link

Enhancement - SQLFlow JSON Output: Include information from INSERT and DELETE statements in procedures. #8

Open fprima opened 5 years ago

fprima commented 5 years ago

We consider this as lineage because in some cases we may have store procedures that produce a full refresh of data. That is doing a Truncate/Delete followed by and Insert. For these type of store procedures, we will have no lineage from the JSON. We have seen in the SQLFlow demo in the Sample SQl that relationships are included in the JSON for INSERT statements. That is why we were wondering why they are not included in the store proc JSON.

sqlparser commented 5 years ago

@fprima Is it possible to post a sample stored proc?

fprima commented 5 years ago

ProcedureExample.txt

fprima commented 5 years ago

@sqlparser above please find a sample procedure.

sqlparser commented 5 years ago

@cnfree add procedure in the dbobjs section of the JSON output in SQLFlow output:

            {
                "arguments": [
                    {
                        "coordinates": [
                            {
                                "x": 8,
                                "y": 24
                            },
                            {
                                "x": 8,
                                "y": 34
                            }
                        ],
                        "id": "2",
                        "name": "pv_num1",
                        "datatype":"number",
                        "inout":"in"
                    },
                    {
                        "coordinates": [
                            {
                                "x": 8,
                                "y": 36
                            },
                            {
                                "x": 8,
                                "y": 40
                            }
                        ],
                        "id": "4",
                        "name": "pv_num2",
                        "datatype":"number",
                        "inout":"inout"
                    }
                ],
                "coordinates": [
                    {
                        "x": 8,
                        "y": 15
                    },
                    {
                        "x": 8,
                        "y": 22
                    }
                ],
                "id": "1",
                "name": "student_department_mod",
                "type": "procedure"
            }
sqlparser commented 5 years ago

@fprima what's kind of information do you need in the JSON output for delete and update statement in the procedure.

create or replace procedure student_department_mod(pv_num1 in number,pv_num2 in out number)
is
   lr_address      address%rowtype;

begin
  --insert
  insert into STUDENT (STUDENT_ID, NAME, AGE)
  values (9998646, 'JON', 34);

  -- select
  select *  into lr_address from ADDRESS where STUDENT_ID  = pv_num2;

   --update
  update DEPARTMENT
  set modified_date = sysdate
  where department_id  = 4;   

  --delete
  delete from EMPLOYEE where employee_id  = 9998;
end;
cnfree commented 5 years ago

@sqlparser
Implemented it.

fprima commented 5 years ago

@cnfree thanks for the update.

I checked the JSON output and see some issues. For example in the update statement we have "set modified_date = sysdate" and in the output I see Target: "column": "modified_date" and Source: "column": "modified_date" instead of sys date. Also under dbobs I see branches for the procedure name/parameters, tables (STUDENT and ADDRESS), RESULT_OF_SELECT-QUERY, UPDATE-SET, but I do not see a branch for INSERT and DELETE. Also I was expecting an entry to the target table 'lr_address' and relations entry for 'where STUDENT_ID = pv_num2'.

Franco

fprima commented 5 years ago

@cnfree another item is for example In the JSON below, "clauseType" is missing for the STUDENT_ID column although type is 'frd'. It should have "clauseType = where" in source "id": "5_0", "type": "frd", "effectType": "select", "target": { "id": "15_0", "column": "STUDENT_ID", "parentId": "14", "parentName": "RESULT_OF_SELECT-QUERY", "coordinates": [ { "x": 12, "y": 10 }, { "x": 12, "y": 11 } ] }, "sources": [ { "id": "13", "column": "STUDENT_ID", "parentId": "11", "parentName": "ADDRESS", "coordinates": [ { "x": 12, "y": 48 }, { "x": 12, "y": 58 } ] } ]

fprima commented 5 years ago

@cnfree and lastly, were this changes committed to the package that we can download?

cnfree commented 5 years ago

@fprima The clauseType property is just used by join relation, but not frd relation.

cnfree commented 5 years ago

@cnfree and lastly, were this changes committed to the package that we can download?

@sqlparser please answer for this question.

sqlparser commented 5 years ago

Please update the DataFlowAnalyzer.java to support procedure input/output parameter. @cnfree

cnfree commented 5 years ago

@fprima please checkout https://github.com/sqlparser/gsp_demo and run DataFlowAnalyzer.java.

And the new version handled the sysdate keyword as a constant.

jxcypress commented 5 years ago

去公司官网看了下,东西挺好用,想买个资料无奈官网一个电话都没留,邮件也没人回复