modeldba / sql-surveyor

High-level SQL parser. Identify tables, columns, aliases and more from your SQL script in one easy to consume object. Supports PostgreSQL, MySQL, SQL Server and Oracle (PL/SQL) dialects.
https://modeldba.com/sql-surveyor
MIT License
27 stars 2 forks source link

Parsing PLSQL store procs #4

Open sfeher-p opened 3 years ago

sfeher-p commented 3 years ago

Hi, great library! QQ on parsing with SQLDialect.PLSQL - is there any specific setup needed to detect STORED_PROCEDURE entries ? I can see the lines with store procs skipped, but they appear at the end of the tokens array, including ';'.

mtriff commented 3 years ago

Thanks! I haven't tested with STORED_PROCEDURE entries, but happy to add that functionality. Some questions for you:

  1. Can you give an example of the SQL that you're trying to parse?
  2. What would you like SQL Surveyor to output for STORED_PROCEDURE entries?
  3. Are there specific attributes that you would like captured/easily accessible in the output object?
sfeher-p commented 3 years ago

Thanks for your quick reply!

  1. I have a few rather large SQL scripts I'm trying to analyze, and they consists of regular SQL ops and store proc calls. This is a small sample.
    
    /* *********** Test Script - <Name> *********** */

DECLARE

/* CHANGES TO BE MADE FOR DEPLOYMENT: 
*/
tenantName                  VARCHAR2 (32) := 'Test 1';
tenant                      VARCHAR2 (32) := 'XYZ1';
tenantCode                  VARCHAR2 (32) := '00123456';
country                     VARCHAR2 (32) := 'US';

groupId                     NUMBER := NULL;
admin_groupId               NUMBER := NULL;
userId                      NUMBER := NULL;
configID                    NUMBER := NULL;
shift                       DATE := NULL;
pmId                        NUMBER := NULL;
consumer                    CONSTANT VARCHAR2 (32) := 'CONSUMER';
typeId_1                    NUMBER := NULL;

/* GQL TEST */
testProcCode        VARCHAR2 (32) := 'TEST_GRAPHQL';
proc01              VARCHAR2 (32) := testProcCode;
proc02              VARCHAR2 (32) := testProcCode;
proc03              VARCHAR2 (32) := testProcCode;

BEGIN

-- Set cutoff time to 8pm.
shift := TO_DATE ((TO_CHAR (TRUNC (SYSDATE, 'D'), 'YYYYMMDD') || '200000'), 'YYYYMMDDHH24MISS');

/**/

/**** Set up configuration ****/

cfg_operations.setup1(tenantCode, 'all', 'setup.url', 'https://css.tenant_name.org/', 'en');

INSERT INTO config_c2nt (cCode, description, name3, createdBy, modifiedBy)
    VALUES (tenantCode, tenantName, tenant, 'default', 'default');

INSERT INTO config_c2nt_config (cCode, mla1, ccd3, tz2,createdBy, modifiedBy)
    VALUES (tenantCode, 5, country, tzID2, 'default', 'default');

/**/

/**** User and Group configuration ****/

-- Set up the level group.
INSERT INTO cfg_group (groupID, cCode, description, groupType, createdBy, modifiedBy)
    VALUES (cfg_group_seq.NEXTVAL, tenantCode, 'System Admins', 'TENANT_ADMIN', 'default', 'default')
    RETURNING GROUP_ID INTO admin_groupId;

-- Set up standard platform user in the level group.
INSERT INTO cfg_user (userID, groupID, cCode, ulogin, upwd, utype, createdBy, modifiedBy)
    VALUES (cfg_user_seq.NEXTVAL, admin_groupId, tenantCode, 'platform_co', 'kjsadflasfhlasfdjsald=', 'PLATFORM_TEST_TENANT_ADMIN', 'default', 'default')
    RETURNING userID INTO userId;
INSERT INTO cfg_user_profile (userID, first_name, last_name, email, dp4, dp4Ext, createdBy, modifiedBy)
    VALUES (userId, 'PlatformCo', 'Admin', null, null, null, 'default', 'default');     

-- Set up the tenant admin group.   
INSERT INTO cfg_group (groupID, cCode, parent_groupID, description, groupType, createdBy, modifiedBy)
    VALUES (cfg_group_seq.NEXTVAL, tenantCode, admin_groupId, 'Tenant Admins', 'TENANT_ADMIN', 'default', 'default')
    RETURNING GROUP_ID INTO admin_groupId;      

-- Set up standard perms for the tenant admin group.

/* CHANGES TO BE MADE FOR DEPLOYMENT: 
*   1. If the tenant requires configs.
*/

-- User management perms.
INSERT INTO platfs5_group_perm (groupID, perm_code, enabled_flag, active_flag, createdBy, modifiedBy) 
    VALUES (admin_groupId, 'user_menu', 'Y', 'Y', tenantCode, tenantCode);                          --User Management Menu
INSERT INTO platfs5_group_perm (groupID, perm_code, enabled_flag, active_flag, createdBy, modifiedBy)
    VALUES (admin_groupId, 'user_search', 'Y', 'Y', tenantCode, tenantCode);                        --Search Users
INSERT INTO platfs5_group_perm (groupID, perm_code, enabled_flag, active_flag, createdBy, modifiedBy)
    VALUES (admin_groupId, 'user_add', 'Y', 'Y', tenantCode, tenantCode);                           --Add Users

--Test perms.
INSERT INTO platfs5_group_perm (groupID, perm_code, enabled_flag, active_flag, createdBy, modifiedBy)
    VALUES (admin_groupId, 'my_menu', 'Y', 'Y', tenantCode, tenantCode);                            --Features Menu
INSERT INTO platfs5_group_perm (groupID, perm_code, enabled_flag, active_flag, createdBy, modifiedBy)
    VALUES (admin_groupId, 'test_search', 'Y', 'Y', tenantCode, tenantCode);                        --Search  

/**/

/**** General feature configuration ****/

-- Set up the main feature type for utility accounts.
-- NOTE account-required-flag set to 'Y' since we will be authenticating utility accounts against a RT API or CIF file.
-- Supports Itineris features for utility accounts, authenticated via CIF, for ROTP token, autoaction (we process schedules), and IVR.
INSERT INTO papi5cfg_feature_type
    (feature_type_id, cCode, feature_type_code, feature_type_description, createdBy, modifiedBy,
        account_required_flag, invoice_required_flag, order_by, sale_description_msg, cx_segment_type_code)
    VALUES (papi5cfg_feature_type_seq.nextval, tenantCode, 'UB', 'Customer Doc', 'default', 'default', 
        'Y', 'N', 1, 'Customer Doc', consumer)
    RETURNING feature_type_id INTO typeId_1;

/**/

/* CHANGES TO BE MADE FOR DEPLOYMENT: 
*/

INSERT INTO papi5cfg_feature_proc_config
    (feature_proc_config_id, feature_proc_code, config_name, config_value, createdBy, modifiedBy)
VALUES (papi5cfg_feature_proc_config_seq.nextval, testProcCode, 'SERVICE_URL', 'https://features.sandbox.braintree-api.com/graphql', tenantCode, tenantCode)
    returning feature_proc_config_id INTO configID;

INSERT INTO papi5cfg_feature_proc_config
    (feature_proc_config_id, feature_proc_code, config_name, config_value, createdBy, modifiedBy)
VALUES (configID, testProcCode, 'CUSTOMER_ID', 'PlatformCocorporation', tenantCode, tenantCode);

INSERT INTO papi5cfg_feature_proc_config
    (feature_proc_config_id, feature_proc_code, config_name, config_value, createdBy, modifiedBy)
VALUES (configID, testProcCode, 'CRYPTIZATION_KEY', 'asjkhasldkfahslfdhalskf', tenantCode, tenantCode);

INSERT INTO papi5cfg_feature_proc_config
    (feature_proc_config_id, feature_proc_code, config_name, config_value, createdBy, modifiedBy)
VALUES (configID, testProcCode, 'API_KEY', 'asdhflashfjalsfasldfsafd==', tenantCode, tenantCode);

INSERT INTO papi5cfg_feature_proc_config
    (feature_proc_config_id, feature_proc_code, config_name, config_value, createdBy, modifiedBy)
VALUES (configID, testProcCode, 'FORWARDING_API', 'https://url', tenantCode, tenantCode);

/**/

/**** Configuration options ****/

/* PAPI configuration - Turn on PAPI with login required. */
platfs5_config_operations.setClientOption(tenantCode, 'PAPI_SUPPORTED', 'true');
platfs5_config_operations.setClientOption(tenantCode, 'PAPI_LOGIN_SUPPORTED', 'true');

/* Customer portal configuration */ 
platfs5_config_operations.setClientOption(tenantCode, 'CREATE_USER', 'true');
platfs5_config_operations.setClientOption(tenantCode, 'NOTIF_ADDR', 'http://kjfhska');
platfs5_config_operations.setClientOption(tenantCode, 'SCHEDULE_SOMETHING', 'false');

/* AUDO configuration */
platfs5_config_operations.setClientOption(tenantCode, 'AUDO_REENTER_E2', 'true');

/**/

/**** Action messages for GUI ****/

/**/

/**** Action messages for batch ****/

/* CHANGES TO BE MADE FOR DEPLOYMENT: 
*/

platfs5_config_operations.setActionMessage(tenantCode, 'all', 'expiredMsg.url.text', 'https://css.tenant_name.org/', 'en');
platfs5_config_operations.setActionMessage(tenantCode, 'all', 'expiredMsg.url.link', 'https://css.tenant_name.org/', 'en');

/* Action messages */
platfs5_config_operations.setActionMessage(tenantCode, 'tenantaction-web', 'sdfgsdf.amdStatus.label', 'sdfgsd Status', 'en');
platfs5_config_operations.setActionMessage(tenantCode, 'tenantaction-web', 'gsdfgsdfg.sample.url', 'cm-audience-sample-V12.csv', 'en'); --New Sample parser
platfs5_config_operations.setactionmessage(tenantCode, 'out-server', 'asfasd.msg', 'ecm-action-now-press-1', 'en');

/**/

COMMIT; END;



2. I don't have a preference but probably the package, procedure name, parameters, and the usual tokens representation would be good.

3. For my current use case I'm trying to do a bit of statistical analysis in the first place and see which tables, what ops, what fields and field values are being used more frequently from the many setup scripts I am trying to analyze. To support that, most probably I will have to go through the tokens and match the INSERT's or UPDATE's referencedColumns with the token's appropriate values. Same for the store procs. 
I thought it would be good to automate it and here your module does a great job supporting the metadata extraction.

Thanks,
Sebastian
sfeher-p commented 3 years ago

Hey, qq on handling queries and keywords. I see in some cases, similar to the store procedure case that for some keywords it will not recognize it as a query. In case it is not recognized would it be possible to create a new top level entry for such queries that could not be parsed and add the tokens there ? This would be very helpful, otherwise I will have to split the tokens' array on ';' and do that separately.

As an example, this parses the first INSERT query but will not detect the select * from test_type inside the for loop, and the previous ';' and 'for' will appear in tokens:

`

declare

    tenantCode varchar2(32) := '12345';

    permCode    test_permission.permission_code%type;
    permDesc    test_permission.description%type;
    orderBy     test_permission.order_by%type := 1001;
    categoryDescription test_permission.category_description%type := 'types';
    categoryName test_permission.category_description%type := 'type Permission';

begin
    insert into test_permission 
                (tenant_code, permission_code, description, category_description, order_by, depth_level,
                 active_flag, created_on, created_by, modified_on, modified_by) 
         values (tenantCode, 'types_'||tenantCode, categoryName, categoryDescription, orderBy, 1,
                 pec_constants.boolYes, sysdate, clientCode, sysdate, clientCode);

for pt in (select * from test_type pt
      where pt.tenant_code = clientCode and pt.active_flag = 'Y'
      order by pt.order_by) loop
     permCode := 'type_'||pt.type_id;
     permDesc := pt.type_description;
     if pt.cx_segment_type_code <> pec_constants.CONSUMER_SEGMENT then
        permDesc := permDesc || ' (Business)';
     end if;
     orderBy := orderBy + 1;

    insert into test_permission 
                (tenant_code, permission_code, description, category_description, order_by, depth_level,
                 active_flag, created_on, created_by, modified_on, modified_by) 
         values (tenantCode, permCode, permDesc, categoryDescription, orderBy, 2,
                 pec_constants.boolYes, sysdate, tenantCode, sysdate, tenantCode);

end loop;
commit;
end;

`

Let me know what you think.

Thanks, Sebastian

mtriff commented 3 years ago

I've implemented parsing for stored procedure calls and control flow statements like FOR. The block you provided is now parsed as one big query (from declare to end) and all the nested statements are available using the subquery attribute on the ParsedQuery object.

I haven't fully tested this and I will need to implement it for other SQL dialects as well. For now, I've released this functionality as an alpha build. You can install it using the following command:

npm install sql-surveyor@alpha

Give it a try and let me know if this has everything you need. 🤞

EDIT: I forgot to add, for stored procedures and function calls, you can use the new routineName attribute on ParsedQuery objects, the parameters are stored in a new parameters attribute. Since packages are PL/SQL-only feature, I haven't parsed them out to their own attribute.