mbleron / ExcelTable

ExcelTable is an Oracle SQL interface to read Microsoft Excel files (.xlsx, .xlsm, .xlsb, .xls, .xml), ODF spreadsheet files (.ods) and flat files as external tables
MIT License
45 stars 12 forks source link

Wrong results when joining a view based on ExcelTable #48

Closed adrianboangiu closed 10 months ago

adrianboangiu commented 1 year ago

Hello, I was using extensively ExcelTable to manipulate some text files as Oracle views. I discover recently that performing a join with such view does not work always as expected. I don’t know if I face a limit of ExcelTable or some bug (in ExcelTable or maybe in Oracle). I work with Oracle 19.3.0 and ExcelTable 5.0 version. The database nls_length_semantics. Hereafter, an example to understand the problem. I have defined the following view based on the file messages.txt placed into a folder named TRAD (the zip file attached contains the definition of all the objects and the files necessary to create the test case) excel_table.zip

CREATE OR REPLACE VIEW V_MESSAGE
AS
   WITH EXCEL_FILE
      AS(
         SELECT t.*
           FROM TABLE(
                   ExcelTable.getRows(
                      p_file      =>(
                                       SELECT ExcelTable.getTextFile(
                                                 p_directory => 'TRAD'
                                               , p_filename  => 'messages.txt' )
                                         FROM DUAL
                                    )
                    , p_cols      => q'{
  "BUNDLE"  VARCHAR2( 100 )
, "IDX"     VARCHAR2( 10 )
, "KEY"     VARCHAR2( 100 )
, "LABEL"   VARCHAR2( 100 )
, "MESSAGE" VARCHAR2( 1000 )
}'
                    , p_skip      => 0
                    , p_line_term => CHR( 10 )
                    , p_field_sep => CHR( 9 )
                   )
                )t
      )
   SELECT BUNDLE
        , IDX
        , KEY
        , LABEL
        , MESSAGE
     FROM EXCEL_FILE e
WITH READ ONLY CONSTRAINT RO_V_MESSAGE;

The resulting view columns are: image

I was a little bit surprised to see that the column sizes are expressed in BYTE. I expected sizes 4 times smaller without BYTE/CHAR or with CHAR. The view has around 15900 rows and is working well. Hereafter, a query on the view and its result (that will help to show afterwards the problem encountered).

select *
  from V_MESSAGE
 where BUNDLE = 'MenuStructureBundle'
 order by KEY asc;

The results are correct.

image

I have problems when I join the view V_MESSAGE with another one, V_MENU_DATA. This view is presenting data from the table REM_MODULE in a hierarchical way. Here are the columns of the V_MENU_DATA view:

image

and those of the table REM_MODULE:

image

The V_MENU_DATA view has around 3600 rows and is working well. Hereafter, a query on the view and its result (that will help to show afterwards the problem encountered).

select *
  from V_MENU_DATA
 order by MENU_NAME asc;

image

And now the query performing the join (and a variant of it) that does not work:

SELECT mdata.MENU_NAME
     , mdata.MENU_TYPE
     , mdata.PARENT_MENU
     , mdata.SYSTEM_MENU
     , mdata.MENU_LEVEL
     , mes.MESSAGE 
  FROM V_MENU_DATA mdata
     , V_MESSAGE mes 
 WHERE mdata.MENU_NAME = mes.KEY
   AND BUNDLE = 'MenuStructureBundle';

Hereafter the result. The results are obviously wrong for the message column (coming from the view based on ExcelTable). 3 times the same value throughout the whole result set!!!

image

I obtain the same result if I write the query a little bit different:
SELECT mdata.MENU_NAME
     , mdata.MENU_TYPE
     , mdata.PARENT_MENU
     , mdata.SYSTEM_MENU
     , mdata.MENU_LEVEL
     , mes.MESSAGE 
  FROM V_MENU_DATA mdata
     , ( select KEY, MESSAGE from V_MESSAGE where BUNDLE = 'MenuStructureBundle' ) mes 
 WHERE mdata.MENU_NAME = mes.KEY;

I observed that the results are correct if I join the V_MESSAGE view with the REM_MODULE table

SELECT mdata.CWMO_CODE
     , mdata.CWMO_TYPE
     , mdata.CWMO_PARENT
     , mes.MESSAGE 
  FROM REM_MODULE mdata
     , V_MESSAGE mes 
 WHERE mdata.CWMO_CODE = mes.KEY
   AND BUNDLE = 'MenuStructureBundle'
ORDER BY 1;

image

Am I doing something wrong ? Can you help me to overcome the problem?

Thank you very much, Adrian Boangiu

mbleron commented 1 year ago

Hi Adrian,

First off, thanks for the very detailed test case, much appreciated.

I was a little bit surprised to see that the column sizes are expressed in BYTE. I expected sizes 4 times smaller without BYTE/CHAR or with CHAR.

That's probably because of your session NLS_LENGTH_SEMANTICS when creating the view. select value from nls_session_parameters where parameter = 'NLS_LENGTH_SEMANTICS';

If it's CHAR, before loading the data dictionary, Oracle first converts the specified length to BYTE unit using the max byte size of the DB charset (most likely AL32UTF8) as factor.

About the main issue,

Am I doing something wrong ? Can you help me to overcome the problem?

I think you're hitting yet another optimizer bug related to ODCI data source. Very often, preventing view merging solves the problem :

SELECT /*+ no_merge(mes) */ 
       mdata.MENU_NAME
     , mdata.MENU_TYPE
     , mdata.PARENT_MENU
     , mdata.SYSTEM_MENU
     , mdata.MENU_LEVEL
     , mes.MESSAGE 
  FROM V_MENU_DATA mdata
     , V_MESSAGE mes 
 WHERE mdata.MENU_NAME = mes.KEY
   AND mes.BUNDLE = 'MenuStructureBundle';
mbleron commented 10 months ago

Closing the issue.