microsoft / azuredatastudio

Azure Data Studio is a data management and development tool with connectivity to popular cloud and on-premises databases. Azure Data Studio supports Windows, macOS, and Linux, with immediate capability to connect to Azure SQL and SQL Server. Browse the extension library for more database support options including MySQL, PostgreSQL, and MongoDB.
https://learn.microsoft.com/sql/azure-data-studio
MIT License
7.5k stars 882 forks source link

OpenRowset + View (Data Virtualization) #22891

Open albertoRamon opened 1 year ago

albertoRamon commented 1 year ago

Steps to Reproduce: These sentences are 100% valid on Azure Database / SSMS (and return data perfectly), but on Az Data Studio > SQL Project >build return an error:

-- No name of columns
CREATE  VIEW [hubspot].[V_ticket01]
    AS
    SELECT *
        FROM OPENROWSET(
            BULK                        'ticket/ticket01.csv'
            , DATA_SOURCE                = 'DataSource'
            , FORMATFILE                = 'ticket/ticket01.fmt'
            , FORMATFILE_DATA_SOURCE    = 'DataSource'
            , MAXERRORS                    = 100
            , FIELDQUOTE                = '"'
            , FORMAT                    = 'CSV'
        ) AS x;

CREATE  PROCEDURE [hubspot].[SP_ticket04] 
AS
    select Id, P_dealname from [hubspot].[V_ticket04]
GO

The error is: _Build error SQL71501: SqlProcedure: [hubspot].[SP_ticket01] has an unresolved reference to object [hubspot].[V_ticket01].[Id]. Build error SQL71501: SqlProcedure: [hubspot].[SP_ticket01] has an unresolved reference to object [hubspot].[V_ticket01].[Pdealname]

The same with these Attempts

-- Name of columns in View
CREATE  VIEW [hubspot].[V_ticket02] (Id, P_dealname)
    AS
    SELECT *
        FROM OPENROWSET(
            BULK                        'ticket/ticket01.csv'
            , DATA_SOURCE                = 'DataSource'
            , FORMATFILE                = 'ticket/ticket01.fmt'
            , FORMATFILE_DATA_SOURCE    = 'DataSource'
            , MAXERRORS                    = 100
            , FIELDQUOTE                = '"'
            , FORMAT                    = 'CSV'
        ) AS x;

-- Name of columns in Select
CREATE   VIEW [hubspot].[V_ticket03]
    AS
    SELECT Id, P_dealname
        FROM OPENROWSET(
            BULK                        'ticket/ticket01.csv'
            , DATA_SOURCE                = 'DataSource'
            , FORMATFILE                = 'ticket/ticket01.fmt'
            , FORMATFILE_DATA_SOURCE    = 'DataSource'
            , MAXERRORS                    = 100
            , FIELDQUOTE                = '"'
            , FORMAT                    = 'CSV'
        ) AS x;

-- Name of columns in Select & View
CREATE   VIEW [hubspot].[V_ticket04] (Id, P_dealname)
    AS
    SELECT Id, P_dealname
        FROM OPENROWSET(
            BULK                        'ticket/ticket01.csv'
            , DATA_SOURCE                = 'DataSource'
            , FORMATFILE                = 'ticket/ticket01.fmt'
            , FORMATFILE_DATA_SOURCE    = 'DataSource'
            , MAXERRORS                    = 100
            , FIELDQUOTE                = '"'
            , FORMAT                    = 'CSV'
        ) AS x;

PD: I can provide a testing CSV and FMT files, to check that on Azure Database / SQL works fine

Does this issue occur when all extensions are disabled?: Yes

Charles-Gagnon commented 1 year ago

@albertoRamon Is this for a SQL Project? I see references to building something, so just want to make sure. If not could you provide the exact steps for what you're trying to do?

albertoRamon commented 1 year ago

@albertoRamon Is this for a SQL Project? I see references to building something, so just want to make sure. If not could you provide the exact steps for what you're trying to do?

yes @Charles-Gagnon , Fixed

kisantia commented 1 year ago

@albertoRamon can you please check if this also fails in SSDT?

albertoRamon commented 1 year ago

@albertoRamon can you please check if this also fails in SSDT?

@kisantia I dont have Visual Studio, only Az Data Studio

kisantia commented 1 year ago

ok if you have a database with the schema of this project, can you try extracting a .dacpac using the dacpac extension in ADS and see if that's successful or if there are errors? This will help determine if the problem is specific to sql projects or the backend engine DacFx.

albertoRamon commented 1 year ago

ok if you have a database with the schema of this project, can you try extracting a .dacpac using the dacpac extension in ADS and see if that's successful or if there are errors? This will help determine if the problem is specific to sql projects or the backend engine DacFx.

@kisantia No problems: image

kisantia commented 1 year ago

Great, glad that's working! Can you please try one more thing? Try Create project from database to see if that succeeds and if that project builds.

Also, if you can share steps on how to setup a database and project that repros this, I can investigate this next week. My email is kisantia@microsoft.com.

albertoRamon commented 1 year ago

In SQL the sentences works perfectly, return rows: image image

My Guess: On SQL when run the Create View read and process the mft file (file format) (I'm sure because if you have some typo in the mft, the create view will fail)

Compile the View in Az Data Studio doesn't process the fmt file, thus don't know the name of the columns

-- ERROR
CREATE  PROCEDURE [hubspot].[SP_ticket04] 
AS
    select Id, P_dealname from [hubspot].[V_ticket04]
GO

-- OK
CREATE  PROCEDURE [hubspot].[SP_ticket04] 
AS
    select * from [hubspot].[V_ticket04]
GO

I can "help" to Az Data studio on create view with V_ticket02/3/4 but need accept my Help :P (with Merge command I found similar Issue and if you put the list of columns even in SQL / SSMS is not necessary the Database project compile perfect)

kisantia commented 1 year ago

@albertoRamon if you can please share steps on how to setup a database and project that repros this, I can investigate this week. My email is kisantia@microsoft.com.

As a workaround in the meantime, you can try putting the script of the stored proc in a post deployment script, since those aren't included in the model validation.

albertoRamon commented 1 year ago

@kisantia Can you check your email? I already sent (I think)

kisantia commented 1 year ago

Found the email in my junk folder! I looked this and believe the current behavior is by design. If the columns are specified in the view, then the project builds without any errors.

SQL Server supports deferred name resolution, which means that stored procedures can be created without verifying that the referenced objects/columns exist and this is why the script is valid tsql when run on the database. DacFx, the engine that builds sql projects, runs validation on the database model to ensure that the model is valid and that all the referenced objects exist. In this case, it does not support discovering the columns in the view from the files in blob storage (or wherever they are located) and then the build fails because those columns are not explicitly defined in the view.

albertoRamon commented 1 year ago

Hello @kisantia Have sense your explanation about lazy evaluation Some idea how to deal with it in Az Data studio?

Im not able to build the project without errors (defining the view & and using it from SP)

image

image select * from View works OK select col1 from View return an error on Az Data studio > database > project Build

kisantia commented 1 year ago

Hey @albertoRamon, I'm not too familiar with the syntax for OPENROWSET, so not sure how much help I can be. I tried a few different combinations to try to get the project to build and this built it successfully, just with a couple warnings that can be ignored (note: I don't know if this actually works as expected, just that it builds without errors and creates the dacpac).

CREATE  VIEW [hubspot].[V_ticket01] 
    AS
    SELECT Id, P_dealname
        FROM OPENROWSET(
            BULK                    'ticket/ticket01.csv'
            , FORMATFILE                = 'ticket/ticket01.fmt'
            , MAXERRORS             = 100
            , FIELDQUOTE                = '"'
            , FORMAT                = 'CSV'
        ) WITH (Id VARCHAR(10), P_dealname VARCHAR(100)) AS x;
GO

CREATE PROCEDURE [hubspot].[SP_ticket01] 
AS
    select x.Id, x.P_dealname from [hubspot].[V_ticket01]
GO
albertoRamon commented 1 year ago

Hello @kisantia unfortunately the option With (col1 type1, col2) is only available for Synapse Serverless
This option is not in OpenRowset "normal", thus, I is not available for : SQL Server (0n-site) / Az Database / Az SQL MI

On Az SQL Server: Error image

(I dont have clear why is only in Az Synapse ServerLess)

albertoRamon commented 12 months ago

If you define the view as bellow will compile on Az Data Studio:

create view  XX (col1)
AS
   Select col1  from empty_table
   UNON ALL
   Select col1 from OPENROWSET

The columns data type of empty_table must match with FMT file of OPENROWSET