jehugaleahsa / SQLGeneration

Provides core classes for generating SQL at runtime.
The Unlicense
22 stars 10 forks source link

Problems using Parameters in a SQL string being parsed.. #8

Closed dazinator closed 10 years ago

dazinator commented 10 years ago

I am attempting to identify the presence of parameters in the SQL string and have hit some problems!

If you start with this SQL:-

Select contactid, firstname, lastname From contact Where firstname = @param1
CommandBuilder commandBuilder = new CommandBuilder();
SelectBuilder builder = (SelectBuilder)commandBuilder.GetCommand(commandText);

The builder will now contain an EqualToFilter. The left and right side of the filter are of type "Column" (As an aside, shouldn't the right side really be a Placeholder?)

So, in order to detect the presence of a parameter in this query, I am checking the Column name to see if it startes with an "@" - if it does I know its not really a column, its actually a parameter. This isn't the best, but it works so i am happy.

...until I hit the following scenario:-

Select contactid, firstname, lastname From contact Where firstname Like @param1
CommandBuilder commandBuilder = new CommandBuilder();
SelectBuilder builder = (SelectBuilder)commandBuilder.GetCommand(commandText);

The LikeFilter expects the right side to be a Literal, so in this case this SQL won't actually parse (The LikeFilter is lost).

Therefore I am not really sure how to proceed in this scenario..

I considered the idea that of changing the SQL so that the paramater was a string literal so that it woul d parse:-

Select contactid, firstname, lastname From contact Where firstname LIKE '@param1'

But I don't like this for 2 reasons:-

  1. Its inconsistent. In all other cases, the parameter would be written as @param1 not '@param1'. This feels like fudging it for the case of the LikeFilter.
  2. Could get false positives checking to see if the string literal was a parameter. For example, imagine the column was a username column and people had usernames like @Joey - in that case a legitmate SQL string might contain a where clause such as WHERE username LIKE '@Joey%' . However I would interpret '@Joey%' as a parameter in that circumstance because it starts with an @ when it isn't - it's a sting literal.

I am sooo close now in terms of having a fully functional ado.net provider for dynamics crm.. I would be really greatful for any suggestions you can offer regarding this issue.

Cheers.

jehugaleahsa commented 10 years ago

You actually discovered the fundamental flaw I ran into with this library. I couldn't tell whether a name represented a column, an alias (AS), a parameter or a placeholder. For T-SQL, @ obviously means parameter, but that's not true for other SQL dialects. Without some knowledge of the database schema I can only guess that a name represents a column. Perhaps placeholder would make a better catch-all, but that has its own problems.

As for LIKE, I think I assumed the right hand side would always be a literal. Come to think of it I have passed parameters to like clauses in the past. The MSDN documentation says one thing but my experience says otherwise. A change would be needed to handle either/or. On Mar 16, 2014 2:34 PM, "Darrell" notifications@github.com wrote:

I am attempting to identify the presence of parameters in the SQL string and have hit some problems!

If you start with this SQL:-

Select contactid, firstname, lastname From contact Where firstname = @param1

CommandBuilder commandBuilder = new CommandBuilder();SelectBuilder builder = (SelectBuilder)commandBuilder.GetCommand(commandText);

The builder will now contain an EqualToFilter. The left and right side of the filter are of type "Column" (As an aside, shouldn't the right side really be a Placeholder?)

So, in order to detect the presence of a parameter in this query, I am checking the Column name to see if it startes with an "@" - if it does I know its not really a column, its actually a parameter. This isn't the best, but it works so i am happy.

...until I hit the following scenario:-

Select contactid, firstname, lastname From contact Where firstname Like @param1

CommandBuilder commandBuilder = new CommandBuilder();SelectBuilder builder = (SelectBuilder)commandBuilder.GetCommand(commandText);

The LikeFilter expects the right side to be a Literal, so in this case this SQL won't actually parse (The LikeFilter is lost).

Therefore I am not really sure how to proceed in this scenario..

I know that I could possibly change the SQL to:-

Select contactid, firstname, lastname From contact Where firstname LIKE '@param1'

But then that would mean inconsistent use of parameters as sometimes it would be used in single quotes, and other times not.

I would be really greatful for any suggestions regarding this :)

Reply to this email directly or view it on GitHubhttps://github.com/jehugaleahsa/SQLGeneration/issues/8 .

dazinator commented 10 years ago

I have looked into this a little bit more,

It seems the most common parameter placeholders tend to be:

Would it be viable to have the parser take this as some sort of option prior to parsing. It would then be able to interpret paramaters in the string correctly to detect placeholders.

For example:-

CommandBuilder commandBuilder = new CommandBuilder();
commandBuilder.PlaceholderMatch = "@";
SelectBuilder builder = (SelectBuilder)commandBuilder.GetCommand(commandText);
jehugaleahsa commented 10 years ago

I had the same thought. On Mar 18, 2014 11:10 AM, "Darrell" notifications@github.com wrote:

I have looked into this a little bit more,

It seems the most common parameter placeholders tend to be:

  • ms sql server = @
  • oracle = :
  • oledb, mysql = ?

Would it be viable to have the parser take this as some sort of option prior to parsing. It would then be able to interpret paramaters in the string correctly to detect placeholders.

For example:-

CommandBuilder commandBuilder = new CommandBuilder();commandBuilder.PlaceholderMatch = "@";SelectBuilder builder = (SelectBuilder)commandBuilder.GetCommand(commandText);

Reply to this email directly or view it on GitHubhttps://github.com/jehugaleahsa/SQLGeneration/issues/8#issuecomment-37944070 .

jehugaleahsa commented 10 years ago

I am about to commit some new changes that will hopefully resolve this issue. Below is a test that is passing:

    /// <summary>
    /// This sees whether we can reproduce a select with a LIKE filter
    /// that sees if a column is like a parameter.
    /// </summary>
    [TestMethod]
    public void TestSelect_LikeFilter_CompareToParameter()
    {
        string commandText = "SELECT Column FROM Table WHERE Column1 LIKE @Parameter";
        CommandBuilderOptions options = new CommandBuilderOptions()
        {
            PlaceholderPrefix = "@"
        };
        assertCanReproduce(commandText, options);
    }

To get this to work, I had to change LIKE to accept anything on the right-hand side. It occurred to me that the right-hand side could be a parameter, a function call, or just string literals being concatenated together. With so many possibilities, I just opened it up to be anything. My philosophy in this project has been: It is better to allow invalid SQL than it is to be prevent valid SQL.

That still only solved half of our problem, though. The code was stupidly (stubbornly) treating the parameter as a weirdly-named column. As you can see, I provide a new CommandBuilderOptions class with a single property PlaceholderPrefix, which you can set to your parameter symbol (@). Now if the parser encounters an identifier it does these tests:

  1. If the identifier includes multiple parts, it is a column (e.g., MySchema.MyTable.MyColumn).
  2. If the identifier starts with the placeholder prefix, it is a placeholder.
  3. If the query only has one source (i.e., table) treat the identifier as a column of that source.
  4. Otherwise, just make it a placeholder to make sure the identifier isn't altered later on.

If I knew the schema of the sources (i.e., tables), I could expand upon step 3 by looking to see if the identifier appeared in any of the sources. If it appeared in more than one source, I'd throw an ambiguity error. If it only appeared in one, I would choose that source. Otherwise, I fall back on placeholders because I have no clue what that identifier represents. Things get a little hairy when you start thinking about sub-selects, aliases and other nasty SQL features. Defining a schema for each source would be difficult anyway, since a source could be a table-valued function or something equally odd.

The benefit to falling back on placeholders is that whoever is using the library can just force using a Column by prefixing it with the table name in the SQL. I think most of the time, developers using the library won't care whether they are working with a Column or a Placeholder. I'd expect the normal usage to be to add columns or filters to a base SQL.

dazinator commented 10 years ago

This is great, this has basically now resolved all my unit tests regarding parameters so I am a happy chappy. Cheers.