PropicSignifi / Query.apex

A dynamic SOQL and SOSL query builder on Salesforce.com platform
https://propicsignifi.github.io/query-apex/
MIT License
153 stars 68 forks source link

Directly set Condition.conditionString #67

Closed aranwe closed 3 months ago

aranwe commented 3 years ago

Hi,

First - Query.apex is awesome, however I have now pretty specific usecase:

I need to set the conditionString of Condition directly - i.e. I have configuration object where there is manually entered part of the WHERE Clause (that is then appended with doAnd() call).

I have two possible solutions, I can send PR - just would like to know your throughts 1) just make the Condition constructor public with 1 String param - i.e.:

public Condition(String conditionString) {
    this();
    this.conditionString = conditionString;
}

2) Add methods addConditionString and conditionString to the Query class - i.e.:

public Query addConditionString(String conditionString){
    return addCondition(conditionString(conditionString));
}
public static Condition conditionString(String conditionString){
   return new Condition(conditionString);
}

And in the Condition class

private Condition(String conditionString){
    this();
    this.conditionString = conditionString;
}

What do you think?

HenryRLee commented 3 years ago

Hi @aranwe, thanks for your support. It's very much appreciated.

Both the proposed solutions look good to me from a high-level perspective. (I suppose the second solution is the one you really want, so go for it). However, when it comes to detailed implementation, we need to be careful about SQL injections. Because the condition string may come from user input, and we don't know what the input potentially looks like.

For example, given this query:

String conditionString = '...'; // User input, can be anything

new Query('Account')
.addConditionLike('Name', 'Jack %')
.addConditionString(conditionString)
.run();

The programmer is expecting the result should at least satisfy the first condition, and then satisfy whatever the second condition is. However, if the user chooses this condition string:

String conditionString = ' Id != null) OR (Id != null';

Eventually, the whole query string would look like this:

SELECT id FROM Account WHERE (Name LIKE 'Jack %') AND (Id != null) OR (Id != null)

This is a constant true WHERE clause. So the user just managed to get around the first condition restriction, which is definitely not what the programmer wanted.

We can prevent this from happening is by disallowing users to use closing brackets in the condition string. For example, you may get rid of all closing brackets in the constructor:

public Condition(String conditionString) {
    this();
    this.conditionString = conditionString.replaceAll('\\)', '');
}

However, this may also wipe out some closing brackets inside a string literal, if there is any. For example, if the user gives us Name LIKE '(Room 123)', you may still want to preserve the closing bracket because it's not intended to perform any injection. In the end, your regex in the replaceAll should be more complicated. I'll leave this as an exercise for you.

aranwe commented 3 years ago

Hi @HenryRLee ,

yes, this definitely opens up SQL injection, fyi Query.apex is already filled with these:

Example:

Query query = new Query('Account').
    selectFields('Name').
    addConditionLike('Id = \'null\') OR (Name = \'ABC\') OR (Id =',  //LHS
        'null) OR (Name = \'DEF'). //RHS
    debug();

// debug: SELECT name FROM Account WHERE (Id = 'null') OR (Name = 'ABC') OR (Id = LIKE 'null) OR (Name = 'CDE')

(However I understand that its Query builder and its not intended for human input) - although I think that many would use human input in RHS of Conditions.

Also I think that unless you check every LHS if it is an field of the sObject (or field of parent with dot notation -> recursive check).

However now that I am thinking of that the RHS is fixable easily with:

private static String toString(Object obj) {
        if (obj == null) {
            return 'null';
        } else if (obj instanceof Set<Id>) {
            return toString((Set<Id>) obj);
        } else if (obj instanceof Set<String>) {
            return toString((Set<String>) obj);
        } else if (obj instanceof Set<Integer>) {
            return toString((Set<Integer>) obj);
        } else if (obj instanceof Set<Decimal>) {
            return toString((Set<Decimal>) obj);
        } else if (obj instanceof List<String> ||
                obj instanceof List<Id>) {
            return toString((List<String>) obj);
        } else if (obj instanceof List<Decimal> ||
                obj instanceof List<Integer>) {
            return toString((List<Decimal>) obj);
        } else if (obj instanceof Id ||
                obj instanceof String) {
            return '\'' + String.escapeSingleQuotes(String.valueOf(obj)) + '\''; // modified HERE 
        } else if (obj instanceof Datetime) {
            return JSON.serialize(obj).replaceAll('"', '');
        } else {
            return String.valueOf(obj);
        }
    }

Queries like this should be run at least with WITH SECURITY_ENFORCED anyway.

However my usecase includes pretty complex Conditions (with quoted literals, multiple nested braces) -> the only thing I can think of how to make it secure is to check that every '(' is matched with closing ')' in correct order, discarding all braces inside string literals.

HenryRLee commented 3 years ago

Well, yes, the LHS has a potential of being injected if that comes from a user input. Actually I've never thought of LHS of a condition coming from a user input. I've expecting programmer specifying the field name using string literals. In that case, if the programmer is trying to inject his own code, it is not of my concern. But, yeah, we should at least give out a warning that, if the LHS comes from a user input, it'd pose injection threat. Alternatively, as you suggest, do a recursive check whether the field is valid. We had similar code in the SELECT statement, but that has been deleted (f5b2b7c65ec4479e992584e7178c1425788b4305).

I don't think RHS can be injected. In debug or toQueryString methods, the user string gets single quotes escaped. (Although I can't verify it now. You can try to prove me wrong. This helps us improving our code.) And most of the time, executing the query immediately using methods such as run or fetch, the RHS values would be saved in variables and the query string would use variable binding. For example: WHERE Name LIKE :argv1. That is guaranteed to prevent any injections. Having said that, there is still a method called buildDateLiteral that receives user input without dealing with injections (which isn't hard to fix actually).

In your scenario, are parentheses are really needed in the condition string? Because if it is a static user string, any complex logical expression can be flatten into an expression without parentheses. If the string is dynamically composed, I would still suggest using public methods of the Query.Condition class to do the composing.

If you still need parentheses, well, maybe we can discuss further. Your proposed matching brackets checking looks good to me. It's just the implementation doesn't look easy. (In theory this is not a regular language so it cannot be done using regex.)

HenryRLee commented 3 years ago

Oh, just realized we don't have escapeSingleQuotes in toString methods. That was your patch. Sorry about that. Yeah, that's something we should fix.

aranwe commented 3 years ago

(Although I can't verify it now. You can try to prove me wrong. This helps us improving our code.)

Totally agree. This whole ticket is about that :)

And most of the time, executing the query immediately using methods such as run or fetch, the RHS values would be saved in variables and the query string would use variable binding. For example: WHERE Name LIKE :argv1.

True.

Oh, just realized we don't have escapeSingleQuotes in toString methods. That was your patch. Sorry about that. Yeah, that's something we should fix.

Yeah, sorry, should have displayed it as diff > sending this in a PR :)

aranwe commented 3 years ago

68

aranwe commented 3 years ago

Also re LHSides - I guess there is no need to add extra complexity to the code base and check it via introspection - the query would fail hard anyway - maybe just simply check for [a-zA-Z][a-zA-Z0-9_.]+[a-zA-Z]?

HenryRLee commented 3 years ago

Hi @aranwe. PR looks good. Thanks for your contribution.

I think this regex check for LHS is a good solution. A new PR is welcome. But I think an object/field name can end with a digit? (I remember I've seen this somewhere in a standard object.)

jordanhenderson commented 3 years ago

Hey guys, is this going to be merged any time soon? This patch looks to be perfect for a use case we are working on, without allowing SOQL injection (ie where clauses within flow inputs).

HenryRLee commented 3 years ago

Hi @jordanhenderson, actually this patch has been merged. You may find the pull request in #68.

jordanhenderson commented 3 years ago

Ah, thanks for that! Thought there was some sort of issue as this issue was still open and there was a CI check failure on #68. Also, thanks for the quick response and your support on this awesome library.

aranwe commented 3 months ago

Closing this not to cause further confusion :)