Closed nirmalneupane closed 2 years ago
Hi @nirmalneupane, \' escapes the quote for Python but not for the pattern. So your first example creates the pattern [url:value = 'http://example.com'''''].
You'll need to escape the backslashes as well for python so they appear in the pattern.
>>> x = stix2.Indicator(pattern_type="stix", pattern="[url:value = 'http://example.com\\'\\'\\'\\'']")
>>> print(x.pattern)
[url:value = 'http://example.com\'\'\'\'']
This seems inconsistent and probably creates issues for downstream projects that use this library. Why does (stix2.ObjectPath('url',['value']),'http://example.com\'\'\'\'') not require double escape characters then?
Most of the time, we use the library programmatically and do not add double escape characters manually. Because of this limitation, even string.encode('unicode_escape') doesn't work. Can you suggest a programmatic way to sanitize quotes and other characters that are likely to appear in URL indicators geared towards injection attacks?
Try pasting your code snip into a Python prompt:
>>> "["+str(stix2.EqualityComparisonExpression(stix2.ObjectPath('url',['value']),'http://example.com\'\'\'\''))+"]"
"[url:value = 'http://example.com\\'\\'\\'\\'']"
Note that you get the same string as Chris showed. The reason your second code snip works is because it's creating a string containing a STIX pattern with the correct syntax. Your first code snip doesn't, so it fails. When you create a pattern AST, you are providing the individual pieces of the pattern and leaving it up to the library code to ensure proper escaping:
The AST code is written to produce a correct STIX pattern. When you provide a pattern string yourself, it's your responsibility to format it correctly. Either way, the library attempts to parse the pattern string to ensure the pattern is valid. If parsing fails, you get that error.
All you need to do is escape single quotes and backslashes. The reason your first pattern string is wrong is because you didn't escape single quotes:
>>> print("[url:value = 'http://example.com\'\'\'\'']")
[url:value = 'http://example.com''''']
The embedded single quotes require escaping, as follows:
>>> print("[url:value = 'http://example.com\\'\\'\\'\\'']")
[url:value = 'http://example.com\'\'\'\'']
You could use the AST classes if you wanted to, or just insert the necessary escape characters. The code quoted above does this, for string constants (you couldn't use that on the whole pattern).
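For the "just insert the necessary escape characters" route, here is a minimal sketch in plain Python. The names escape_stix_string and url_pattern are hypothetical helpers written for illustration, not part of stix2. It assumes, as described above, that only backslashes and single quotes need escaping inside a string constant:

```python
def escape_stix_string(value):
    """Escape a raw value for use inside a single-quoted STIX string constant.

    Hypothetical helper for illustration; not a stix2 API.
    """
    # Escape backslashes first, so the backslashes added for the quotes
    # in the second step are not doubled again.
    return value.replace("\\", "\\\\").replace("'", "\\'")


def url_pattern(url):
    # Hypothetical helper: only the string constant is escaped, never the
    # whole pattern, because the delimiting quotes must stay as they are.
    return "[url:value = '" + escape_stix_string(url) + "']"


# The input is the real URL value, with no pattern-level escaping applied.
print(url_pattern("http://example.com''''"))
```

The printed pattern is the same string as the doubled-backslash example above. It can then be passed as the pattern argument to Indicator.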
Here's a variant of your first snip that works by the way:
Indicator(pattern_type="stix", pattern=r"[url:value = 'http://example.com\'\'\'\'']")
Note the use of a raw string, in which backslashes are not interpreted as escape characters.
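To make the raw-string point concrete, the short check below compares three ways of writing the same literal in plain Python. It does not use stix2, and the variable names are only illustrative:

```python
# Backslashes doubled by hand: Python keeps one backslash before each quote.
escaped_by_hand = "[url:value = 'http://example.com\\'\\'\\'\\'']"

# Raw string: backslashes are kept exactly as typed.
raw_literal = r"[url:value = 'http://example.com\'\'\'\'']"

# Ordinary string: Python consumes each \' and leaves a bare quote.
unescaped = "[url:value = 'http://example.com\'\'\'\'']"

print(escaped_by_hand == raw_literal)  # the two working forms are identical
print(unescaped == raw_literal)        # the failing form is a different string
print(unescaped)                       # shows the pattern with no backslashes left
```

The first two forms give the pattern parser the backslashes it needs. In the third, the backslashes are gone before the library ever sees the string.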
I think I get it now. From the implementation, it looks like if I create a StringConstant object from the patterns module and use it to build the pattern passed to the Indicator constructor, the quotes will be escaped by the library and I get the desired result. Example:
url_indicator = r"http://google.com''''''''"
url_indicator = stix2.StringConstant(url_indicator)
str(url_indicator)
"'http://google.com\'\'\'\'\'\'\'\''"
Anyway, a few lines in the documentation about this behavior might benefit people who are just getting introduced to the library.
Example:
Indicator(pattern_type="stix", pattern="[url:value = 'http://example.com\'\'\'\'']")
However, the following workaround works; the example above should have the same effect, IMO:
Indicator(pattern_type="stix", pattern="["+str(stix2.EqualityComparisonExpression(stix2.ObjectPath('url',['value']),'http://example.com\'\'\'\''))+"]")