Open Gimly opened 6 years ago
It's interesting that you should raise this as the TSI team yesterday sent me details of a few new query features. There are also now startsWith, endsWith and a matchesRegex. Since these are all changes to Filter.cs it makes sense to implement them all at once. I'll have a look.
I'd be happy to help.
It seems that they haven't updated their documentation yet, but those new features will be really useful to me.
It could also be interesting to add a way to set a string or JSON directly as the predicate, so that if they add new features we can use them right away without necessarily having to wait for a new version of the library that supports them.
Awesome, help would be great. Using a string was originally a feature and I seem to have accidentally removed it; I need more unit tests! I'll add that back as a starter, it's just copy and paste.
The expression based Where (i.e. x => x.DataType == "dataType") converts to a predicate string in the code. Do you know how to use "phrase" as a predicate string? I can't seem to get it working, it looks like it should be "[DataType] HAS 'dataType'" but that's not working for me. Perhaps I need more coffee.
I've added string based Where clauses back in here: e5fa129 (and on NuGet as v1.0.90). You can do .Where("[DataType] = 'dataType'")
On investigation, I believe the "phrase" filters apply across all fields. The documentation uses the term phrase here:
'hello world' | true for events containing the phrase 'hello world'
This doesn't really fit with the expression based predicates because they are all boolean expressions. I'm happy to hear suggestions on how it might look if you have any.
Otherwise I think the other filters should be OK to implement:
Where(x => new[] { "Hello", "World"}.Contains(x.DataType))
could parse to:
[DataType] IN ('Hello','World')
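As a rough illustration of the mapping above, here is a minimal sketch of how an expression tree for collection.Contains(x.Property) could be turned into an IN predicate string. It is not Chronological's actual Filter.cs code: the names InPredicateSketch, SampleEvent and ToInPredicate are hypothetical, it assumes the call binds to Enumerable.Contains, it only handles string values, and it does not escape single quotes inside values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class InPredicateSketch
{
    // Hypothetical event type, used only for this illustration.
    public class SampleEvent
    {
        public string DataType { get; set; } = "";
    }

    // Turns `collection.Contains(x.Prop)` into "[Prop] IN ('a','b')".
    public static string ToInPredicate<T>(Expression<Func<T, bool>> filter)
    {
        // Enumerable.Contains is a static extension method, so the collection
        // is argument 0 and the tested property is argument 1.
        if (filter.Body is MethodCallExpression call
            && call.Method.DeclaringType == typeof(Enumerable)
            && call.Method.Name == nameof(Enumerable.Contains)
            && call.Arguments.Count == 2
            && call.Arguments[1] is MemberExpression member)
        {
            // Evaluate the collection sub-expression to obtain its values.
            var asSequence = Expression.Convert(call.Arguments[0], typeof(IEnumerable<string>));
            var values = Expression.Lambda<Func<IEnumerable<string>>>(asSequence).Compile()();

            // Assumption: values contain no single quotes (no escaping is done here).
            var quoted = string.Join(",", values.Select(v => "'" + v + "'"));
            return "[" + member.Member.Name + "] IN (" + quoted + ")";
        }

        throw new NotSupportedException("Only collection.Contains(x.Property) is handled in this sketch.");
    }
}
```

Called with x => new[] { "Hello", "World" }.Contains(x.DataType) on SampleEvent, this is intended to produce the predicate string shown above. Evaluating the collection by compiling it keeps the sketch short and also covers captured variables, at the cost of a small compile step per query.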
I need to look at the startsWith, endsWith and matchesRegex. I assume that if Regex is available then we could implement:
Where(x => x.DataType.Contains("Hel"))
Tests added for IN, startsWith and endsWith here: 33e588179374d1a40c4aabd57b160a9a8e660a32
I will look at implementing these in the next few days. I haven't been brave enough to look at regex yet.
I think it'll be needed to create a specific method (extension method?) for the regex option.
To get something like Where(x => x.RegexMatches("^hello[wW]orld$"));
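One possible shape for such an extension method is a "marker" method that is never executed and is only recognised while walking the expression tree. The sketch below is an assumption about how that could look, not the library's implementation: RegexFilterSketch, SampleEvent and ToPredicate are hypothetical names, the method is applied to a string property rather than to the event itself, and the pattern is passed through without any RE2 validation or quote escaping.

```csharp
using System;
using System.Linq.Expressions;

public static class RegexFilterSketch
{
    // Hypothetical event type, used only for this illustration.
    public class SampleEvent
    {
        public string DataType { get; set; } = "";
    }

    // Marker method: it only exists so it can appear inside an expression tree.
    public static bool RegexMatches(this string value, string pattern) =>
        throw new InvalidOperationException("RegexMatches is only valid inside a Where expression.");

    // Translates `x => x.Prop.RegexMatches("pattern")` into a matchesRegex predicate.
    public static string ToPredicate<T>(Expression<Func<T, bool>> filter)
    {
        // For an extension method call, the receiver is argument 0
        // and the pattern is argument 1.
        if (filter.Body is MethodCallExpression call
            && call.Method.DeclaringType == typeof(RegexFilterSketch)
            && call.Method.Name == nameof(RegexMatches)
            && call.Arguments[0] is MemberExpression member
            && call.Arguments[1] is ConstantExpression { Value: string pattern })
        {
            // Assumption: the pattern contains no single quotes.
            return $"matchesRegex([{member.Member.Name}], '{pattern}')";
        }

        throw new NotSupportedException("Only x.Property.RegexMatches(\"literal\") is handled in this sketch.");
    }
}
```

A call such as ToPredicate<SampleEvent>(x => x.DataType.RegexMatches("^hello[wW]orld$")) never runs RegexMatches itself; the method body only throws if someone calls it outside a query.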
So, I made your unit tests pass 😄. It's interesting to play with those LINQ expression trees, very powerful. Creating the pull request now.
I've merged your implementation in. I think there are a few more things we need to look at though, as string.Contains throws a null reference exception. I fiddled with the regex though, and Contains could work as:
(matchesRegex([data.type], '^*ello*'))
I've added a test in for this which fails at the moment.
@colethecoder I've created a PR (#4) that implements the unit test as it was, but are you sure about the regex? Shouldn't it be:
(matchesRegex([data.type], '^.*ello.*'))
With just the * it's not a valid regex. * means "repeat zero or more", so you have to say "what" you want to repeat. In our case that is the dot (.), which matches any character.
I briefly tested that regex against TSI (using the predicate string version of the Where clause) and it seemed to work, but I could be wrong, have you tested the results you get from TSI if you do:
(matchesRegex([data.type], '^.*ello.*'))
Well, just tried and it works for both. I'm really confused 😕. I've never seen a regex with * without something before it. They seem to use RE2 but I don't see any difference in that part.
Anyway I've updated my PR to add the . before the *, it looks a bit more standard to me.
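For reference, a minimal sketch of how a Contains fragment could be wrapped into the '^.*ello.*' form discussed above. ContainsRegexSketch and ContainsAsMatchesRegex are hypothetical names, not part of Chronological. Regex.Escape follows .NET escaping rules, and treating its output as valid RE2 is an assumption that would need checking against TSI.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContainsRegexSketch
{
    // Builds "(matchesRegex([property], '^.*fragment.*'))" for a Contains-style filter.
    public static string ContainsAsMatchesRegex(string property, string fragment)
    {
        // Escape regex metacharacters so the fragment is matched literally.
        // Assumption: .NET escaping is acceptable to the RE2 engine used by TSI.
        var escaped = Regex.Escape(fragment);

        // Assumption: the fragment contains no single quotes (no quote escaping here).
        return $"(matchesRegex([{property}], '^.*{escaped}.*'))";
    }
}
```

Escaping matters as soon as the fragment contains characters such as . or *, which would otherwise change the meaning of the pattern.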
OK cool, changing that test makes sense to me. I'll scan through it and merge into the branch tonight. I need to think about how Chronological should handle regex in general, given that it's RE2 and not the standard .NET regex.
As described here, Time Series Insights supports in and phrase when comparing strings. I haven't seen anything to use those comparisons in Chronological; did I miss it or is it not yet supported?
If it's not yet supported, how would that work? Maybe LINQ's Contains could be used? If it's called on a string, it's a phrase; if it's called on a list, it's an in. What do you think?