Closed rafaeling closed 3 months ago
Attention: Patch coverage is 75.19026%
with 163 lines
in your changes missing coverage. Please review.
Project coverage is 50.92%. Comparing base (
963a297
) to head (29a968b
).
Files | Patch % | Lines |
---|---|---|
databroker/src/grpc/kuksa_val_v1/val.rs | 71.42% | 82 Missing :warning: |
databroker/src/glob.rs | 78.04% | 81 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
This may be faster, but it breaks a lot in the existing API as well.
before
Test Client> getValue Vehicle.CurrentLocation
[
{
"path": "Vehicle.CurrentLocation.GNSSReceiver.MountingPosition.X"
},
{
"path": "Vehicle.CurrentLocation.Latitude"
},
{
"path": "Vehicle.CurrentLocation.VerticalAccuracy"
},
{
"path": "Vehicle.CurrentLocation.HorizontalAccuracy"
},
{
"path": "Vehicle.CurrentLocation.Heading"
},
{
"path": "Vehicle.CurrentLocation.Timestamp"
},
{
"path": "Vehicle.CurrentLocation.GNSSReceiver.MountingPosition.Z"
},
{
"path": "Vehicle.CurrentLocation.Longitude"
},
{
"path": "Vehicle.CurrentLocation.GNSSReceiver.FixType"
},
{
"path": "Vehicle.CurrentLocation.Altitude"
},
{
"path": "Vehicle.CurrentLocation.GNSSReceiver.MountingPosition.Y"
}
]
Test Client>
After
Test Client> getValue Vehi
EXCEPTION of type 'Exception' occurred with message: 'Wrong databroker version, please use a newer version'
To enable full traceback, run the following command: 'set debug true'
Test Client> getValue Vehi
EXCEPTION of type 'Exception' occurred with message: 'Wrong databroker version, please use a newer version'
To enable full traceback, run the following command: 'set debug true'
Test Client> getValue Vehicle.CurrentLocation
{
"error": {
"code": 404,
"reason": "not_found",
"message": "Path not found"
},
"errors": [
{
"path": "Vehicle.CurrentLocation",
"error": {
"code": 404,
"reason": "not_found",
"message": "Path not found"
}
}
]
}
Test Client>
(I guess breaking AutoComplete is another side effect of this)
So that would mean a major version bump and losing features
If RegExp (I guess the extending/matching with stars) is the performance bottle neck here, would it not be better, to only use RegExp if needed?
As in
if "*" in path do regexp path
else
doSameAsHere() and
if leaf: return stuff
if branch "regexp it"
else "404"
That would just cost a bit of ifs, and I guess the heavy part of the regexps is, that you can't precompile (becasue they come form client). So implementing like above: A client that needs performance ONLY subscribes to Lead nodes and is fine,. Everybody else can knick themselves out
This was a hot fix to see if we could get some performance improvements.
We are still considering doing more research to keep the wildcard combined with good performance without these PR changes.
If the research results are not so promising, there is an additional idea of a new algorithm to partially support the wildcard only for the .*
and .**
cases while keeping this PR O(1) algorithm.
I will change it to Draft as it should not be merged yet.
One more thing regarding auto-completion, yes, it is good to have it and it has been considered to implement an extra service method for Metadata that will support wildcards, which means that the user would not need wildcard support for Get as all signals could be stored locally by the client once in advance.
Yes, in terms of an updated/new API I am sure there may be better ways to split/do this
However, for the current API, we should not merge something into main, that changes behavior significantly (I mean this doesn"t reinterpret some unclarities, it would actually remove features contradict what is documented)
"Engineering" wise,I am not sure, but have you looked into whether "globbing" libraries are faster? Like i.e. https://github.com/devongovett/glob-match , I however lack the background to know if this is "fundamentally" faster then the regexp construction of state amchie and machting. And it also wil be a bandaid, because non-wildcard hashmap lookup shall be O(1) (with respect of number of signals, the hashing obviously would be O(l) with l being length of the path), and I doubt the most clever glob matching when backed by a "normal" map/list can ever be faster than O(n)..... But surprise me, if you can deliver O(1) we will probably take it :D
so tldr: I still think now and also in a future APIs ome kind of matching useful, but should be implemented in way that it is only "used when needed"
Off topic here, but since you mentioned it: I guess even when "discoverability" is in a metadata API a user might still want to subscribe to "a branch" (which can be seen as a "wildcard" sort of), otherwise something like the recorder from https://github.com/eclipse-kuksa/kuksa-csv-provider gets really hard, consider you want to generate a trace of everzthing running throug databroker (or a subbranch) but are not able to say "subscribe vehicle", but instead need to list/do 1000 individual paths
Trying out the library you posted, will test it and check the performance results.
Wildcard now uses glob-matching library https://github.com/devongovett/glob-match Can be reviewed
I think, that sutff like
**.*.*.*.Position
is not supported anymore is fine, does not seem very real world, however is something preventing us to keep this
Vehicle.Cabin.Sunroof
working as before? I have not checked the implementation, but might something like "if isBranch ad .**" make sense somewhere?
Rationale is: the first case I have never used, and find it hard to believe anyone did, the second one I definitely use often at least via CLI
I think, that sutff like
**.*.*.*.Position
is not supported anymore is fine, does not seem very real world, however is something preventing us to keep this
Vehicle.Cabin.Sunroof
working as before? I have not checked the implementation, but might something like "if isBranch ad .**" make sense somewhere?Rationale is: the first case I have never used, and find it hard to believe anyone did, the second one I definitely use often at least via CLI
Vehicle.Cabin.Sunroof
is not supported anymore, instead Vehicle.Cabin.Sunroof.**
or Vehicle.Cabin.Sunroof.*
should be use.
As far as I understand, there is no way in Databroker to determine if a path is a branch or not; Databroker can only internally query whether a path is contained in the database or not.
Previously, it was supported because Regex does not have as many limitations as the glob matching library approach.
I think, that sutff like
**.*.*.*.Position
is not supported anymore is fine, does not seem very real world, however is something preventing us to keep thisVehicle.Cabin.Sunroof
working as before? I have not checked the implementation, but might something like "if isBranch ad .**" make sense somewhere? Rationale is: the first case I have never used, and find it hard to believe anyone did, the second one I definitely use often at least via CLI
Vehicle.Cabin.Sunroof
is not supported anymore, insteadVehicle.Cabin.Sunroof.**
orVehicle.Cabin.Sunroof.*
should be use. As far as I understand, there is no way in Databroker to determine if a path is a branch or not; Databroker can only internally query whether a path is contained in the database or not. Previously, it was supported because Regex does not have as many limitations as the glob matching library approach.
Ah, I remember, because "branches" are not really a thing in our internal data structure. In terms of keeping API compatibility, would something be feasible like
inputstr=`Vehicle.Cabin.Sunroof`
res = glob_the_shit_out_of_this(inputStr)
if res=no match
res = glob_the_shit_out_of_this(inputStr+".**")
It should not really cost anything , when querying direct leaves.
I understand, for any kind of new API we would always use the "best" approch, but this is slightly modifying existing API behavior, so I would prefer to keep behavior changes to a minimum
I think, that sutff like
**.*.*.*.Position
is not supported anymore is fine, does not seem very real world, however is something preventing us to keep thisVehicle.Cabin.Sunroof
working as before? I have not checked the implementation, but might something like "if isBranch ad .**" make sense somewhere? Rationale is: the first case I have never used, and find it hard to believe anyone did, the second one I definitely use often at least via CLI
Vehicle.Cabin.Sunroof
is not supported anymore, insteadVehicle.Cabin.Sunroof.**
orVehicle.Cabin.Sunroof.*
should be use. As far as I understand, there is no way in Databroker to determine if a path is a branch or not; Databroker can only internally query whether a path is contained in the database or not. Previously, it was supported because Regex does not have as many limitations as the glob matching library approach.Ah, I remember, because "branches" are not really a thing in our internal data structure. In terms of keeping API compatibility, would something be feasible like
inputstr=`Vehicle.Cabin.Sunroof` res = glob_the_shit_out_of_this(inputStr) if res=no match res = glob_the_shit_out_of_this(inputStr+".**")
It should not really cost anything , when querying direct leaves.
I understand, for any kind of new API we would always use the "best" approch, but this is slightly modifying existing API behavior, so I would prefer to keep behavior changes to a minimum
Implemented your suggestions. Now it is also compatible for branches use cases like Vehicle.Cabin.Sunroof
Please test it, meanwhile I will try fix the format errors as soon as possible.
@rafaeling sounds good
It would be good, if the conflicts are resolved, because that seems to prevent CI runs, so currently can not lazily test by just pulling the docker image :D
Built manually and checked behavior. Looking good to me!
This change means a 8 times faster improvement for the Get method, which drops to 7.8 ms for 5000 requests compared to the main branch implementation, which is 60 ms.
See the wildcard exceptions that have been made obsolete by this change: https://github.com/eclipse-kuksa/kuksa-databroker/blob/9ec27697af5d7594245e5c7cd5ebc2eb687abb80/doc/wildcard_matching.md