Open elfenpiff opened 2 years ago
@budrus @FerdinandSpitzschnueffler @MatthiasKillat @elBoberido @mossmaurice @dkroenke What do you think?
If you have additions to the feature descriptions please add them directly in the description.
Maybe a good idea. We have today different mappings for rmw_iceoryx, wirhin Cyclone DDS and some kind of ara::com best practice. The constraint would be that this is all possible again and then we have to adapt these
Good idea. I think especially for the partial search, the prefix tree @MatthiasKillat is working on will be very efficient.
The question is: when do we want to introduce this change? If we want it for the next release, parts of the refactoring of ServiceDiscovery
and ServiceRegistry
that is currently ongoing will be obsolete soon.
We could also encode the data type into this, e.g. "/my/service/data_type/version"
-> "/my/service/int64_t/1"
. Not quite sure if it should be in this string or if there should be a second string for the data type description e.g.
struct ServiceDescription {
string name;
string typeId;
};
And it would be used like ServiceDescription("/my/service", "/int64_t/1")
or ServiceDescription("/my/service", "")
. In the latter example we could print a warning that no type check can be performed.
@elBoberido I would really love to keep everything in one string to make it as simple as possible. Otherwise we have to write a little parser to translate all the types and the question is where are they stored and how to perform a type check? How do we get from a user defined type to the string of the user defined type? What if there are multiple user defined types with the same string?
I would also would love to have this as soon as possible - at best for 2.0. But then we make us unnecessary pressure - something I hate. But what about a compromise?
m_service
This would leave the old API intact, we do not have to change anything but would gain access to the new feature. If everything plays out nice and fast we could remove the old API and if not we leave the old API in 2.0 and remove it later. But all new features will only work with the single string service description so when we remove the old API we do not have to refactor much.
And maybe the findService feature is much easier to implement with a single string service instead of having 3 ones? @mossmaurice @FerdinandSpitzschnueffler @MatthiasKillat
I have not thought much of it, but a few points.
/
or similar. It makes searching even more of a hassle (details below).Addendum: if ServiceDescription is basically only the type information multiple ServiceIndentifiers could have the same type information. If it contains the identifier as well (composition), it is unique.
@elfenpiff well, it depends what you want to communicate with that string. If it is only one string, you have to encode the service semantic type and the service data type into one string. In DDS/ROS for example this is not done and there is a topic name and a separate type name, e.g. ros_discovery_info
and rmw_dds_common::msg::dds_::ParticipantEntitiesInfo_
. MQTT on the other hand somewhat encodes this into one string, e.g. hypnotoad/1984/audio/ogg
and hypnotoad/1984/audio/mp3
.
The check would be a strict comparison of the strings and it is the responsibility of the user to add sane strings or just use an empty string and skip this check. For safety, we would require to have this and a tool could generate this types from an IDL. We could even have a template toTypeName
which returns an empty string in it's default implementation and can be specialized by the user. We could also automatically derive this if the type has some constexpr defined, e.g. VERSION
and TYPE_NAME
. After writing all this stuff, I guess typeId could automatically be inferred for the typed API and we could have a cxx::variant<None, AutomaticTypeDeduction, uint64_t, cxx::string>
as second parameter for the ServiceDescription
with a default value of None
.
cc @budrus @MatthiasKillat @FerdinandSpitzschnueffler @mossmaurice @dkroenke
As for findService
with one string. Depending on what should be searched for (I am considering the Autosar use case), this is not a good idea because it requires searching for the separator (e.g. /
) in general if we e.g. want to find all instances or all events of a specific instance etc. This degenerates into almost brute force searches even with a prefix tree (a map would do worse here).
While prefix searches are fast and easy though (e.g. find all instances of a specific service with the proposed encoding), general substring searches are not. Those require suffix
trees or other elaborate structures which require quite a lot of storage to allow the fast lookup. Brute force is not an option for this, even for a small number of entries (say 100-300 events) I would argue.
In my opinion this will not help us in the short-term and just create more problems to be solved quickly, more extensive refactoring etc. It can be done, but I think we should not change it right now before 2.0.
@elBoberido You are right that comparison for equality is easy with one string (necessarily linear in the encoding size), it is too restrictive for the user I think. This is why I would like the split of the ServiceIdentifier
and ServiceDescription
in the long term (but not now for 2.0). Of course the user is free to encode anything she wants in the string (up to the available size), but I would not consider this the intended way to use the string for the user.
And if you start with partial searches in such a "can contain everything" string it just gets out of hand.
Note that with the 3 existing strings the user of iceoryx already has much of these possibilities, she can just leave unused identifiers empty. I just do not see the benefit without considering the usage implications more carefully.
As for equality comparison: it makes little difference whether we check 3 small strings vs. 1 larger string of approximately the combined size or similar.
For inexact searches it gets more problematic I would say, and that is the major downside. Prefix searches are in principle alright, but I would not commit to this until the prefix tree is somewhat proven to work properly. I also still have some issues with the memory consumption of storing this search information efficiently.
A vector of strings would be feasible as a fallback if we do not want to perform inexact searches up to say 100-500 elements (maybe, just a guess).
@MatthiasKillat You are right that for some search scenarios it will be faster if we have three separate strings but for memory consumption it might be beneficial to have only one. Just a wild guess, I have no data to back this up. Partial searches can also be optimized if we restrict the search tokens, e.g.
/
and then match the MyService
/
and the match MyInstance
So something like pre-filtering with a reduced filter set followed by a more sophisticated search or a two pass search.
We could have an API like thisuint64_t findService(uint64_t advanceToSeparatorWithNumber, string searchTerm, std::function(void(string))& f);
uint64_t findService(string startsWith, string searchTerm, std::function(void(string))& f);
Someone using iceoryx for an ara::com implementation could then wrap it into something like this
constexpr uint64_t SERVICE {1};
constexpr uint64_t INSTANCE {2};
constexpr uint64_t EVENT {3};
std::vector araComWrapper(whatever) {
std::vector<> ret;
findService(SERVICE, "Radar", [&](auto&){ ... ret.push_back(...); ...}) // finds services "Radar" and "RadarFoo";
findService(SERVICE, "Radar/", ...) // finds only service "Radar" but not "RadarFoo";
or
findService(INSTANCE, "Right/", ...) // finds instance "Right" of services "Radar" and "Camera"
or
findService("/Radar/Right/", "*", ...) // finds all events for service "Radar" and instance "Right"
}
With this restrictions on the pre-filtering, it could still be done quite efficiently.
Oh, as it looks like, zenoh also has only one string for this but without encoded information for the data type. Therefore I would lean towards having one string as ServiceIdentifier
and an optional second one for the data type, which could automatically be deduced by some template magic.
cc @budrus @elfenpiff @FerdinandSpitzschnueffler @mossmaurice @dkroenke
@elBoberido I am not against one string as the identifier mid-term (but would like a separate identifier for the type and additional info, whatever this may be).
Just for 2.0 it I think it will lead to problems and break things we have yet to consider (and surely CycloneDDS, ROS, rmw_iceoryx). It will also require changes in C-API, examples, tests and so on.
Regarding the search:
The interface looks alright, but it still does not answer the question how to find all matches to */MyInstance/*
efficiently, where the *
is a wildcard. If one knows the number of separators one can use this info in the indexing scheme internally (so basically as planned, as this is equivalent to knowing the number of substring between separators). But as I understand we do not have this info as the user is free to use separators as she sees fit and then prefixtree index at least breaks (as does any indexing structure working on complete keys, such as maps).
What if the user for some reason decides to define patterns like u/v/x/y/z
and wants to find everything matching */v/*/y/*
. Sure, one can always have brute force but then the question is: how many entries are we talking? I was always operating on the assumption of the order 10³. There it starts to get painful without proper indexing (which is the purpose of prefix trees, suffix trees, maps and so on). I will think about how to solve this problem below for an arbitrary number separators.
Before we do any of this, we should understand the search scenarios, order of entries to manage and the full implications of an API change like this in more detail. I am less concerned by the search logic, something is possible there (but I would like to test my current approach ...). I am much more concerned by all the changes at the user side (ports and so on).
Maybe one can have a phase where both versions co-exist, i.e. additional constructors for the ports with just one string etc.
@MatthiasKillat I'm also for two strings, one ServiceIdentifier
and one ServiceDataType
for example.
Regarding the search. Your example with */y/*/y/*
would not be supported. There are only two options for pre-filtering
ServiceIdentifier
must start with the provided filter stringThe search would then only be applied to the characters following the pre-filter, e.g. for /Radar/Right/Whatever
/Radar/Right/
, the search would be applied to Whatever
2
, the search would also be applied to Whatever
While the user would be free to decide how many separators to use, it would also be up to the user to follow her schema when doing the query. Let's assume iceoryx is used for an ara::com implementation, then the schema would be /service/instance/event
, therefore your example with */MyInstance/*
would map to
uint_64_t INSTANCE {2};
findService(INSTANCE, "MyInstance/", [](){...})`;
// or depending on how matching behaves, with a * at the end
findService(INSTANCE, "MyInstance/*", [](){...})`;
To integrate this into iceoryx, one could either follow the suggestion of @elfenpiff or do the following (not sure which one would be faster/less invasiv):
ServiceIndentifier
ServiceDescription
to ServiceIdentifier
Publisher
/Subscriber
/etc. use the ServiceIdentifier
ServiceDescription
via doxygen for iceoryx v2 and via [[deprecated]]
attribute for iceoryx v3With this approach we would already provide the new API with iceoryx v2 and still be source compatible with iceoryx v1. The ServiceDiscovery
would already work with the new schema.
I cannot tell if this is feasible for iceoryx v2 or if we should target v3 if we want this at all.
@MatthiasKillat okay, I now understand your concern with the separator filter option. You are right, this one would be a brute force search since all entries must be queried for the separator and the full string must be searched since one never knows if it's at the end of the string.
I don't have a good solution for this, to only idea that comes to my mind is to have a lookup substructure in the PrefixTree
with the indices of all separators which gives you a fast forward in the PrefixTree
. Don't know if this is feasible and it adds some functionality to the PrefixTree
which probably does not belong there.
@elBoberido would work with some kind of partial compression at the separators in the tree, which means separators are of a specific level are shared between entries. Then you need a fast forward to separator #x (easy) and could use it as "root" for the search on #substring. I need a whiteboard to sketch this...
Pretty specific though and assumes existence of these separators. I will think about this but first complete what I already have. While I like elaborate solutions (if the problem justifies them), we should not go overboard at the moment and finish it at least as planned.
Brief feature description
The ServiceDescription should be modified so that it only contains one string instead of 3 and iceoryx should not make the distinction between service/instance/event since it has no meaning in our world.
The ServiceDescription would then only consist of one single
cxx::string
which would also have the advantage that we could reduce the string length to 1 in an embedded low memory environment and use the underlying char for service enumeration which would be quiet memory efficient. For use cases like ara com or someip a binding could introduce logic to reestablish the service/instance/event logic by introducing for instance a separator and name the underlying iceoryx services like/ServiceName/InstanceName/EventName
.In
findService
we could then also remove the wildcard option and introduce an option likeexactMatch
orpartialMatch
. This would allow a user for instance to look fortoad
and they would findAllHailHypnotoad
,AllGloryToTheHypnotoad
withpartialMatch
. For more fine grained searches the ara com / someip binding could introduce extra logic to filter the results further.Furthermore it would give other frameworks the advantage to encode additional features cleanly into the service description like for instance a type, a class hash - by just defining some service syntax on their own.
At the moment we have cyclone dds which uses only partially those 3 identifiers in the internal iceoryx integration which makes it more clear that this is not the way to go. Or look at ros2 when it uses dds and how the services are mapped there.