Rocon Resource Universal Identifiers [Platform Tuples]

stonier commented 10 years ago

@bit-pirate @jack-oquin @jihoonl @hughie @dwlee @piyushk

PlatformInfo doesn't fit its purpose very well any more. The orchestration and interactions black boxes meant it's had to evolve as we go...and it even got an Icon attribute as well, which is redundant for many use cases. The last couple of demos have seen some spaghetti creep in too - quick and dirty functions hither and thither. So I'm going back to the drawing board and cleaning up. Let's nail the definition, centralise the relevant code and stabilise.

First proposal is in the following pull requests:

Summary:

Code centralised in rocon_utilities.platform_tuples
- PlatformTuple.msg added.
PlatformInfo.msg composites this.
All code switched to assume storage of PlatformTuple.msg types rather than strings.
- Strings need an awful lot of validation checks at every step
- Resetting parts of the string tuple is expensive, msg structure is not.
- Easy enough to consume (from yaml) or produce (for logging) human readable strings on demand.
Basic string representation example is ubuntu.precise.ros.turtlebot.dude

Next Step:

Jack has been talking about a proper uri for this. There was lots of discussion in this thread, e.g.

concert client: ubuntu.precise.ros.turtlebot.dude -> rocon:///ubuntu/precise/ros/turtlebot/dude
- the first posn could be utilised for a concert name: rocon://cybernetic_concert/ubuntu/precise/ros/turtlebot/dude
- fits very well for requesting/scheduling resources, see current example of resource requirements
concert client + rapp: rocon:///ubuntu/precise/ros/turtlebot/foo#rocon_apps/chirp
- could be representative of a concert client executing a rapp, or a potential executable for the scheduler
rapp: name->rocon_apps/chirp with platform_tuple dependency ->rocon:///ubuntu/*/ros/*/*
- this would be expressed in .rapp files

I'm not fond of representing a rapp with rocon:///ubuntu/*/ros/*/*#rocon_apps/chirp. Another rapp with rocon:///windoze/*/opros/*/*#rocon_apps/chirp would be seen as a different rapp by the scheduler. All it needs to know is the rocon_apps/chirp in order to satisfy resource requests.

Along with python regex's to make wildcards possible.

Questions:

What are people's thoughts?

Is PlatformTuple.msg as a rosmsg storage class suitable for both string representations?
Is there a better msg representation?
Are there any big pros or cons to using resource uri's?
Are uri's inconvenient?
- I find they are a bit more awkward to read/write than the dot representation
- But I can't find a better reason not to use them.

jack-oquin commented 10 years ago

I advocate using some subset of Python regular expression syntax for matching wildcard requests. It is easy to do and very powerful. For that we would require a dot in front of the asterisk fields, like this:

rocon:///ubuntu/.*/ros/.*/.*

My current implementation only matches the beginnings of strings, so the example above could be abbreviated to:

rocon:///ubuntu/.*/ros/

We should discuss whether that is desirable. Should we always match the entire string? If not, one could still write:

rocon:///ubuntu/.*/ros/.*$

Regular expressions can be quite powerful. This matches a turtlebot or PR2 with a name starting with 'dude' followed by one or more numbers:

rocon:///linux/.*/ros/(turtlebot|pr2)/dude[0-9]+

We should choose a subset of the full Python re syntax that can be supported by existing tools in other languages, especially C++.
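The difference between prefix matching and whole-string matching discussed above can be sketched with Python's standard `re` module. The helper names `matches_prefix` and `matches_whole` are hypothetical, chosen only for this illustration; they are not part of any rocon package.

```python
import re


def matches_prefix(pattern, uri):
    """True if the pattern matches at the start of the uri (re.match semantics)."""
    return re.match(pattern, uri) is not None


def matches_whole(pattern, uri):
    """True only if the pattern consumes the entire uri (re.fullmatch semantics)."""
    return re.fullmatch(pattern, uri) is not None


client = "rocon:///ubuntu/precise/ros/turtlebot/dude"

# The abbreviated pattern only works when matching the beginning of the string.
print(matches_prefix(r"rocon:///ubuntu/.*/ros/", client))
print(matches_whole(r"rocon:///ubuntu/.*/ros/", client))

# The turtlebot-or-PR2 pattern requires at least one digit after 'dude'.
family = r"rocon:///linux/.*/ros/(turtlebot|pr2)/dude[0-9]+"
print(matches_whole(family, "rocon:///linux/precise/ros/pr2/dude42"))
```

With whole-string matching, a pattern such as `dude[0-9]+` cannot accidentally accept a name with trailing characters, at the cost of having to spell out the trailing `.*` fields.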

During migration, it is easy on input to recognize a string lacking the required rocon:// prefix and convert it to the equivalent URI representation.
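A minimal sketch of that migration step, assuming the dotted form always has exactly five non-empty fields. The function name `to_rocon_uri` is hypothetical.

```python
ROCON_PREFIX = "rocon://"


def to_rocon_uri(name):
    """Return name unchanged if it is already a rocon uri, else convert
    a five-field dotted platform tuple to the equivalent uri."""
    if name.startswith(ROCON_PREFIX):
        return name
    fields = name.split(".")
    # Assumption: the legacy format is exactly five non-empty fields.
    if len(fields) != 5 or not all(fields):
        raise ValueError("expected five dot-separated fields: %r" % name)
    return "rocon:///" + "/".join(fields)


print(to_rocon_uri("ubuntu.precise.ros.turtlebot.dude"))
print(to_rocon_uri("rocon:///ubuntu/precise/ros/turtlebot/dude"))
```

This only handles plain names and the old `*` wildcard. A dotted string that already contains a regular expression such as `.*` cannot be split on dots safely and would need separate handling.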

stonier commented 10 years ago

Thanks for the input Jack. We have a window now for a few weeks where we can bulldoze in a few underlying changes.

Unless @piyushk has some dependencies on things I'll probably just push the resulting implementation across the entire system as fast as possible. May/Igloo will probably be our target point for stabilising for major releases and I want to iron out the rough edges before then.

jack-oquin commented 10 years ago

I'm not fond of representing a rapp with rocon:///ubuntu/*/ros/*/*#rocon_apps/chirp. Another rapp with rocon:///windoze/*/opros/*/*#rocon_apps/chirp would be seen as a different rapp by the scheduler. All it needs to know is the rocon_apps/chirp in order to satisfy resource requests.

I am unclear about the context of interpreting these patterns. Right now, rapps occupy a separate namespace from client platform names, don't they? The # notation is mostly a human-readable representation of the fact that this client is running that particular rapp at some time.

Nevertheless, a wildcard matching any platform in the local concert supporting that rapp could be constructed like this, if desired:

rocon:///.*#rocon_apps/chirp

That is wordy but not hard to read. The specific example above could be:

rocon:///(ubuntu/.*/ros|windoze/.*/opros)/.*#rocon_apps/chirp

The bigger question in my mind is the relationship between clients and rapps. Each client advertises a set of rapps that it currently supports, and the scheduler must match one of those names exactly to satisfy a resource request. The wildcard patterns only apply to the platform info, which is effectively the canonical name of that client.

It is fundamental that each client can only run a single rapp at a time. The scheduler tracks whether each client is available, and what set of rapps it currently supports.
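The matching rules described above (wildcard on the platform part, exact match on the rapp name, one rapp per client) could look roughly like this. The function names and the `busy` flag are hypothetical, and whole-string matching is assumed here even though that question is still open in this thread.

```python
import re


def split_request(request_uri):
    """Split 'platform_pattern#rapp' into (pattern, rapp); rapp is None if absent."""
    pattern, sep, rapp = request_uri.partition("#")
    return pattern, (rapp if sep else None)


def satisfies(request_uri, client_uri, client_rapps, busy=False):
    """True if an available client matches the platform pattern and
    advertises exactly the requested rapp name."""
    if busy:  # a client can only run a single rapp at a time
        return False
    pattern, rapp = split_request(request_uri)
    if re.fullmatch(pattern, client_uri) is None:
        return False
    return rapp is None or rapp in client_rapps


request = "rocon:///(ubuntu/.*/ros|windoze/.*/opros)/.*#rocon_apps/chirp"
dude = "rocon:///ubuntu/precise/ros/turtlebot/dude"
print(satisfies(request, dude, {"rocon_apps/chirp"}))
```

Only the platform part is treated as a regular expression; the rapp name after `#` is compared as a plain string.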

jack-oquin commented 10 years ago

I am uncomfortable with PlatformTuple as a message element. Its contents are not clearly defined. They are only easy to validate because we skip that step.

Defining constants for the string values does not help. The field values are vague and overlap. When should I use "ubuntu" rather than "linux", or "iphone" rather than "apple"? Are "phone" and "tablet" instances of "smart_device", or just sometimes? There is a complex semantic network lurking beneath all these terms.

Using '*' in PlatformTuple disallows patterns for specific alternates, which can easily be represented if we use regular expressions. Or, do we allow a regular expression in each field? That helps, but cannot express correlations between fields, like `rocon:///(ubuntu/.*/ros|windoze/.*/opros)/.*`.

But, my main concern is for future evolution of the interface. I doubt we know enough yet to conclude that a five-part name will be appropriate in all cases. Passing a single string allows many programs to ignore all those issues.

A standard parser utility module could provide structured representations for programs needing them. And, the parser may be able to conceal some future changes from its users. The cost of parsing a few strings seems small compared to message overhead. I guess the main question is: how many parser implementations are needed for the various application platforms and languages?

Whatever representation we come up with, I would like to see a relatively formal syntax and semantic definition, something like extended BNF plus a description of each term's meaning. If we use regular expressions, we need to specify their syntax too, perhaps via something like a subset of the POSIX egrep regular expression syntax.

stonier commented 10 years ago

Ok, getting to the meat of it!

I think we're agreed, keeping rapp identifier and dependency (namespace) identifier separate is the right way to go. In addition this is all working on the assumption that a concert client can only run a single rapp (that's not going to change for the lifespan of this project).

So to summarise all potential uses of rocon uri's within the concert:

A concert client identifier, e.g. rocon:///ubuntu/precise/ros/turtlebot/dude
A concert client + rapp identifier, e.g. rocon:///ubuntu/precise/ros/turtlebot/dude#rocon_apps/chirp
- This is useful for representing a potentially executable or executing concert resource.
A rapp dependency identifier, e.g. rocon:///ubuntu/.*/ros/.*/.*
A resource request dependency identifier, e.g. rocon:///ubuntu/.*/ros/.*/.*
A remocon identifier, e.g. rocon:///android/jellybean/ros/tablet/remocon01a46c
- Used for remocon-app matching

stonier commented 10 years ago

Semantics

+1 for pretty much everything. Actually those semantic strings were just thrown at the system for the first experiments over a year and a half ago now before we had any idea what it was going to look like and right now is a good time for a re-evaluation (bit embarrassing really). I'm also completely fine with dropping the '*' if it best matches the appropriate regex parsing module we use.

What's the best way in the software world to start formally moving forward? Are string constants, some regex subset and some python functions enough to begin with? Those terms with overlap - I don't have a use case nor used those specifically for a long time, so perhaps it doesn't need the complexity of a semantic network.

One possible use case might be for setting versions without getting even more verbose. e.g. in the current schema I've no setting for the version of ros, which may be very important for a rapp specification. Instead of 'ros', it could list 'hydro', 'raring' and use a semantic 'is-a' relationship to evaluate that it is a 'ros' where necessary. The same could be done to collapse the os/version pair.

The only other consideration I've had thus far is that ros messages are great places for defining constants since they get automatically generated into a variety of languages, but definitely need to remove overlap/ambiguity.

stonier commented 10 years ago

Message vs String

Internally I was using the message structure as a variable to store the platform tuple instead of a string. I always used a to_msg(str) function to construct it, which validated it on the way through and once validated, I could operate on it in a number of ways without any further validation. Storing the platform tuple as a string however would mean always having to do validation checks for every operation as well as expensively unnecessary dissecting of the string to retrieve parts of the tuple.

Key point here is not the message class itself, but dumping it into a structure which validates it on construction.

Perhaps a better approach is to use strings in the messages and create a python class which validates on construction with a string or after modifying it in any way. Example: python's urlparse module and ParseResult objects. Does this sound more viable?
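A rough sketch of such a class, built on the standard library's `urllib.parse` (the Python 3 home of `urlparse`). The class name `RoconURI` and the field names `os`, `version`, `system`, `platform` and `name` are assumptions made for illustration, and this version validates on construction only.

```python
from urllib.parse import urlparse

# Assumed names for the five path components of a rocon uri.
FIELDS = ("os", "version", "system", "platform", "name")


class RoconURI(object):
    """Validates a rocon uri string once, then exposes its parts as attributes."""

    def __init__(self, uri):
        parsed = urlparse(uri)
        if parsed.scheme != "rocon":
            raise ValueError("not a rocon uri: %r" % uri)
        parts = parsed.path.strip("/").split("/")
        if len(parts) != len(FIELDS) or not all(parts):
            raise ValueError("expected %d path fields: %r" % (len(FIELDS), uri))
        self.concert = parsed.netloc   # empty for the local concert
        self.rapp = parsed.fragment    # empty if no rapp is attached
        for field, value in zip(FIELDS, parts):
            setattr(self, field, value)

    def __str__(self):
        path = "/".join(getattr(self, field) for field in FIELDS)
        uri = "rocon://%s/%s" % (self.concert, path)
        return uri + ("#" + self.rapp if self.rapp else "")


uri = RoconURI("rocon:///ubuntu/precise/ros/turtlebot/dude#rocon_apps/chirp")
print(uri.platform, uri.name, uri.rapp)
print(str(uri))
```

Messages would carry only the string; code that needs individual fields would construct the object and pay the validation cost once.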

stonier commented 10 years ago

@bit-pirate raised the same query I had - what are the advantages of switching to the slightly more cumbersome resource uri's? So far I can only see:

Moves from a custom format to an instantly recognisable standard format
Our objects are usable by something that knows uri's (I don't know of any use of them right now though)

Are there other reasons?

jack-oquin commented 10 years ago

There is no compelling advantage to a URI over some other syntax.

But, there were some minor considerations that led me to consider and then recommend it as an option:

Using . as a field separator interacts badly with any common regular expression syntax. The dot character is almost universally used to match any character, so it must be escaped everywhere we need to match an actual dot (\.). That is awkward and ugly.
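A small illustration of the escaping problem, using only Python's `re` module. The helper `matches` and the sample strings are made up for this example.

```python
import re


def matches(pattern, text):
    """Whole-string regular expression match."""
    return re.fullmatch(pattern, text) is not None


dotted = "ubuntu.precise.ros.turtlebot.dude"
impostor = "ubuntuXpreciseXrosXturtlebotXdude"

# Used directly as a pattern, every '.' matches any character.
print(matches(dotted, impostor))

# Escaping fixes that, but the pattern becomes hard to read.
escaped = re.escape(dotted)
print(escaped)
print(matches(escaped, impostor))

# "Any version, any platform, any name" in each syntax:
dotted_wildcard = r"ubuntu\..*\.ros\..*\..*"
uri_wildcard = r"rocon:///ubuntu/.*/ros/.*/.*"
print(matches(dotted_wildcard, dotted))
print(matches(uri_wildcard, "rocon:///ubuntu/precise/ros/turtlebot/dude"))
```

With `/` as the separator, the only dots left in a pattern are the ones meant as regular expression syntax.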
Prepending rocon: as a URI scheme makes the string self-identifying. It is long enough to uniquely identify this project, but not too cumbersome.
It is easy to distinguish from the dotted format. I found that useful when migrating my own code to use the new syntax.

Some other minor advantages come to mind:

ROCON resource names do fit the definition of a uniform resource identifier, so the principle of least astonishment suggests adopting the usual syntax for those kinds of objects. I think it will be clearer to others encountering the ROCON project.
The URI syntax encourages us to specify these names more carefully. We should do that anyway, whatever syntax we decide to adopt.
When ROCON begins to take over the robotics world, we can register the rocon: scheme with IANA. :smile:

jack-oquin commented 10 years ago

Message vs String

Internally I was using the message structure as a variable to store the platform tuple instead of a string. I always used a to_msg(str) function to construct it, which validated it on the way through and once validated, I could operate on it in a number of ways without any further validation. Storing the platform tuple as a string however would mean always having to do validation checks for every operation as well as expensively unnecessary dissecting of the string to retrieve parts of the tuple.

Key point here is not the message class itself, but dumping it into a structure which validates it on construction.

I have done things both ways in various projects. ROS messages often provide a good enough representation of the corresponding object for use throughout a program.

But, lately I find myself defining internal objects that translate to and from the corresponding ROS message, sometimes containing a copy of it. That way I can provide custom constructors and other methods, so the object fits better with the program's implementation language. And, it sometimes makes the program slightly more portable to and from non-ROS frameworks.

The ROS community generally prefers messages with more structure than a string or a collection of strings. Any time a message has a string, we need to define what is and is not valid, a tricky issue of language definition. For ROCON resource names we really do need strings, so now we must define what they mean. Dividing a single platform_info string into five strings does not really make that any simpler to do, and seems less flexible.

Perhaps a better approach is to use strings in the messages and create a python class which validates on construction with a string or after modifying it in any way. Example: python's urlparse module and ParseResult objects. Does this sound more viable?

The urlparse example seems excellent. For Python, we should consider wrapping our own custom parser around the newer urllib.parse module, if we decide to adopt URI syntax. I presume there are similar tools for C++.

I only recently discovered the advantages of using a Python class instance as a flexible and easily extended struct. Python really is a wonderful language.

jack-oquin commented 10 years ago

Semantics

What's the best way in the software world to start formally moving forward? Are string constants, some regex subset and some python functions enough to begin with? Those terms with overlap - I don't have a use case nor used those specifically for a long time, so perhaps it doesn't need the complexity of a semantic network.

We certainly do not want to actually implement a semantic network. Hierarchical relationships can be well-described using EBNF:

ros_distro = "groovy" | "hydro" | "indigo";
ubuntu = "precise" | "raring" | "saucy" | "trusty";
linux = ubuntu | debian | fedora | rhel | mint | suse;
operating_system = android | darwin | freebsd | ios | linux | windows;

This does provide some useful detail, such as making it clear that we expect "hydro" rather than "hydro_medusa" and "precise" rather than "12.04" or "precise_pangolin".
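The hierarchy above can also be read as a set of is-a relationships without building a full semantic network. The following is a minimal sketch using a plain parent lookup table; the `PARENT` table and the `is_a` helper are hypothetical, and the table covers only the terms from the EBNF example.

```python
# Each term maps to its immediate parent category (from the EBNF above).
PARENT = {
    "groovy": "ros_distro", "hydro": "ros_distro", "indigo": "ros_distro",
    "precise": "ubuntu", "raring": "ubuntu", "saucy": "ubuntu", "trusty": "ubuntu",
    "ubuntu": "linux", "debian": "linux", "fedora": "linux",
    "rhel": "linux", "mint": "linux", "suse": "linux",
    "android": "operating_system", "darwin": "operating_system",
    "freebsd": "operating_system", "ios": "operating_system",
    "linux": "operating_system", "windows": "operating_system",
}


def is_a(term, category):
    """Walk up the parent chain; True if term equals or descends from category."""
    while term is not None:
        if term == category:
            return True
        term = PARENT.get(term)
    return False


print(is_a("precise", "linux"))
print(is_a("hydro", "ros_distro"))
print(is_a("12.04", "ubuntu"))
```

Because each term has exactly one parent here, every terminal string must be unique across the whole table, which is the same constraint the canonical names would need anyway.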

The obvious problem is maintaining these lists and keeping them up to date. Trying to list all relevant Linux distributions seems hopeless. Maybe we should just list ones we actively support, plus a catch-all "linux_other" category.

The ROS community is already struggling with that problem. We should consider adopting the OS version names defined for rosdistro. They chose to merge the operating_system and linux categories, and there is some category confusion about different packaging systems like "macports" and "homebrew", but it is still better than what we are likely to invent on our own.

One possible use case might be for setting versions without getting even more verbose. e.g. in the current schema I've no setting for the version of ros, which may be very important for a rapp specification. Instead of 'ros', it could list 'hydro', 'raring' and use a semantic 'is-a' relationship to evaluate that it is a 'ros' where necessary. The same could be done to collapse the os/version pair.

+1 That seems worth exploring.

There may be some way to define regular expression constants matching various useful categories. For simple cases like ros_distro that would be easy, but I can't think how to define a nested hierarchy. I believe it may require a context-free grammar instead of a regular expression.

The only other consideration I've had thus far is that ros messages are great places for defining constants since they get automatically generated into a variety of languages, but definitely need to remove overlap/ambiguity.

+1 ROS message constants are an excellent way to define common values for multi-language projects. I have several times defined "messages" containing only constants, just as you did.

jack-oquin commented 10 years ago

A remocon identifier, e.g. rocon:///android/jellybean/ros/tablet/remocon01a46c

Used for remocon-app matching

I don't know this term. Please help me with a link to where it is described.

stonier commented 10 years ago

A remocon identifier, e.g. rocon:///android/jellybean/ros/tablet/remocon01a46c

Used for remocon-app matching

I don't know this term. Please help me with a link to where it is described.

http://redmine.robotconcert.org/projects/opp/wiki/Interactive_Clients

I'll have you running a qt remocon when we test the teleop service :neckbeard:

stonier commented 10 years ago

The ROS community is already struggling with that problem. We should consider adopting the OS version names defined for rosdistro. They chose to merge the operating_system and linux categories, and there is some category confusion about different packaging systems like "macports" and "homebrew", but it is still better than what we are likely to invent on our own.

+1 Will reference what ros has before I attack. Most of this os organisation would be in rosdep/rosdistro I think.

We certainly do not want to actually implement a semantic network. Hierarchical relationships can be well-described using EBNF:

ros_distro = "groovy" | "hydro" | "indigo";
ubuntu = "precise" | "raring" | "saucy" | "trusty";
linux = ubuntu | debian | fedora | rhel | mint | suse;
operating_system = android | darwin | freebsd | ios | linux | windows;

This seems to go quite beyond what regex can do I suspect. Digging around for some python parsers:

EBNF (simple): http://lparis45.free.fr/rp.html
PLY: http://www.dabeaz.com/ply/
pyparsing: http://pyparsing.wikispaces.com/

...will go back and look up what rosdep does first before going further.

stonier commented 10 years ago

I can't find much in rosdep/rosdistro apart from the labels they apply to rosdeps.

If we use hierarchical relationships EBNF style for os and rosdistro as you have described, we can have a rocon uri string like:

rocon:///precise/hydro/turtlebot/dude#rocon_apps/talker

Question is here, would we want any regex happening at all for the other components of the string? At a stretch you might for robots of a family, e.g. turtlebot1, turtlebot2..but you could make an ebnf family for them as well. For names, you might wish to use a regex, and specifying that regex could be an exception, but I wouldn't actually encourage use of regex's for names. That introduces black magic in a system (i.e. someone has to carefully name everything in setup and you rely on that for scheduling instead of scheduling intelligently).

Planning the attack:

Naming Conventions

Collapse os and version info into the one key, i.e. ubuntu.precise becomes precise
Keep system name (aka ros, opros) but allow rosdistros instead of 'ros' as well.

Messages

Use rocon uri strings instead of the PlatformTuple object
- Variable name platform_uri instead of platform_info in Resources.msg?

Rocon URIs

Create a module and class inside rocon_utilities that works a bit like urlparse
- Maybe even use a urlparse object to store the rocon uri string internally
Try out the rp parser on the os/rosdistro/robot family naming conventions.

App Manager

Better platform sniffing by the app manager (right now it just sets linux.*.ros.robot_type.robot_name)
- Use rosdep code for this sniffing

jack-oquin commented 10 years ago

I was only using EBNF to express an approach to defining the symbols of the language in an hierarchical manner. My main goal was to avoid the messiness of semantic networks while defining something relatively simple, but complete.

There are two types of strings we need to define:

Fully-resolved client identifiers
Pattern-matching strings for making requests.

Each has its own "language" and syntax. My EBNF example was an attempt to specify (part of) a type 1 canonical name. For that we want uniqueness with no ambiguity. So, "precise" and "hydro" are terminal strings of that language. Every release of every operating system would have to define similar terminal strings. Every one would need to be unique. Given all those constraints, the canonical representation of a particular robot is deterministic. I am not certain that this is the best approach, but it seems worthwhile to work out more of the details.

For type 2 patterns, I was predisposed to use regular expressions, because they are powerful, relatively simple and fairly easy to use. However, I think we are both convinced that a regex is not powerful enough to handle hierarchical patterns. I am sure there are plenty of good alternatives, perhaps including somehow using EBNF directly.

One would normally defer parser implementation decisions until the language has been more-or-less fully specified. There are lots of options, and the best choice depends on the demands of the language. We will want to keep things as simple as possible, so parsing will probably be straightforward.

I don't have much time today for working on this, but I will definitely give it more thought.

jack-oquin commented 10 years ago

The ROS OS detection code is actually in rospkg.os_detect. That API looks useful for clients to automatically detect their current operating system.

Even if not, we could adopt the OS names they defined. On my current box, I get:

$ python -m rospkg.os_detect
OS Name:     ubuntu
OS Version:  12.04
OS Codename: precise

Note that the version and code name are not defined for some operating systems. That somewhat confuses the notion of always using code names for the canonical representation. But, the main goal remains: I want to specify unambiguously the exact strings ROCON clients should use.

jack-oquin commented 10 years ago

There is a lot I don't know about client names, and how they are used. Some questions:

Who assigns the last part of a client name?
How do we know it is unique within the concert?
Do services request interactive resources in the same way they do robots?
When do services care about platform info? If rapp names are unique, they could ask for any platform supporting the desired rapp.

It looks like the platform field describes the type of robot.

Is a specific model sometimes needed?
Are there other capabilities implied by various components of the platform info?
Why do services care about the client operating system?

If services intend to exchange ROS messages with the client, they need to specify the exact ROS distribution. If some other message protocol is used, it should presumably be specified in the "system" field. Is "system" a good word to describe that?

stonier commented 10 years ago

Did a lot of thinking myself last night as well and also experimented with a parser to understand how flexible they are. But first, your questions, which are great - they help narrow down the thinking process:

Who assigns the last part of a client name?

This process has a few steps.

The robot is parameterised with its own name, e.g. dude.
The robot then advertises itself on a gateway network with a uuid attached, e.g. dude_01ac37f920b1432a
A concert conductor when it discovers this stores it in its own registry with a unique alias for easier reading/handling
- This alias becomes the concert client name.
- Stripped of the uuid, if it's the first dude robot, the alias will be dude.
- Stripped of the uuid, if it's the second dude robot it becomes dude2.
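The aliasing steps above can be sketched as follows. This is not the conductor's actual code: the `AliasRegistry` class is hypothetical, and the uuid is assumed to be an underscore followed by 16 lowercase hex digits, as in the `dude_01ac37f920b1432a` example.

```python
import re

# Assumption: the gateway appends '_' plus 16 lowercase hex digits.
UUID_SUFFIX = re.compile(r"_[0-9a-f]{16}$")


class AliasRegistry(object):
    """Maps unique gateway names to short, unique concert client names."""

    def __init__(self):
        self._aliases = {}  # gateway name -> concert alias

    def register(self, gateway_name):
        if gateway_name in self._aliases:
            return self._aliases[gateway_name]
        base = UUID_SUFFIX.sub("", gateway_name)
        taken = set(self._aliases.values())
        alias, suffix = base, 1
        # First 'dude' keeps its name, later ones become dude2, dude3, ...
        while alias in taken:
            suffix += 1
            alias = "%s%d" % (base, suffix)
        self._aliases[gateway_name] = alias
        return alias


registry = AliasRegistry()
print(registry.register("dude_01ac37f920b1432a"))
print(registry.register("dude_9f3b2c1d4e5a6b7c"))
```

Checking against the set of aliases already handed out keeps names unique even if some other robot happens to be parameterised as `dude2`. Clients leaving the concert are not modelled here.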

How do we know it is unique within the concert?

The process above - the conductor ensures it.

Do services request interactive resources in the same way they do robots?

Not really possible to 'request' for a human. Nor do you really want to leave humans waiting around until a service requests them. Hence the reason for differentiating a schedulable resource like a robot and an interactive client like a human. Instead the service provides a profile saying what kinds of interactive client applications may connect to the concert, the configuration of the app and any remappings it needs. This process is outside the scope of scheduling.

e.g. suppose your service is a 'make a map' service. The service would then provide a profile for starting rviz, with a .vcg configuring rviz with the exact views and connections for a user to visualise the service. When the user connects, he sees this 'make a map rviz' as an available interaction option and selects it.

When do services care about platform info? If rapp names are unique, they could ask for any platform supporting the desired rapp.

Rapp names are unique and so the ideal case would be to ask for platform info dependency of `*.*.*.*.*`. That is the goal anyway - services shouldn't have to care about what robot they are interacting with, i.e. abstraction layer.

For the services, the platform info is only there as a weak crutch to help us until the schedulers are madly brilliant enough to schedule better than the hints we provide them. Think of it as a practical stopgap that we should endeavour not to use.

It looks like the platform field describes the type of robot.

Also used to describe pc's, tablets, phones etc.

Is a specific model sometimes needed?

I imagine specific models will just have different names, i.e. turtlebot, turtlebot2.

Are there other capabilities implied by various components of the platform info?

This is quite general, maybe my next post will answer some points on this.

Why do services care about the client operating system?

Hopefully shouldn't have to.

stonier commented 10 years ago

Rocon URI Interested Parties

App Manager - it creates a rocon uri representing the platform it is running on (or proxying for)
- No need for patterns, every element can be fixed.
Rocon Apps - uses a rocon uri to represent what kinds of app managers can run it
- Patterns, e.g. rocon:///ubuntu/hydro/turtlebot|turtlebot2/.*
Services - uses a rocon uri as a filter for resource requests
- ideally `rocon:///*/*/*/*/*`
Remocons - provides a rocon uri to identify what platform is connecting
- No need for patterns. e.g. rocon:///jellybean/hydro/tablet/remocon_135ac3
Interactions Handler - in the profile given by a service, a rocon uri is used to compare/filter runnable apps on remocon platforms that are connected
- Patterns, e.g. rocon:///precise/hydro/pc/.*
Scheduler - to compare/filter resource requests from services with available resources.

I had hoped to use the same rocon uri format for everyone, even if that means having the occasional field which is important for a particular use case and redundant for another (there is far more in common than redundant in the above use cases). It is worth keeping in mind that having multiple uri forms is an alternative though.

stonier commented 10 years ago

If services intend to exchange ROS messages with the client, they need to specify the exact ROS distribution. If some other message protocol is used, it should presumably be specified in the "system" field. Is "system" a good word to describe that?

Great point! I hadn't actually thought of specifying the version of concert 'communications', but that is critical - a groovy robot shouldn't connect to a hydro concert and a rocon app that does not exist for groovy but does for hydro should also be differentiated. I think we need this field.

The 'system' field actually has a different purpose - it is there for the app manager and rocon apps to help filter what apps are runnable on a platform. One of the project goals is to ensure that what is underneath the app manager doesn't have to be ros. We are using a 'ros' implementation of the app manager, but other Korean groups could just as easily use an 'opros' implementation of the app manager which would need to filter for opros based apps.

jack-oquin commented 10 years ago

When do services care about platform info? If rapp names are unique, they could ask for any platform supporting the desired rapp.

Rapp names are unique and so the ideal case would be to ask for platform info dependency of `*.*.*.*.*`. That is the goal anyway - services shouldn't have to care about what robot they are interacting with, i.e. abstraction layer.

Makes sense.

In that case, let's define a platform_uri of '' to match any available platform providing the desired rapp name.

I would also suggest rapp or rapp_name in place of the name field in the Resource message.

jack-oquin commented 10 years ago

Why do services care about the client operating system?

Hopefully shouldn't have to.

This suggests that operating system should not be the top-level identifier in the name hierarchy.

Given that the name field is unique, it should probably be the first level of identification. That way, later fields that are not relevant could be omitted.

Not having to define all possible operating system options would simplify our current task considerably.

jack-oquin commented 10 years ago

Do services request interactive resources in the same way they do robots?

Not really possible to 'request' for a human. Nor do you really want to leave humans waiting around until a service requests them. Hence the reason for differentiating a schedulable resource like a robot and an interactive client like a human. Instead the service provides a profile saying what kinds of interactive client applications may connect to the concert, the configuration of the app and any remappings it needs. This process is outside the scope of scheduling.

I can imagine a robot needing to request human assistance for opening a door or pressing buttons on an elevator.

Is there a way for human interactive devices to be contacted in situations like those? Presumably, the human would need to register willingness to honor such requests.

jack-oquin commented 10 years ago

We are speaking as if the URI question has been resolved.

I am comfortable with that approach, and convinced that it is better than the dotted-field format we had been using.

But, I am open to other syntax suggestions. The reasons for using a URI are not compelling.

stonier commented 10 years ago

Yes - comfortable with the idea of using a rocon uri. Just need to sort out the fields and the operations we need on each field.

jack-oquin commented 10 years ago

Did a lot of thinking myself last night as well and also experimented with a parser to understand how flexible they are.

General parsers are very flexible.

I used PLY one time, and like it. But, it is probably far more powerful than we need for this application. I think a top-down parser will probably work fine for us.

stonier commented 10 years ago

Last night playing around with that tiny rule parser package I listed previously, it's quite easy to use for ebnf relationships. The following is some example code for parsing operating system relationships - it easily handles fixed, supersets, OR relationships and the wildcard.

    # Rules for the rule parser package mentioned above: one rule per string.
    # Quoted names are literals, unquoted names refer to other rules.
    operating_systems_rule = [
        'init operating_systems_list=[] ',
        'pattern           ::= os_zero operating_systems*',
        'os_zero           ::= os                          @operating_systems_list.append("$os")',
        'operating_systems ::= "|" os                      @operating_systems_list.append("$os")',
        'os                ::= "*" | windows | linux | "osx" | "freebsd"',
        'windows         ::= "winxp" | "windows7"',
        'linux           ::= "arch" | "debian" | "fedora" | "gentoo" | "opensuse" | ubuntu | "linux"',
        'ubuntu          ::= "precise" | "quantal" | "raring" | "ubuntu"',
    ]
    operating_systems_input = "precise|quantal"
    result = rule_parser.rp.match(operating_systems_rule, operating_systems_input)
    print("Input: %s" % operating_systems_input)
    if result is not None:
        print("  OS list: %s" % result.operating_systems_list)
        print("  Ubuntu : %s" % result.ubuntu)
        print("  Linux  : %s" % result.linux)
        # print("  Windows: %s" % result.windows) would throw an AttributeError,
        # since no windows rule was matched for this input
    else:
        print("Error in parsing")

jack-oquin commented 10 years ago

That is cool. But, I'd rather avoid the task of defining canonical OS names, if we can.

stonier commented 10 years ago

I can imagine a robot needing to request human assistance for opening a door or pressing buttons on an elevator. Is there a way for human interactive devices to be contacted in situations like those? Presumably, the human would need to register willingness to honor such requests.

We've had discussions about this one too, but haven't been able to dedicate any time to it yet. We've used the headless launcher to enable an nfc tag to push through the role and app directly to the user. I'd like to operate with something similar - a background application on android for instance that receives notifications and if a user accepts, it does the same thing - push the starting of an app directly using the interactions handler as a kind of broker. It's a big job though and too early to start it.

stonier commented 10 years ago

I would also suggest rapp or rapp_name in place of the name field in the Resource message.

+1. I'll do that when I update the system for rocon_uri's, #7.

stonier commented 10 years ago

But, I'd rather avoid the task of defining canonical OS names, if we can.

Are you referring to using a string like "ubuntu"? I'm inclined to agree - could very easily lead to vague incompatibilities. Let's try without until we have a real need.

jack-oquin commented 10 years ago

Are you referring to using a string like "ubuntu"? I'm inclined to agree - could very easily lead to vague incompatibilities. Let's try without until we have a real need.

+1 Yes.

stonier commented 10 years ago

Thoughts about the exact fields.

os

Used by the app manager and the rocon apps to determine what apps are runnable by the app manager. Also used by the interactions handler and remocons in the same way. For remocons and app managers it is a fixed value of the lowest common denominator which would ideally include version information, e.g. 'precise' or 'jellybean'. For rocon app implementations it should be fixed for the package that is being made. For interaction profiles it would often be convenient to use OR combinations as well as supersets or a wildcard, e.g. 'precise | quantal', 'ubuntu', '*'. The case for OR combinations may not be strictly necessary.

system

Used by the rocon app manager and rocon apps to determine what kind of apps are runnable. For the app manager implementation this will be fixed e.g. 'opros', 'hydro' and for the rocon app this can be fixed for the implementation.

platform

Again used by the rocon app manager and rocon apps for runnability, and also by the interaction handler and remocons. For app manager and remocons it will be fixed. For rocon apps and profiles fixed, OR'd, supersets or wildcards would be useful (e.g. this rocon app works on turtlebot | turtlebot2 | pr2). This is a coarse-grained filter. Another way of filtering an app's runnability (currently in development) is to set this to * and let osrf's capabilities underneath determine if it is runnable.

name

Used by the app manager and remocons for unique identification by the concert conductor and interaction handler respectively. Could also feasibly be used in resource requesting to match against robot names if someone wants to do so, but I wouldn't encourage that. I would suggest only allowing fixed or wildcard entries for this.

NEW FIELD -> concert/concert_version

Represents what concert middleware the resource is compatible with. For ros robots, this is redundant with system, but for non-ros robots or embedded devices which are only running an app manager that acts as a kind of bridge to another system this is important.
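To make the fixed / OR / superset / wildcard cases for the os field concrete, here is a minimal sketch. The hierarchy contents and the names `OS_HIERARCHY`, `covered` and `os_field_matches` are hypothetical, chosen only for illustration; nothing here is the agreed list of names.

```python
# Hypothetical name hierarchy: each superset name maps to the names directly under it.
OS_HIERARCHY = {
    "linux": ["ubuntu", "debian", "fedora"],
    "ubuntu": ["precise", "quantal", "raring"],
    "windows": ["winxp", "windows7"],
}


def covered(name):
    """Return the name itself plus every name underneath it in the hierarchy."""
    result = {name}
    for child in OS_HIERARCHY.get(name, []):
        result |= covered(child)
    return result


def os_field_matches(pattern, fixed_os):
    """True if a fixed os value satisfies a pattern such as 'precise | quantal', 'ubuntu' or '*'."""
    alternatives = [alt.strip() for alt in pattern.split("|")]
    if "*" in alternatives:
        return True  # the wildcard accepts any fixed value
    return any(fixed_os in covered(alt) for alt in alternatives)
```

With this, an interaction profile declaring 'ubuntu' would accept a remocon reporting 'raring', while an app manager always reports a single fixed value.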

stonier commented 10 years ago

Is the sequence of those fields important? The original sequence was determined purely by the logical checks used by the rocon app manager for rocon app compatibility. However the concert has various uses which have different perspectives, some redundancies and places priority of importance on the fields in a different logical sequence.

e.g. the name field is uniquely important to everyone except resource requests and interactions profiles. For the resource requests, we could argue they should just use the empty string anyway, but for interactions profiles, the string is important but name will almost always be the wildcard.

Sequencing would only seem to be important if we wish to drop some of those fields in the specification of the uri in various situations. Alternatively we could say that it requires each and every field, with wildcards where it's not important.

And of course an empty string to represent rocon_uri:///*/*/*/*/*/

jack-oquin commented 10 years ago

In URI syntax the sequence of the hierarchical fields determines what fields follow. In that sense, it is important. It's like a file system directory hierarchy.

When subordinate fields are not needed, they can be left out.

stonier commented 10 years ago

In that case I'm toying with the following:

concert_version/platform/os/system/name

Scheduler resources, concert clients, app managers, remocons will always have every field fixed.
Rocon apps also every field fixed in some database when they register their implementation somewhere.

We just had a discussion about rapps - currently this information is inside the .rapp definition. That needs to be moved out and stored in a meta-repository somewhere (just like ros packages, such information is registered with rosdistro when a package is released, there is no precise-quantal-raring|groovy-hydro information embedded in the package).

Resource requests should ideally gravitate to all *'s.
Interaction profiles...

This is the only one now where I think we need fancy handling of rocon_uri's. System (e.g. ros) and name for these are typically irrelevant. OR's and wildcards could be handled by just providing a list of fixed rocon uri's minus system and name parameters. Introduce OR's and wildcards if these lists start being cumbersome.

stonier commented 10 years ago

@bit-pirate brought up some points about the concert_version. Rather than just piggyback the ros version, give the concert its own versioning.

Not being lame just piggybacking ros at a very small cost of not being instantly recognisable.
Actually would let us introduce concert breaking changes inside a ros release.
- Probably wouldn't do this very often
Version upgrades would happen anyway with ros middleware upgrades (it's a system breaking change).

jack-oquin commented 10 years ago

I agree that ROCON should have its own independent version name or number. We definitely want the ability to introduce a new version without waiting for a new ROS release.

The conductor (or some central authority) should check it with every client and service connecting.

But, if it must be the same for every component of the concert, I do not see why it needs to be part of the client name. Things like that should be implicit.

I do not have a good enough understanding of the app manager and its requirements. It appears that the OS and system information is primarily configuration information for the app manager. Doesn't the app manager know what version of ROS or opros it supports? Can each component figure out for itself what OS it is running on? Why does the app manager care what OS the client is running?

While asking naive questions, I wonder if it would be better to keep those details somewhere else, and not expose them in general user interfaces.

I can see a human wanting to request any available turtlebot, or even a specific one by name for tele-operation. But, I don't see why anyone would want to request a robot running some particular operating system. As long as it works and speaks some compatible ROS or opros message protocol, anything should do.

This persuades me that we should specify omitted fields in a pattern as matching anything, and reorder those fields to allow short requests for things users may actually care about:

rocon://concert_name/platform/client_name/system/os

This assumes that the system and OS really do need to be part of the name. It would be better if they didn't, but at least users could ignore them, making rather short requests with regular expressions for specifying alternatives:

An empty string implies rocon:///, which matches any resource in the current concert.
rocon:///turtlebot matches any turtlebot in the current concert.
rocon:///(turtlebot|segbot) matches any robot of either type.
rocon:///turtlebot/dude3 matches that specific turtlebot.
rocon:///.*/dude3 matches that specific robot, whatever its type.
rocon:////dude3 is probably equivalent, but not recommended due to poor readability.
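A rough sketch of how matching under this proposal could work, assuming each field of a request is treated as a regular expression and that empty or omitted fields match anything. The helper names `split_fields` and `matches`, and the 'hydro' and 'precise' values in the usage note, are only illustrative assumptions.

```python
import re

# Field order from the proposal: rocon://concert_name/platform/client_name/system/os
FIELDS = ("platform", "client_name", "system", "os")


def split_fields(uri):
    """Split a rocon URI into (concert_name, fields), padding omitted fields with ''."""
    uri = uri or "rocon:///"  # an empty string implies rocon:///
    prefix = "rocon://"
    if not uri.startswith(prefix):
        raise ValueError("not a rocon uri: %r" % uri)
    concert, _, path = uri[len(prefix):].partition("/")
    values = path.split("/") if path else []
    if len(values) > len(FIELDS):
        raise ValueError("too many fields in %r" % uri)
    values += [""] * (len(FIELDS) - len(values))
    return concert, dict(zip(FIELDS, values))


def matches(request, description):
    """True if every non-empty request field fully matches the described resource."""
    _, wanted = split_fields(request)
    _, actual = split_fields(description)
    return all(
        wanted[name] == "" or re.fullmatch(wanted[name], actual[name]) is not None
        for name in FIELDS
    )
```

For a resource described as `rocon:///turtlebot/dude3/hydro/precise`, each of the requests listed above returns True from `matches`, while `rocon:///segbot` does not. The concert name is ignored here because every example refers to the current concert.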

stonier commented 10 years ago

Wow, we've quite literally waded through this topic....finally getting close to the end of the tunnel though.

This module at the moment adopts the last format in this thread:

rocon://concert_name/hardware_platform/client_name/application_framework/operating_system#rapp_name

It handles the shortened versions - this is convenient for requesting and scheduling, looks a bit awkward for remocons and interactions (which is where the system and os needs to be used inside the concert, not the app manager), but that is ok since most of resource uri handling will be for requesting/scheduling. Perhaps later we can split the uri formats, but I'd like to see how this fares for a while.

The ebnf rules are established in a single yaml file and there's a python api and demo script for accessing them if anyone outside the rocon_uri module should ever need to.

The concert_version dalliance is out, with a constant in rocon_std_msgs/Strings.msg and variable in rocon_std_msgs/PlatformInfo.msg for reference by the conductor and anyone else who wants to introspect.

Rocon URI fields now have the following parsing patterns applied (note no regex - couldn't think of cases where we'd really use it):

concert_name : -
hardware_platform : OR operators via | and the wildcard *
client_name : OR operators via '|', wildcard * and trailing regex wildcard xxx*
- for matching dude1, dude_uuid, ...
application_framework : OR operators via | and the wildcard *
operating_systems : OR operators via | and the wildcard *
rapp_name : -

I'm rolling it out now - will see how things go with it, especially on the scheduler side.
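For anyone wanting a feel for the field patterns listed above, here is a minimal standalone sketch. It is not the rocon_uri implementation: `parse`, `field_matches` and `uri_matches` are hypothetical names, and treating an empty field as "matches anything" is an assumption of the sketch.

```python
FIELDS = ("hardware_platform", "client_name", "application_framework", "operating_system")


def parse(uri):
    """Split a rocon uri into (concert_name, fields, rapp_name); omitted fields become ''."""
    body, _, rapp_name = uri.partition("#")
    prefix = "rocon://"
    if not body.startswith(prefix):
        raise ValueError("not a rocon uri: %r" % uri)
    concert_name, _, path = body[len(prefix):].partition("/")
    values = path.split("/") if path else []
    if len(values) > len(FIELDS):
        raise ValueError("too many fields in %r" % uri)
    values += [""] * (len(FIELDS) - len(values))
    return concert_name, dict(zip(FIELDS, values)), rapp_name


def field_matches(pattern, value, allow_prefix=False):
    """Match a fixed value against '|' alternatives, '*' and (optionally) a trailing 'xxx*'."""
    if pattern == "":
        return True  # assumption: an empty field matches anything
    for alt in pattern.split("|"):
        if alt == "*" or alt == value:
            return True
        if allow_prefix and alt.endswith("*") and value.startswith(alt[:-1]):
            return True
    return False


def uri_matches(request, resource):
    """True if every field of the resource satisfies the request; only client_name allows 'xxx*'."""
    _, wanted, _ = parse(request)
    _, actual, _ = parse(resource)
    return all(
        field_matches(wanted[name], actual[name], allow_prefix=(name == "client_name"))
        for name in FIELDS
    )
```

For example, `uri_matches("rocon:///turtlebot|pr2/dude*/*/precise|quantal", "rocon:///turtlebot/dude1/hydro/precise")` is True, whereas `rocon:///turtle*` is rejected because the trailing wildcard is only honoured for client_name.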

jack-oquin commented 10 years ago

+1 Looks good.

I like using a well-known YAML file to specify the valid strings. We need to provide instructions similar to the rosdistro YAML files, so other groups can easily add their own. For example, UTexas BWI needs a "segbot" hardware_platform. If we are successful, there will be dozens of additional hardware platforms. If updates happen frequently, we'll probably need a validity test, similar to the one OSRF uses for rosdistro via Travis.

Even with the name hierarchy explicitly listed, I would still like to specify the valid characters using EBNF or a regexp. For example, we probably want to restrict identifiers to strings like ROS package names or Python identifiers. That is easy to do, and I'll volunteer to submit a definition some time soon. One reason to specify the valid identifiers explicitly is to leave us free to add more complex patterns in the future, if they prove useful.

Meanwhile we can continue to work with a shared, intuitive notion of the detailed lexical analysis. Parsers for languages too complex for regexp alone, generally start with a regexp-based lexical analysis step that produces tokens for higher-level analysis. That looks good for our application.

I still prefer .* instead of * for "matches any", because the Kleene star is too powerful a concept to sacrifice just to match trivial patterns. We should specify that an empty field also matches any valid value, and that elided fields are empty. I don't see much advantage to specifying the * or those other details in every stanza of the YAML file.

jack-oquin commented 10 years ago

I recommend reserving the term "EBNF" for the meta-language widely used in syntax documentation.

Let's call our specific relationship something like the "ROCON name hierarchy". Other suggestions are welcome.

EBNF could be used to describe the identifier hierarchy, but we have chosen to use YAML syntax, because that is better for our purposes. We may still want to use EBNF or regexp to specify the lexical elements (see: lexeme) for a tokenization step.

stonier commented 10 years ago

+1 to moving the wildcards out of the yaml syntax...done in https://github.com/robotics-in-concert/rocon_tools/commit/0ce96ece24a945ec90a1f870ffdd9ac2ef752109

stonier commented 10 years ago

Sometimes have to work hard to translate your lexicon of lexemes Jack! It keeps me entertained but at 2am looking at irregular programming characters like ~ is only going to have one effect...

I'll pick up on this again tomorrow.

jack-oquin commented 10 years ago

Please forgive the fancy jargon. I worked briefly in programming languages long ago.

The concept is simple: it is often helpful to detect and label lexical elements like identifiers ([a-zA-Z][a-zA-Z0-9_]*) and punctuation ([:/*|.()#]) before looking deeper into the meaning. Regular expressions are simple and efficient for scans like that.
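As a small illustration of that scan, here is a sketch of a regexp-based tokenizer, with the identifier class written as `[a-zA-Z][a-zA-Z0-9_]*`. The names `TOKEN_RE` and `tokenize` are made up for the example.

```python
import re

# One alternative per kind of lexical element; the group name becomes the token label.
TOKEN_RE = re.compile(r"(?P<identifier>[a-zA-Z][a-zA-Z0-9_]*)|(?P<symbol>[:/*|.()#])")


def tokenize(text):
    """Label each identifier and punctuation symbol in text, rejecting anything else."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError("unexpected character %r at position %d" % (text[pos], pos))
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("rocon:///turtlebot|pr2")` yields an identifier, four symbols, an identifier, a symbol and a final identifier, while a character outside both classes, such as `~`, raises a ValueError. Higher-level parsing can then work on the labelled tokens rather than raw characters.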

stonier commented 10 years ago

Nothing to forgive Jack. For me they are informational doorways to places which would take me quite some time to find in my own explorations.

I still prefer .* instead of * for "matches any", because the Kleene star is too powerful a concept to sacrifice just to match trivial patterns.

Not sure on this one. .* in regex world literally means anything. The driving concept for our fields isn't to literally imply absolutely anything but anything from that finite list represented in the yaml. In that case * feels like a more analogous fit given its use to list contents of a filesystem. I get your point though - that symbol itself is more useful in other ways. Are there other symbols more appropriate, or can a standalone star be discriminated for this purpose? (It is actually working like that now - the yaml is being pushed into a set of ebnf rules which accommodate both the standalone star meaning of our own design as well as the Kleene star meaning in the parsing logic.)

We should specify that an empty field also matches any valid value, and that elided fields are empty.

Definitely - these are both currently implemented.

Even with the name hierarchy explicitly listed, I would still like to specify the valid characters using EBNF or a regexp. For example, we probably want to restrict identifiers to strings like ROS package names or Python identifiers. That is easy to do, and I'll volunteer to submit a definition some time soon. One reason to specify the valid identifiers explicitly is to leave us free to add more complex patterns in the future, if they prove useful.

Not sure I closely follow this, or even loosely. Can you expand? I'm not sure what your exact definition of 'identifier' is here, and I think that when you mention 'valid characters' you are talking about the parsing specific characters (like our |). In part, are you suggesting that we formalise our parsing specific characters on a restricted set of the ebnf symbols?

I recommend reserving the term "EBNF" for the meta-language widely used in syntax documentation. Let's call our specific relationship something like the "ROCON name hierarchy". Other suggestions are welcome. EBNF could be used to describe the identifier hierarchy, but we have chosen to use YAML syntax, because that is better for our purposes. We may still want to use EBNF or regexp to specify the lexical elements (see: lexeme) for a tokenization step.

Yes, I'd planned on having a document explain all of this. And you're correct - I was too loosely using the term ebnf simply because I was pushing this yaml embedded information into an ebnf parser. The user of rocon uri's though, doesn't know this - all they will see is the limited set of operators and definitions we expose (even if we do continue using a full ebnf rule parser under the hood).

Probably be good to start that documentation on the roswiki as soon as the doc indexer runs through rocon_tools and exposes the rocon_uri package.

jack-oquin commented 10 years ago

Even with the name hierarchy explicitly listed, I would still like to specify the valid characters using EBNF or a regexp. For example, we probably want to restrict identifiers to strings like ROS package names or Python identifiers. That is easy to do, and I'll volunteer to submit a definition some time soon. One reason to specify the valid identifiers explicitly is to leave us free to add more complex patterns in the future, if they prove useful.

Not sure I closely follow this, or even loosely. Can you expand? I'm not sure what your exact definition of 'identifier' is here, and I think that when you mention 'valid characters' you are talking about the parsing specific characters (like our |). In part, are you suggesting that we formalise our parsing specific characters on a restricted set of the ebnf symbols?

Yes. Loosely based on the EBNF symbols example, I would define an identifier as they did:

identifier = letter , { letter | digit | "_" } ;

Of course, unlike their definition, lower-case letters would also be permitted. Instead of a long list of symbols, I am suggesting a shorter list, like this:

symbol = ":" | "/" | "*" | "(" | ")" | "#" | "|" | "."  ;

The main reason is to reserve other characters for possible future use.

Based on the URI generic syntax in Wikipedia, which we should verify with more official sources, a ROCON resource name could be defined somewhat like this at a high level:

resource = 'rocon', ':', description ;
description = [ '//', concert ], { '/', identifier } ;
request = 'rocon', ':', matches, '#', rapp ;
matches =  [ '//', concert ], { '/', pattern } ;
pattern = identifier | '(', pattern, ')' | pattern, '|', pattern | '*' ;  
rapp = [ package, '/' ], identifier ;

That is not exactly right, but perhaps gives a general idea. I am not sure the parentheses are needed in a pattern, but include them as an example. It is a design choice whether to explicitly list all the components of a description at the top level, or handle it elsewhere.
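To show that the pattern rule is easy to handle top-down, here is a sketch of a tiny recursive-descent parser. It assumes the left-recursive alternation is rewritten as `pattern = term , { "|" , term }` with `term = identifier | "*" | "(" , pattern , ")"`, and it returns the set of alternatives a pattern names. The function names are hypothetical.

```python
import re

IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")


def parse_pattern(text):
    """Return the set of alternatives named by one field pattern."""
    alternatives, pos = _alternation(text, 0)
    if pos != len(text):
        raise ValueError("unexpected %r at position %d" % (text[pos], pos))
    return alternatives


def _alternation(text, pos):
    # pattern = term , { "|" , term }
    result, pos = _term(text, pos)
    while pos < len(text) and text[pos] == "|":
        more, pos = _term(text, pos + 1)
        result |= more
    return result, pos


def _term(text, pos):
    # term = identifier | "*" | "(" , pattern , ")"
    if text.startswith("*", pos):
        return {"*"}, pos + 1
    if text.startswith("(", pos):
        inner, pos = _alternation(text, pos + 1)
        if not text.startswith(")", pos):
            raise ValueError("missing ')' at position %d" % pos)
        return inner, pos + 1
    match = IDENTIFIER.match(text, pos)
    if match is None:
        raise ValueError("expected an identifier at position %d" % pos)
    return {match.group()}, match.end()
```

Here `parse_pattern("turtlebot|(segbot|pr2)")` gives the three platform names as a set, which suggests the parentheses add nothing beyond grouping for patterns this simple.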

Regarding the use of '*', why not just use the identifier 'any' for that purpose (along with an empty pattern)?

There seem to be several incompatible versions of EBNF floating around. The github markdown highlighter differs from the Wikipedia definition, which is not even consistent within that page. Since the whole point is to be precise and unambiguous, I find that annoying.

stonier commented 10 years ago

Got it. I like the idea of expressing the different use cases - request, resource, interaction, rapp dependency.

I also thought of using 'any' but figured that was strange. Though if both of us thought of it maybe not. I implemented it here:

https://github.com/robotics-in-concert/rocon_tools/commit/3cad70f34bf988c3bce941ddcae301f476fd8519

but reverted it back to '*' for now. Looking at `rocon:///pc/any/hydro/precise|quantal|raring` as a rocon interaction filtering requirement is far less instantly recognisable in comparison to `rocon:///pc/*/hydro/precise|quantal|raring`.

I think we can make use of a standalone '*' and still make use of the * in more powerful ways (currently already doing so under the hood). Even though this may risk the ire of the computer science fraternity, I think this is more user friendly for people who'll use these.

jack-oquin commented 10 years ago

The '*' does stand out better visually as a pattern to match.

As you say, a leading '*' can be handled as a special case, because there is no preceding element to be repeated. I suppose '**' would then signify zero or more instances of a pattern that matches anything.

jack-oquin commented 10 years ago

In the proposed scheduler_msgs/Resource, I suggest changing the name field to rapp.

I don't really think of the uri as a "hint". Given recent discussions, I would describe "description" and "request" URI's as separate but syntactically related object types, with the "description" functioning as a response to a "request".

When we integrate these changes, I would also like to add CurrentStatus and KnownResources to the new scheduler_msgs branch, except that the CurrentStatus message should use the same uri field name as the Resources message.

I'll submit a pull request to your platform_tuple_overhaul branch, so you can see exactly what I am suggesting.

robotics-in-concert / rocon

Rocon Resource Universal Identifiers [Platform Tuples] #7