Restaurant command standardisation

ARTenshi commented 4 years ago

We should clearly define what a complete task is and how a command should be given. This year, a team asked the user if he wanted some drinks. After taking the order, the robot asked "Do you want something else?" and the answer was "No" [we should note that the robot has two hands and is able to bring two items at the same time]. Then the robot brought the drink and searched for a new user's request. The same user raised his hand and now asked for some food; the robot completed the order and waited for a new user's request. At this point, the robot was given half of the points on each order, because a complete order consists of bringing food and/or drink to a user, as "Orders have between one and three objects randomly chosen." In this particular case, the order consisted of food and drink, but the user, when asked "Do you want something else?", answered "No" and then raised his hand a second time to have the order completed. It was argued that, due to cultural differences, when asked "Do you want something else?" the user was thinking of some more drinks, and he was expecting to be asked "Do you want something to eat?". This might be possible, and we should highlight/standardise this in the rulebook.

However, the rulebook establishes that: "Robots can choose to take several orders and place them later on, place an order and pick the next one while the former is being served, or dispatch one order at a time." It doesn't mention that a user can decide that, so if the order was partially completed, we should regulate that the user doesn't raise his/her hand a second time to have his/her order completed and rather a second user should raise his/her hand. When orders are partially completed, the same user shouldn't raise his/her hand unless a new order is going to be placed and granted all points for that new order if completed. If not, false/wrong-object deliveries can be considered similar and a user that received a false/wrong-object delivery would be allowed to raise his/her hand until the user considers that his/her order has been completed correctly. We shouldn't forget that it is a competition and some basic regulations should be established for fair evaluation among teams.

YuqianJiang commented 4 years ago

I agree that in this particular example, the rules were ambiguous. Generally I think interactions in the Restaurant test are supposed to be open-ended, and we should not over-standardize what users/robots say.

One way to address this issue is that the referee and users can agree on the orders before each test, and score based on those orders. If a user ended up with all the items in his/her order, then that one order is considered complete. The argument is that regular restaurant customers only care about getting the items they want. In past tests, the rules did not discourage unnatural behaviors, and it became ambiguous what each order was. When speech recognition failed on some items, users just kept changing the order until the robot understood. When robots could not recognize the waving of most users, the recognizable users made multiple orders to use all the test time.
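A minimal sketch of that scoring idea, assuming the referee keeps a list of the items agreed on beforehand and a list of what actually arrived at the table. The function name and the example items are hypothetical and not taken from the rulebook:

```python
from collections import Counter


def order_complete(agreed_items, delivered_items):
    """Return True if every agreed item arrived at least as often as it was ordered."""
    # Counter subtraction keeps only positive counts, so `missing` holds
    # exactly the agreed items that were not (or not often enough) delivered.
    missing = Counter(agreed_items) - Counter(delivered_items)
    return not missing


# Hypothetical example: the customer agreed on one drink and one snack.
agreed = ["cola", "chips"]
after_first_run = ["cola"]
after_second_run = ["cola", "chips"]
```

Under this check it would not matter whether the items arrive in one run or two, only whether the customer ends up with everything in the agreed order.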

If we keep the current rules (users are allowed to come up with more orders), I think it's reasonable to add something like what @ARTenshi suggested:

"When orders are partially completed, the same user shouldn't raise his/her hand unless a new order is going to be placed and granted all points for that new order if completed."

johaq commented 4 years ago

I do not understand. You completed one customer's entire order in two runs, which is allowed in the rules, and you received points for one full order (is this what you mean by half points for two orders?). I don't see the problem with this. In a restaurant I often order drinks first and then food later on.

ARTenshi commented 4 years ago

The problem here is that the person decided that his/her order was not completed, not the robot. If the robot was going to complete the order in two runs, it should approach the person itself without a waving request (or whatever strategy the team decides). If the robot misunderstands the order, only partially hears the order, drops an object or performs a false delivery, or does whatever a user may consider an incomplete performance of the order, he/she shouldn't be allowed to raise his/her hand to have the order completed as desired (partial orders are penalised anyway) once the robot announces that the order (as the robot understood/performed it) was completed and announces that it is looking for a new request (unless, of course, a full new order is going to be placed by the same user).

johaq commented 4 years ago

Ok. This sounds to me like an unlucky misunderstanding, which unfortunately happens. I think the point of the restaurant task is to have natural interactions as much as possible, since it is outside the arena setup and involves interacting with lay people. I would be ok with adding something like "the customer will state their entire order once asked" but not a command generator à la GPSR. Do you want something like the former or an entire generator?

kyordhel commented 4 years ago

Guys, stop thinking for a moment like specialists in Robotics and think like the owner of a restaurant who just purchased a Pepper to act as a waiter.

It is clear that this naive soul expects the robot to naturally interact with all customers who have no training whatsoever in the same way a human being would do. It is in the spirit of the test to use a real restaurant with real customers and the robot should be able to fetch their orders (has happened in the past). I don't think that adding a generator or standardizing the test is in the right direction (@justinhart you're also in HRI and NLP, so please step in).

Being a waiter doesn't need huge brains and even people with limited use of the language can do the job. If you want to include additional materials like a printed menu (which has never been done before but is not against the rules) on which the client can pick by pointing or choosing a menu number, do it. If you want to buff the rules to make it easier to score, do it. What is not allowed is to constrain the interaction.

ARTenshi commented 4 years ago

This issue is more about when to mark the start and end of an order. Start: the robot announces that it is searching for a request and approaches a user who called it. End: the order is completed and/or the robot says that the order has been completed.

What happens in between is open to the team.
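As a rough illustration of those two markers, here is a sketch of how a referee's event log could be split into orders. It is bookkeeping for scoring only and says nothing about how the dialogue should go. The event kinds "approach" and "announce_done", and the log format itself, are assumptions made for this example:

```python
def split_into_orders(events):
    """Group logged (kind, detail) events into orders using the robot's announcements."""
    orders, current = [], None
    for kind, detail in events:
        if kind == "approach":
            # Start marker: the robot approaches a user who called it.
            current = {"user": detail, "events": []}
        elif kind == "announce_done" and current is not None:
            # End marker: the robot says the order has been completed.
            orders.append(current)
            current = None
        elif current is not None:
            # Whatever happens in between is left to the team.
            current["events"].append((kind, detail))
    return orders


# Hypothetical log of the situation described at the top of this issue.
log = [
    ("approach", "table 2"),
    ("take_order", "cola"),
    ("deliver", "cola"),
    ("announce_done", None),
    ("approach", "table 2"),
    ("take_order", "chips"),
    ("deliver", "chips"),
    ("announce_done", None),
]
orders = split_into_orders(log)
```

With these markers the log above splits into two separate orders from the same table, which is the reading that led to half points on each.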

The reason for raising this issue is that the same user should not be able to call the robot again to complete any unsuccessful or partially successful order once the robot announces that it considers the order completed and that it is looking for a new user (an extreme case may be unhappy clients who won't let the robot attend to any other customer until they are completely satisfied).

justinhart commented 4 years ago

As a principle of how we should address rules changes, we shouldn't make rules around one thing that happened one time to one team except in extreme cases.
Following this, it is easy for the team in question to change their software to address this specific scenario. The teams can react to this in the way that they structure the dialog that the robot says. "Do you want any more drinks? Would you like any food?" If the team had said, "Does that complete your order?" then the command would have been unambiguous.
I think that we can call this "unfortunate" but that any rule change specifically dictating how the judges are to behave will lead to state-machine style interactions. If the robot is programmed with a next step dictating how the person should behave, then why not get rid of the person altogether and simply turn this into a pick-and-place task of sticking food items onto the tables?

ARTenshi commented 4 years ago

Minimum regulations should be added for fair evaluation; otherwise I would call Restaurant an Open Demo instead and evaluate accordingly.

Defining when an order's execution starts and when it ends seems a reasonable rule for objective evaluation. At the end, learning from past events is called experience.

justinhart commented 4 years ago

"At the end, learning from past events is called experience."

Yes, but the team that got burned could also learn from this experience and simply change their dialog.

ARTenshi commented 4 years ago

That's true. In summary, the proposal is that once a robot announces (in whichever way the team decides) that the order has been completed and it is looking for a new user/order, the previous user is not able to request the same order if it has not been completed (it will be penalised if it hasn't been completed).

If the robot clearly asks for a new request and the same user raises his/her hand to request part of the previous order to be completed, there is no way that the robot (or a human) knows that it is not a new order but the previous order repeated without asking questions that may be time-consuming and then decide that it's better to attend a full new order than complete the previous one.

I mean, a robot can easily announce that "I will wait for a new user. If a previous user is unhappy with their previous order, please refrain from asking me again, as I consider that order completed. Sorry for the inconvenience.", but it's unnatural; however, if most of the TC considers it the right path for fair evaluation, then I won't have more comments on this.

justinhart commented 4 years ago

The other refs raise their hands to provide many opportunities for the robot to take new orders, right? It gives several locations for the robot to see the hands waving, so that tends to be for the team's benefit.

So now the question becomes, whether or not it was reasonable to believe that the order was complete. I'd say that it was reasonable to expect that the order was complete, and that maybe the team should have been given full points. That seems like a judgment call.

The question at hand is not whether what happened was fair. It's whether a specific rule should be written to address a thing that happened a single time in a competition.

Please don't appeal to the absurd. I'm not saying that you need to go through a lengthy dialog in order to establish whether or not the order is complete. Simply asking, "Does that complete your order?" before driving away would satisfy the criteria.

ARTenshi commented 4 years ago

I am just saying that if the robot asks for a new request and a user raises his/her hand, a robot should expect that the user will place a new order. Otherwise, we should add in the rulebook that a user can raise his/her hand to ask for his/her order to be completed satisfactorily and teams should be aware of that.

justinhart commented 4 years ago

I think that that's fair, but what does that mean about the state of the previous order? If the previous order is incomplete, then the result is the same, even if the new order is considered to be a new order. Otherwise, you have to consider the previous order to have been a complete order, but if the ref didn't get out all of the items that they intended to, then you end up with the same conundrum as before.

ARTenshi commented 4 years ago

Exactly my point: when should a referee consider an order completed? When it has been taken by the robot (all the items), when it has been executed (partially or fully), or when the robot announces that it has been completed (or that it can't complete it) and that it's waiting for a new order? That's the only part of command standardisation I aimed to discuss; the HRI in between depends on each team's strategy.

justinhart commented 4 years ago

It's been left to a judgement issue in the past. I would just be careful that whatever rule is made doesn't reduce the HRI to another state machine. In the past, the rules have provided too much structure defining interactions, thus enabling teams to bypass the need to do real HRI.

I would be happy with any ruling that does not provide teams with a way to basically just guess the ref's intent based on the rule itself.

ARTenshi commented 4 years ago

Totally agree on that; it is especially interesting in unstructured environments like those proposed in the Restaurant task. However, I consider that a clear mark between orders will help to evaluate this task better. I would suggest that when a robot announces that it has completed the order (or cannot complete the order) and that it is waiting for a new order, a new command is given. We can make it optional like:

Suggested new rule: "If the robot announces that the current order has been completed (or can not be completed) and it is waiting for a new user, a new command will be provided while the previous order will be considered finished and the robot can not go back to that order."

I think that it doesn't prevent a robot from asking the current user, before announcing the end of the command, "Do you want something else?" (depending on each team's strategy and HRI) to take the missing parts of the order, if any, while it reduces ambiguity between orders.

justinhart commented 4 years ago

That sounds fine to me.

kyordhel commented 4 years ago

I just want to stress that restaurant is a Stage II test, (arguably) the most difficult in the whole rulebook because:

1. Occurs in the wild
2. Anyone can interact with the robot

This means: no rules, just bloodshed. A successful waiter takes and delivers orders. Advanced features include cleaning tables, issuing bills/notes, and charging customers, as well as answering questions about the menu.

Our test is about fetching and delivering orders. If I get my cheeseburger with fries and diet coke, the robot scores, otherwise, I call the manager. I'm not supposed to read a manual for that.

johaq commented 4 years ago

I'm with @kyordhel on this. I think TC/OC should give instructions to volunteers like:

1. This is how you can get the robot's attention
2. This is what you are supposed to order

Everything else is up to the robot.

ARTenshi commented 4 years ago

That's not under discussion and it doesn't change. The point is just to state in the rulebook whether or not "2. This is what you are supposed to order" includes an unsatisfied user requesting the same order again after the robot has announced that the user's order was completed and that it is waiting for a new order.

Something like: "Unhappy users are able to raise their hand and ask for their order to be completed successfully." That way, teams can design a strategy that might include not revisiting previous users, because the probability of receiving a new order and an old order is the same.

ARTenshi commented 4 years ago

Well, now that we know that this situation might happen, the question becomes: Should we prevent it from happening? Should we allow it to happen but make it clear in the rulebook? Should we allow it to happen and only let (at least) the teams following this issue act accordingly?

johaq commented 4 years ago

I think at this point you need to make a concrete suggestion of what you'd add to the rulebook (a PR, probably). I think it was clear we do not want a command grammar. If you think you can add something to the task description that makes refereeing easier and clearer, we can discuss that in a PR.

ARTenshi commented 4 years ago

Ok. Again, it is not about how the robot asks or receives a command, that depends on each team, but when a user is able to repeat an old or give a new command. As mentioned before, the proposals are:

Suggested 1: "If the robot announces that the current order has been completed (or that it cannot be completed) and that it is waiting for a new user, a new command will be provided while the previous order will be considered finished and neither the robot nor the user can go back to that order."

Suggested 2: "Unhappy users are able to raise their hand and ask for their order to be completed successfully when the robot is looking for new users/orders."

Suggested 3: Nothing. *If it is "nothing", case two is the unspoken rule.
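To make the difference between the proposals concrete, here is a small sketch of how a referee might treat a raised hand after the robot has asked for a new order. The names `RepeatPolicy` and `classify_request`, and the returned strings, are hypothetical and only illustrate the two options; nothing here is rulebook text:

```python
from enum import Enum


class RepeatPolicy(Enum):
    CLOSED_IS_FINAL = 1  # Suggested 1: a closed order cannot be reopened
    ALLOW_RETRY = 2      # Suggested 2 (and the unspoken rule under Suggested 3)


def classify_request(user, wants_retry, closed_orders, policy):
    """Classify a raised hand once the robot is looking for a new order.

    `closed_orders` maps a user to True if their last closed order was
    fully delivered and to False if it was closed incomplete.
    """
    had_incomplete = closed_orders.get(user) is False
    if wants_retry and had_incomplete:
        if policy is RepeatPolicy.ALLOW_RETRY:
            return "retry of previous order"
        return "not allowed: previous order is finished"
    # Anyone, including a previous user, may always place a full new order.
    return "new order"


# Hypothetical state: table 2's order was closed with an item missing.
closed = {"table 2": False}
```

Under both policies the same user can still place a completely new order; the policies only differ in whether a closed, incomplete order can be picked up again.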

johaq commented 4 years ago

No command standardization. Robot needs to guide the dialogue.

RoboCupAtHome / RuleBook

Restaurant command standardisation #712