RoboCupAtHome / RuleBook

Rulebook for RoboCup @Home 2024
https://robocupathome.github.io/RuleBook/

Proposal: Integrate Qualitative Survey Metrics into RoboCup@Home Scoring System #827

Open juandpenan opened 11 months ago

juandpenan commented 11 months ago

Is your idea/suggestion related to a problem? Please describe.

Describe the solution you'd like

I propose introducing a qualitative metric score to the competition, derived from a survey inspired by previous research. The survey consists of 17 questions that assess the robot's perceived social intelligence during its performance. A preliminary version of the survey is available here.

At the beginning of the survey, participants will indicate the team and task they are evaluating. Each competing team will be responsible for gathering at least five individuals to complete the survey. These individuals must be presented to the referee prior to each test. Both referees and volunteers are permitted to participate in the survey.

The survey scores, which range from 16 to 80, are automatically compiled in this spreadsheet. The intent is to add these scores to the overall competition scores. To maintain the authenticity of responses, the survey will only be accessible during the task periods.
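To make the proposed aggregation concrete, here is a minimal sketch in Python of how per-respondent survey totals might be combined with a team's competition score. The per-question 1-5 Likert scale, the averaging across a team's respondents, and all function and variable names below are my assumptions for illustration; the proposal only states that survey totals fall between 16 and 80 and are summed with the competition score.

```python
from statistics import mean

# Assumption: each respondent answers every question on a 1-5 scale, so one
# completed survey yields a total in the stated 16-80 range. How multiple
# respondents for the same team/task are combined is not specified in the
# proposal; averaging them is assumed here for illustration.

def survey_total(answers: list[int]) -> int:
    """Sum one respondent's per-question answers (assumed 1-5 each)."""
    assert all(1 <= a <= 5 for a in answers), "answer outside assumed 1-5 range"
    return sum(answers)

def team_survey_score(respondent_answers: list[list[int]]) -> float:
    """Average the survey totals of all respondents for one team and task."""
    return mean(survey_total(a) for a in respondent_answers)

def combined_score(competition_score: float,
                   respondent_answers: list[list[int]]) -> float:
    """Add the averaged survey score to the team's competition score."""
    return competition_score + team_survey_score(respondent_answers)

# Illustrative numbers only: two respondents answering 16 questions each.
if __name__ == "__main__":
    answers = [[4] * 16, [5] * 16]          # totals: 64 and 80
    print(combined_score(350.0, answers))   # 350 + 72 = 422.0
```

Averaging rather than summing respondent totals would keep a team's survey contribution bounded regardless of how many people it recruits, though that choice is not part of the original proposal.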

Describe alternatives you've considered

MatthijsBurgh commented 8 months ago

I am not sure we should introduce such subjective scoring.

If we do this, we should have the same set of people score all teams.

johaq commented 8 months ago

So on the one hand I'm very hesitant about introducing this kind of subjective scoring; on the other hand, I do agree that the score often does not feel reflective of a robot's performance. Maybe we could try this evaluation just as a test in Eindhoven to get a feel for how the scoring looks. Though we might not get an accurate picture, because teams won't adapt their approach if they aren't actually scored on it.

hawkina commented 8 months ago

I like the general idea, but scoring this is very subjective: culture and environment play a huge role in what we perceive as polite and what we do not, so giving points for this seems really difficult.

I do agree that the robot's cognitive tasks are currently not really scored at all and would like to change that, but I am unsure how this could be done. I've seen other competitions hold presentations about their approaches, which we kind of have in the poster sessions, so maybe cognitive/knowledge tools could be evaluated there?

There could also be new challenges implemented to evaluate these aspects in particular.