cloudyr / MTurkR

R Client for the MTurk Requester API
https://cloud.r-project.org/package=MTurkR

Answer results are lost for multiple Selections in SelectionAnswers #63

Closed harmsk closed 9 years ago

harmsk commented 9 years ago

If I have a question that allows multiple selections within a SelectionAnswer, MTurk returns the results delimited by |, for example A|B|C. However, if I use GetAssignments with MTurkR, only the first selection is returned, for example A.

leeper commented 9 years ago

Thanks for this. I will investigate.

UPDATE: Confirmed with the following toy example:

question <- "<QuestionForm xmlns=\"http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd\">\n  <Question>\n    <QuestionIdentifier>question1</QuestionIdentifier>\n    <IsRequired>true</IsRequired>\n    <QuestionContent>\n      <Text>Are you registered to vote?</Text>\n    </QuestionContent>\n    <AnswerSpecification>\n      <SelectionAnswer>\n        <StyleSuggestion>checkbox</StyleSuggestion>\n        <Selections>\n          <Selection>\n            <SelectionIdentifier>1</SelectionIdentifier>\n            <Text>Yes</Text>\n          </Selection>\n          <Selection>\n            <SelectionIdentifier>2</SelectionIdentifier>\n            <Text>No</Text>\n          </Selection>\n        </Selections>\n      </SelectionAnswer>\n    </AnswerSpecification>\n  </Question></QuestionForm>"
# Keep the return value so the HITId can be passed to GetAssignments() afterwards
h <- CreateHIT(title = "test", description = "test", reward = 0.01, expiration = seconds(1), duration = seconds(1), question = question)

The response is as follows:

GetAssignments(hit = h$HITId)
                    AssignmentId       WorkerId                          HITId AssignmentStatus     AutoApprovalTime           AcceptTime           SubmitTime
1 3S4AW7T80BIAKTD2I5SGFBR76L0L4A A1RO9UJNWXMU65 3BVS8WK9Q0VPS7SBME0O6P92J4OIBZ        Submitted 2015-02-02T09:55:00Z 2015-01-03T09:54:54Z 2015-01-03T09:55:00Z
  ApprovalTime RejectionTime RequesterFeedback ApprovalRejectionTime SecondsOnHIT question1
1         <NA>          <NA>              <NA>                    NA            6         1

MTurkR is simply not parsing the second SelectionIdentifier element in the raw response, which means the problem is in as.data.frame.QuestionFormAnswers:

<?xml version="1.0"?>
<GetAssignmentsForHITResponse><OperationRequest><RequestId>fcf15a5c-dc1b-464b-a368-8e7cd5f8dad3</RequestId></OperationRequest><GetAssignmentsForHITResult><Request><IsValid>True</IsValid></Request><NumResults>1</NumResults><TotalNumResults>1</TotalNumResults><PageNumber>1</PageNumber><Assignment><AssignmentId>3S4AW7T80BIAKTD2I5SGFBR76L0L4A</AssignmentId><WorkerId>A1RO9UJNWXMU65</WorkerId><HITId>3BVS8WK9Q0VPS7SBME0O6P92J4OIBZ</HITId><AssignmentStatus>Submitted</AssignmentStatus><AutoApprovalTime>2015-02-02T09:55:00Z</AutoApprovalTime><AcceptTime>2015-01-03T09:54:54Z</AcceptTime><SubmitTime>2015-01-03T09:55:00Z</SubmitTime><Answer>&lt;?xml version="1.0" encoding="UTF-8" standalone="no"?&gt;
&lt;QuestionFormAnswers xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionFormAnswers.xsd"&gt;
&lt;Answer&gt;
&lt;QuestionIdentifier&gt;question1&lt;/QuestionIdentifier&gt;
&lt;SelectionIdentifier&gt;1&lt;/SelectionIdentifier&gt;
&lt;SelectionIdentifier&gt;2&lt;/SelectionIdentifier&gt;
&lt;/Answer&gt;
&lt;/QuestionFormAnswers&gt;
</Answer></Assignment></GetAssignmentsForHITResult></GetAssignmentsForHITResponse>
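
For illustration, here is a minimal sketch of how several SelectionIdentifier values in one Answer block could be collected and joined into a single string. This is not the actual MTurkR code: `collapse_selections` is a hypothetical helper, it assumes the Answer XML has already been unescaped, and it uses base R regular expressions rather than a real XML parser.

```r
# Hypothetical helper, not part of MTurkR: collect every SelectionIdentifier
# in one (already unescaped) <Answer> block and join the values into one string.
collapse_selections <- function(answer_xml, sep = ";") {
  # Find all <SelectionIdentifier>...</SelectionIdentifier> elements
  hits <- regmatches(
    answer_xml,
    gregexpr("<SelectionIdentifier>[^<]*</SelectionIdentifier>", answer_xml)
  )[[1]]
  # Strip the opening and closing tags, leaving only the identifier values
  ids <- gsub("</?SelectionIdentifier>", "", hits)
  paste(ids, collapse = sep)
}

# Same structure as the Answer block in the response above
answer <- paste0(
  "<Answer>",
  "<QuestionIdentifier>question1</QuestionIdentifier>",
  "<SelectionIdentifier>1</SelectionIdentifier>",
  "<SelectionIdentifier>2</SelectionIdentifier>",
  "</Answer>"
)
result <- collapse_selections(answer)
print(result)
```

Reading only the first matching element, instead of all of them, would reproduce the symptom reported here.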

leeper commented 9 years ago

@harmsk I think this should now be fixed. Install the latest version from GitHub and let me know if you still encounter problems.

You should see something like:

GetAssignments(hit = "3BVS8WK9Q0VPS7SBME0O6P92J4OIBZ")
1 of 1 Assignments Retrieved
                    AssignmentId       WorkerId                          HITId AssignmentStatus     AutoApprovalTime           AcceptTime           SubmitTime
1 3S4AW7T80BIAKTD2I5SGFBR76L0L4A A1RO9UJNWXMU65 3BVS8WK9Q0VPS7SBME0O6P92J4OIBZ        Submitted 2015-02-02T09:55:00Z 2015-01-03T09:54:54Z 2015-01-03T09:55:00Z
  ApprovalTime RejectionTime RequesterFeedback ApprovalRejectionTime SecondsOnHIT question1
1         <NA>          <NA>              <NA>                    NA            6       1;2

Multiple selections for a question are now returned semicolon-separated in a single column. This was always the intended behavior, but the code wasn't correctly detecting when there were multiple selections and so returned only the first one (as you were seeing).
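
If you need the individual selections back, something like the following should work. It is only a sketch: the `assignments` data frame is a made-up stand-in for what GetAssignments() returns, with invented AssignmentId values, and the ";" separator is taken from the output above.

```r
# Toy stand-in for the data frame returned by GetAssignments();
# the AssignmentId values here are invented for illustration.
assignments <- data.frame(
  AssignmentId = c("A1", "A2"),
  question1 = c("1;2", "1"),
  stringsAsFactors = FALSE
)

# Split each semicolon-separated answer into a character vector of selections
selections <- strsplit(assignments$question1, ";", fixed = TRUE)

# Flag the assignments in which selection "2" was ticked
chose_2 <- vapply(selections, function(x) "2" %in% x, logical(1))
```

Here `selections` is a list with one character vector per assignment, so it also handles workers who ticked only one box.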

harmsk commented 9 years ago

Thank you. This works great.