Make it explicit that users can box the images without transcribing anything if they prefer. The text might say: 'Step X: Draw boxes/circles around any photos, sketches or images you see. You can just box the images if you would prefer not to transcribe.'