microsoft / P.808

This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Amazon Mechanical Turk as the crowdsourcing platform. It includes implementations for Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR).
MIT License

Balanced block design gets reshuffled in ccr test. #64

Open joelsprunger opened 1 year ago

joelsprunger commented 1 year ago

I ran a CCR study with 20 conditions using the "balanced block" design. It correctly created the rows with 20 columns each and appeared to place the 20 conditions correctly in each row. However, before it finishes building the dataframe, it shuffles everything, so the result is the same as with the "random" design. This should be fixed. I can fix it when I have time this week.

The offending line is line 382 of create_input.py:

random.shuffle(full_trappings)

joelsprunger commented 1 year ago

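This line should only run for the random design. For balanced block, though, you may want to shuffle along the column axis instead, so that the order within each row varies while every row still contains each condition. I will look into this later this week.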
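A minimal sketch of the fix proposed above, not the actual code in create_input.py. The function name `order_blocks`, the design strings, and the list-of-rows layout are assumptions made for illustration. It shuffles globally only for the random design, and shuffles within each row for balanced block.

```python
import random


def order_blocks(blocks, design, rng=None):
    """Return a reordered copy of `blocks` (a list of rows of conditions).

    Hypothetical helper: the names and the design strings are assumptions,
    not the identifiers used in create_input.py.
    """
    rng = rng or random.Random()
    rows = [list(row) for row in blocks]  # copy so the input is not mutated

    if design == "random":
        # Global shuffle: conditions may move between rows, so a row is
        # no longer guaranteed to contain every condition exactly once.
        width = len(rows[0]) if rows else 0
        flat = [cond for row in rows for cond in row]
        rng.shuffle(flat)
        return [flat[i:i + width] for i in range(0, len(flat), width)]

    if design == "balanced_block":
        # Shuffle along the column axis only: each row keeps the same set
        # of conditions, just in a different order.
        for row in rows:
            rng.shuffle(row)
        return rows

    raise ValueError(f"unknown design: {design!r}")
```

With this split, a balanced block row still holds all 20 conditions after shuffling, while the random design remains free to mix conditions across rows.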