jaakkopasanen / ABX

Web app for AB and ABX listening tests
MIT License
53 stars 1 fork

JND tests #8

Open turian opened 3 years ago

turian commented 3 years ago

I would also like to run JND tests: Are audio A and B the same or different audio?

jaakkopasanen commented 3 years ago

Thanks for the idea! How would this work in practice? An ABX test already answers whether the listener can distinguish between the two samples, but I'm not sure how a just-noticeable difference test would be administered in audio. I suppose that would already be possible by creating multiple ABX tests in the same test suite with increasing difference. Other than that, I think this might require some interactive elements.

turian commented 3 years ago

You would simply present audio A and B and ask them to click "same" or "different".

In our experiments, the annotation instructions are important here.

jaakkopasanen commented 3 years ago

What's the benefit over an ABX test? I would imagine you don't want to trust the test subject's own evaluation, but rather run an ABX test, which tells whether the listener can tell a difference or not. With audio it's extremely easy to imagine differences where there are none. In fact, the whole hi-fi industry is based on this fallacy.

turian commented 3 years ago

Yes, agreed on the hi-fi industry.

ABX tests and 4IAX tests can reduce bias, but they also require repeated labeling because there is a 50% chance of choosing the correct answer by guessing.
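To illustrate why a two-choice test needs repeated trials, here is a minimal sketch of the exact one-sided binomial calculation: the probability of getting at least a given number of answers right by pure guessing. The function name guess_p_value and the example counts are assumptions made for illustration, not part of this app or of the experiments described above.

```python
from math import comb


def guess_p_value(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Probability of at least `correct` right answers in `trials` by guessing.

    Hypothetical helper: exact one-sided binomial tail, assuming independent
    trials that each have probability `p_guess` of a lucky correct answer.
    """
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# A single correct answer says nothing: a guesser gets it half the time.
print(guess_p_value(1, 1))

# 9 correct out of 10 is much harder to reach by guessing (11/1024).
print(guess_p_value(9, 10))
```

With one trial the value is 0.5, so a single label cannot separate a listener who hears the difference from one who guesses; only repeated trials bring the value down.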

We are doing JND experiments where:

a) We discard labels by annotators who don't match a secret gold set.
b) After this annotator filtering, we have done measurements to see that FPs = FNs, so there is no systematic bias towards hearing a difference.
c) We are using machine learning to detect label noise and interannotator bias.
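Steps a) and b) could be sketched roughly as follows. Everything here is an assumption made for illustration: the data layout (annotator name mapped to trial id mapped to "same" or "different"), the 0.8 accuracy threshold, the function names, and the convention that a false positive means answering "different" when the truth is "same". It is not the actual pipeline used in these experiments.

```python
def gold_accuracy(labels: dict, gold: dict) -> float:
    """Fraction of gold trials this annotator labeled correctly."""
    scored = [tid for tid in gold if tid in labels]
    if not scored:
        return 0.0
    return sum(labels[tid] == gold[tid] for tid in scored) / len(scored)


def filter_annotators(all_labels: dict, gold: dict, min_accuracy: float = 0.8) -> dict:
    """Keep only annotators whose gold-set accuracy reaches the threshold."""
    return {
        name: labels
        for name, labels in all_labels.items()
        if gold_accuracy(labels, gold) >= min_accuracy
    }


def false_pos_neg(all_labels: dict, truth: dict) -> tuple:
    """Count (false positives, false negatives) on trials with known truth.

    Assumed convention: positive = the annotator answered "different".
    """
    fp = fn = 0
    for labels in all_labels.values():
        for tid, answer in labels.items():
            if tid not in truth:
                continue
            if answer == "different" and truth[tid] == "same":
                fp += 1
            elif answer == "same" and truth[tid] == "different":
                fn += 1
    return fp, fn


# Made-up example data: two gold trials and one ordinary trial.
gold = {"g1": "same", "g2": "different"}
all_labels = {
    "ann1": {"g1": "same", "g2": "different", "t1": "different"},
    "ann2": {"g1": "different", "g2": "different", "t1": "same"},
}

kept = filter_annotators(all_labels, gold)
print(sorted(kept))                      # annotators that pass the gold check
print(false_pos_neg(all_labels, gold))   # counts before filtering
print(false_pos_neg(kept, gold))         # counts after filtering
```

Comparing the two counts after filtering is one simple way to check for a systematic tendency to hear a difference that is not there.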

Also, we are comparing JND to 4IAX to see which is cheaper per label, factoring in the cost for repeated labeling, and time.

It would be useful to be able to use your package to do our JND tests, with a "use this at your own experimental methodology risk" note in the docs.

jaakkopasanen commented 3 years ago

That's very interesting. Am I correct in assuming the test would have two audio samples and the app randomly picks two, so it can be A vs A, A vs B, B vs A or B vs B?

turian commented 3 years ago

Actually it's AB vs BA.

If you want to include AA or BB (we sometimes do that for control), the annotation CSV/YAML has that.
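A minimal sketch of how such same/different trials might be generated: by default only the AB and BA orderings are drawn, and AA and BB control trials are added on request. The function name make_trials, its parameters, and the trial dictionary layout are hypothetical; this is not the app's API or the annotation CSV/YAML format mentioned above.

```python
import random


def make_trials(n_trials: int, include_controls: bool = False, seed=None) -> list:
    """Draw presentation orders for same/different trials.

    Hypothetical sketch: AB and BA are always candidates; AA and BB
    control pairs are candidates only when `include_controls` is True.
    """
    rng = random.Random(seed)  # seeded for reproducible trial lists
    orders = [("A", "B"), ("B", "A")]
    if include_controls:
        orders += [("A", "A"), ("B", "B")]

    trials = []
    for _ in range(n_trials):
        first, second = rng.choice(orders)
        trials.append(
            {
                "first": first,
                "second": second,
                # The expected answer is known, so controls can be scored.
                "expected": "same" if first == second else "different",
            }
        )
    return trials


for trial in make_trials(4, include_controls=True, seed=0):
    print(trial["first"], trial["second"], trial["expected"])
```

Without controls every trial has the expected answer "different", so the control pairs are what make it possible to measure how often a listener reports a difference that does not exist.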