neuroscout / neuroscout-paper

Neuroscout paper analysis repository
https://neuroscout.github.io/neuroscout-paper/

run single-predictor mel models #49

Closed · rbroc closed 2 years ago

rbroc commented 2 years ago

Fit separate models with re-extracted mel features to reconstruct tonotopic maps (not necessarily relevant for the paper, but as a preliminary result for an OHBM submission and to kick-start some audio analyses). The analysis should probably be set up as classification.

satra commented 2 years ago

@rbroc - this is a notoriously weird problem. to truly show its effectiveness this would need test-retest reliability within subject. tonotopy is one of those things that i would suggest not doing at the group or dataset level. one can try, but as far as i know group maps are not good predictors of individual tonotopy.

rbroc commented 2 years ago

@satra - it would be great to have a chat about this. We wanted to try exploring the idea, maybe for an OHBM submission (it's in 2 weeks, so rather unrealistic, but maybe we could get to some submittable preliminary results), but we are not really sure how to operationalize this.

We can get subject-level (or even run-level?) beta maps for each mel band (we're extracting 64 mel bands in the 20 Hz to 8000 Hz range), but we don't have a clear idea yet of how to get from there to subject-level tonotopic maps.
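For reference, outside the Neuroscout extraction stack, a minimal sketch of what the 64-band extraction could look like with librosa (the audio path is a hypothetical placeholder, not an actual Neuroscout file):

```python
# Minimal sketch: 64-band log-mel features over 20-8000 Hz.
# "stimulus.wav" is a hypothetical path.
import librosa

y, sr = librosa.load("stimulus.wav", sr=None)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64, fmin=20, fmax=8000)
log_mel = librosa.power_to_db(mel)  # shape: (64 bands, n_frames)
```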

GLM contrasts can't do the trick - when we put the mel features in the same model, shared variance between bands makes the resulting estimates pretty weird - so we're thinking about estimating separate models for each feature. We have been brainstorming about whether there is a good way to frame this as a classification problem using the subject-level maps as inputs, but haven't gotten far yet.
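For concreteness, a rough sketch of the one-model-per-band idea with nilearn - the file names and the pre-convolved regressor table are hypothetical placeholders, not the actual Neuroscout pipeline:

```python
# Sketch: fit one single-predictor GLM per mel band so bands don't
# compete for shared variance. Paths and the design CSV are placeholders.
import pandas as pd
from nilearn.glm.first_level import FirstLevelModel

# One HRF-convolved column per band ("mel_0" ... "mel_63"), one row per scan.
mel_regressors = pd.read_csv("sub-01_mel_design.csv")

beta_maps = {}
for band in mel_regressors.columns:
    design = mel_regressors[[band]].assign(intercept=1.0)
    model = FirstLevelModel(t_r=2.0, minimize_memory=True)
    model.fit("sub-01_bold.nii.gz", design_matrices=design)
    beta_maps[band] = model.compute_contrast(band, output_type="effect_size")
```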

If you have input on that, let us know - or, if @jsmentch wants to do some magic with the subject-level maps, we have some compute to estimate the models for a few datasets and could share the maps :)

jsmentch commented 2 years ago

Agree with @satra that this would be hard to get at the group/dataset level because of the inter-individual differences. That said, the main thing we are looking for is just the Hi -> Lo -> Hi pattern, so fewer bins might be easier to deal with. I tried this at some point with encoding models at the individual level and a similar number of bins but didn't have much luck, though I was only using Merlin data (so not a very long movie).

Hard to say exactly how best to deal with subject-level maps of the individual bins; I'll keep thinking about this. I can't remember if anyone has tried a simple high vs. low contrast (take the top 32 bands vs. the bottom 32)?
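One cheap way to compute that from per-band beta maps, as a sketch - the per-band file naming is hypothetical, and this is a plain difference of averaged maps rather than a proper GLM contrast:

```python
# Average the top-32 and bottom-32 band beta maps and take the difference.
from nilearn.image import mean_img, math_img

band_maps = [f"sub-01_mel-{i:02d}_beta.nii.gz" for i in range(64)]  # hypothetical
low = mean_img(band_maps[:32])
high = mean_img(band_maps[32:])
high_vs_low = math_img("high - low", high=high, low=low)
```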

rbroc commented 2 years ago

We haven't tried low vs. high but will do asap. Since we need to re-extract the mel features anyway, would you then suggest going for 32 bands (or even fewer)? That'd be a Neuroscout-wide change, not specific to this project.

jsmentch commented 2 years ago

I think finer-grained features like 64 bands could be useful for other things or for more complicated models. And I believe the bands could be roughly averaged with each other instead of re-extracting, if that's easier, e.g. avg of top 32 > avg of bottom 32 (correct me if I'm wrong there!)
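As a sketch of that shortcut - averaging log power across bands is only a rough stand-in for re-extracting with fewer filters, since mel filters overlap and bandwidths differ, but for a first pass it could be as simple as:

```python
import numpy as np

# log_mel stands in for the (64 bands, n_frames) log-mel matrix from
# extraction; a random placeholder here just to make the sketch runnable.
log_mel = np.random.default_rng(0).normal(size=(64, 1000))

low_band = log_mel[:32].mean(axis=0)   # avg of bottom 32 bands
high_band = log_mel[32:].mean(axis=0)  # avg of top 32 bands
```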

But specifically for tonotopy, I just looked again through the literature at how people have done this with traditional stimuli:

- 14 pure tones in half-octave steps, 88-8000 Hz (Da Costa et al., 2011)
- 6 bands, center frequencies 200-6400 Hz (Humphries et al., 2010; Norman-Haignere et al., 2013; Kell et al.)
- 6 tones at 0.25, 0.5, 1, 2, 4, and 8 kHz (Koops et al., 2020)
- 8 tones, 200-8000 Hz (Schönwiesner et al., 2014)

So if re-extracting specifically for tonotopy, and going with that 20-8000 Hz range, 10 bands is my guess at a reasonable number (the 2 lowest bands will be pretty low frequency - maybe not the most useful, but fine to have).
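For what it's worth, the band centers can be checked directly: n mel filters are defined by n + 2 mel-spaced edge points, so the interior points are the centers. With librosa's default (Slaney) mel scale, the two lowest centers come out roughly around 290 and 570 Hz:

```python
import librosa

# Center frequencies of 10 mel bands spanning 20-8000 Hz.
freqs = librosa.mel_frequencies(n_mels=12, fmin=20, fmax=8000)
centers = freqs[1:-1]
print(centers.round())  # roughly 290, 570, 840, 1120, 1480, ... 6040 Hz
```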

rbroc commented 2 years ago

Thanks a lot, Jeff - super useful. I think we'll keep it maximal for now (64 bands) and, for tonotopy, see whether a contrast between the low and high 32-band splits yields any results at all. More soon!

PeerHerholz commented 2 years ago

Hi folks,

sorry for being late to the party - basically +1 on everything mentioned by @satra & @jsmentch.

Regarding the classification approach: IIRC, Schoenwiesner et al. (2015) used a classification approach, specifically a multiclass SVM, to localize A1 - the idea being that the region showing the highest classification accuracy in distinguishing between frequencies should be A1. They then used different methods (centroid estimation and a rounded exponential function) to estimate each voxel's response curve to the given frequencies, which produces the tonotopic maps. I.e., the classification was "only" used to get an estimate of A1, within which voxel response curves were then estimated. Given that we can get the respective single-participant maps for the different frequencies, I think this approach could work really well. As mentioned by @satra and @jsmentch, the group-level map might be harder to come by; maybe one based on probabilities could work.
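For the first (classification) step, a bare-bones sketch with scikit-learn - the data here is a random placeholder standing in for masked per-band beta patterns, and the pipeline choices (linear SVM, 5-fold CV) are assumptions, not Schoenwiesner et al.'s exact setup:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: one masked beta pattern per (run, band) pair.
rng = np.random.default_rng(0)
n_runs, n_bands, n_voxels = 8, 10, 500
X = rng.normal(size=(n_runs * n_bands, n_voxels))
y = np.tile(np.arange(n_bands), n_runs)  # band labels

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
acc = cross_val_score(clf, X, y, cv=5).mean()
# Regions where acc is well above chance (1 / n_bands) are A1 candidates;
# voxel response curves would then be estimated within that region.
print(acc)
```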

I think other options to address this would entail encoding models and subsequent pRF estimation to get the tonotopic maps. Speaking of which: it will be very interesting to see whether we get the classical or the orthogonal interpretation!
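As a sketch of the encoding-model route - ridge weights over the band regressors, with a crude weight centroid per voxel standing in for a proper pRF/tuning-curve fit (all data is simulated placeholder):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Placeholder data: HRF-convolved band regressors and voxel time series.
rng = np.random.default_rng(0)
n_scans, n_bands, n_voxels = 300, 10, 1000
X = rng.normal(size=(n_scans, n_bands))
Y = rng.normal(size=(n_scans, n_voxels))

weights = Ridge(alpha=10.0).fit(X, Y).coef_.T  # (n_bands, n_voxels)

# Crude "best frequency" per voxel: centroid of the positive weights
# over assumed band center frequencies; a pRF fit would replace this.
centers = np.geomspace(20, 8000, n_bands)
w = np.clip(weights, 0, None)
best_freq = (centers[:, None] * w).sum(axis=0) / (w.sum(axis=0) + 1e-12)
```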

adelavega commented 2 years ago

I'm going to close this issue since it's not relevant to this paper, but we can move it to another repo if we ever have the bandwidth.