strangerattractor / Soundvision_PUBLIC

Max's long long journey into the unity world.

Logarithmic representation of FFT #80

Closed chikashimiyama closed 4 years ago

chikashimiyama commented 4 years ago

@strangerattractor wrote:

Right now the spectrum is not logarithmic, correct? Could you add an option to make it logarithmic? Also, could you add an option to focus on a specific range of the spectrum in more detail?

chikashimiyama commented 4 years ago

@strangerattractor

What I implemented is not a spectrum but a spectrogram (the spectrum is assigned to you in #52). Do you want a logarithmic spectrogram?

"The specific frequency range to be looked at" is a little ambiguous. If I select, e.g., the 500 - 1000 Hz range, should I map that range to the entire height of the texture, or should I just trim that part?

strangerattractor commented 4 years ago

Sorry, to be more specific:

I was talking about the waterfall demo scene. You use the spectrum array that you get from Pd. It would be helpful if I could use this array in general and have the "looked at" frequency exposed for editing in the Inspector whenever I work with it. I imagine using this array for a number of different visuals in the future.

if I select, e.g 500 - 1000 Hz range, should I map that range to the entire height of the texture or should I just trim that part?

Yes for the array, not for the texture. For the texture I think I'll be able to do this myself with shadergraph.

chikashimiyama commented 4 years ago

hmm ... it's kinda hard.

It's easy to implement something specifically for the waterfall visualizer, but

It would be helpful if I could use this array in general and have the "looked at" frequency be exposed to edit in the Inspector, whenever I work with it. I imagine to use this array for a number of different visuals in the future

this is a completely different request. If you want to reuse it, it should be implemented differently.

chikashimiyama commented 4 years ago

I would do the following: I will implement a script called FFT modifier. It refers to the FFT buffer of a specific channel, copies it, modifies it according to the settings, and passes it on to another script that requires it.
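A minimal sketch of that idea, written in Python for illustration only (the project itself is a Unity project, and nothing below is taken from its code). The class name `FftModifier`, its fields, and the assumption that the buffer holds bins 0..N/2-1 of an N-point FFT at 44100 Hz are all hypothetical:

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FftModifier:
    """Hypothetical modifier: copies a magnitude buffer and trims it to a band."""

    sample_rate: float = 44100.0  # assumed sample rate
    low_hz: float = 0.0           # lower edge of the range to keep
    high_hz: float = 22050.0      # upper edge of the range to keep

    def apply(self, magnitudes: Sequence[float]) -> List[float]:
        # Assumes `magnitudes` holds bins 0..N/2-1 of an N-point FFT,
        # so bin i starts at i * sample_rate / N.
        bin_count = len(magnitudes)
        if bin_count == 0:
            return []
        bin_width = self.sample_rate / (2 * bin_count)
        first = max(0, int(self.low_hz // bin_width))
        last = min(bin_count, int(self.high_hz // bin_width) + 1)
        # Building a new list leaves the source buffer untouched.
        return list(magnitudes[first:last])
```

A consumer script would then read the trimmed copy, for example `FftModifier(low_hz=500, high_hz=1000).apply(buffer)`, instead of the shared buffer.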

strangerattractor commented 4 years ago

Yes that sounds great!

Chikashi Miyama notifications@github.com wrote on Sun, 27 Oct 2019 at 10:56:

I would do the following. I will implement a script called FFT modifier. It refers to a FFT buffer of the specific channel, copies it, modifies it according to the setting and passes it onto another script that requires it.


chikashimiyama commented 4 years ago

@strangerattractor Also, what do you mean by logarithmic data? Do you mean the conversion from FFT magnitude to dB (Y-axis of the spectrum), or logarithmic frequency on the X-axis? As far as I know, it is not possible to calculate an LFT (Logarithmic Fourier Transform) in Pd.
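To make the two readings concrete, here is a rough Python sketch of each. It is an illustration under assumptions, not what the project does: the function names, the -120 dB floor, the 20 Hz starting point, and the octave-wide bands are all hypothetical choices. The second function is not a Logarithmic Fourier Transform; it only regroups the bins of an ordinary FFT into logarithmically spaced bands.

```python
import math
from typing import List, Sequence


def magnitudes_to_db(magnitudes: Sequence[float], floor_db: float = -120.0) -> List[float]:
    """Logarithmic Y-axis: convert linear magnitudes to decibels."""
    out = []
    for m in magnitudes:
        # log10 is undefined for 0, so clamp silent bins to the floor.
        out.append(max(floor_db, 20.0 * math.log10(m)) if m > 0 else floor_db)
    return out


def log_frequency_bands(
    magnitudes: Sequence[float],
    sample_rate: float = 44100.0,
    bands_per_octave: int = 1,
    min_hz: float = 20.0,
) -> List[float]:
    """Logarithmic X-axis: average linear FFT bins into log-spaced bands."""
    bin_count = len(magnitudes)
    if bin_count == 0:
        return []
    nyquist = sample_rate / 2
    bin_width = nyquist / bin_count
    ratio = 2.0 ** (1.0 / bands_per_octave)
    bands = []
    low = min_hz
    while low < nyquist:
        high = min(low * ratio, nyquist)
        first = int(low // bin_width)
        # Every band takes at least one bin, even when it is narrower than a bin.
        last = min(bin_count, max(first + 1, math.ceil(high / bin_width)))
        chunk = magnitudes[first:last]
        bands.append(sum(chunk) / len(chunk))
        low *= ratio
    return bands
```

With 512 bins, the lowest octave bands all read from the same one or two bins, so this regrouping changes the layout but cannot add detail that the FFT did not capture.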

strangerattractor commented 4 years ago

I think I was not sure yet what I meant... sorry... um. The problem I have in a lot of cases is that I can see a lot of detail in the low frequencies but not in the high frequencies when I look at the graph. That is because the high frequencies are closer together. So I hoped that if I passed the output through a logarithmic function, I would be able to see more detail in the high frequencies, or at least have them balanced against the lower frequencies. I was a bit quick here. I think I was just thinking "logarithmic" because one of the programmers here looked at it and said "ah, it's not logarithmic"... Do you see what I mean? Sorry for the quick shot :-/ I will think more first next time!

But I think an FFT modifier script should be very helpful.

chikashimiyama commented 4 years ago

OK, you are talking about LFT, but the other way around. The resolution is too low in the low-frequency range, not in the high-frequency range. If the FFT size is 512, each bin is 44100 / 512 ≈ 86 Hz wide, so the first bin covers 0 - 86 Hz. This is extremely low resolution for human hearing, because 0 - 86 Hz spans almost two octaves. In the high-frequency range, however, an 86 Hz difference is not audible; for example, my ears cannot perceive the difference between 8000 Hz and 8086 Hz.
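The arithmetic can be checked with a few lines of Python. This sketch assumes a 44100 Hz sample rate and an FFT size of 512, and it takes 20 Hz as the lower edge of hearing so that the first bin can be expressed in octaves. The function names are made up for illustration:

```python
import math


def bin_width_hz(sample_rate: float, fft_size: int) -> float:
    """Width of one FFT bin in Hz."""
    return sample_rate / fft_size


def octaves_between(low_hz: float, high_hz: float) -> float:
    """Musical distance between two frequencies, in octaves."""
    return math.log2(high_hz / low_hz)


width = bin_width_hz(44100.0, 512)              # about 86.13 Hz
low_span = octaves_between(20.0, width)         # audible part of the first bin
high_span = octaves_between(8000.0, 8000.0 + width)

print(f"bin width: {width:.2f} Hz")
print(f"20 Hz to {width:.0f} Hz: {low_span:.2f} octaves")
print(f"8000 Hz to {8000 + width:.0f} Hz: {high_span:.4f} octaves")
```

The same 86 Hz step covers about two octaves at the bottom of the range but only a small fraction of a semitone near 8 kHz.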

Anyway, if you are talking about the Logarithmic Fourier Transform, I'm not the right person to talk to; you need to talk to a more hard-core math guy.

We can also increase the overall resolution to, say, 4096, but that consumes about 8 times the resources overall.

chikashimiyama commented 4 years ago

But I think an FFT modifier script should be very helpful.

For just trimming the FFT array? I think you can do that directly in the visualizer. But if you think so, I will implement a trimming-only modifier.

strangerattractor commented 4 years ago

OK, I see your point. Let's not implement the trimming modifier. But could we implement a means of making the resolution accessible and changeable? Via the cuelist, maybe? I can imagine that we will run into a hot CPU if we dial the overall resolution higher.

chikashimiyama commented 4 years ago

But could we implement a means of making the resolution accessible and changeable? Via cuelist maybe?

Basically no, due to a limitation of Pd. The block~ object defines the FFT size. You can change it dynamically, but that is a huge change and might be YAGNI.


The easier way is to always use the higher resolution, disable the analysis for channels we are not using (I don't think you analyze the spectrum of all 16 channels at the same time), and store which channels should be enabled in the qlist.

A higher-resolution FFT also has another downside: higher resolution in the frequency domain means lower resolution in the time domain, so you get fewer FFT images per second.

If you want to show a different FFT image per frame, you need 60 images per second: 1000 ms / 60 frames ≈ 16.7 ms. The current block size of 1024 is around 23 ms, so it is already a good compromise between frequency-domain and time-domain resolution.
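As a quick check of these numbers, the Python sketch below compares the duration of one block with the frame period at 60 fps. It assumes a 44100 Hz sample rate and non-overlapping blocks; overlapping analysis windows would yield more images per second. The function names are hypothetical:

```python
def block_duration_ms(block_size: int, sample_rate: float = 44100.0) -> float:
    """Time covered by one analysis block, in milliseconds."""
    return 1000.0 * block_size / sample_rate


def frame_period_ms(fps: float = 60.0) -> float:
    """Time available per rendered frame, in milliseconds."""
    return 1000.0 / fps


def images_per_second(block_size: int, sample_rate: float = 44100.0) -> float:
    """Number of new FFT images per second without overlap."""
    return sample_rate / block_size


print(f"frame period at 60 fps: {frame_period_ms():.1f} ms")
for size in (512, 1024, 4096):
    print(
        f"block {size}: {block_duration_ms(size):.1f} ms, "
        f"{images_per_second(size):.1f} images/s"
    )
```

Under these assumptions a block size of 4096 produces fewer than 11 new images per second, so most rendered frames would repeat the previous image.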

strangerattractor commented 4 years ago

If you want to show different FFT image per frame. you need 60 images per second. 1000ms / 60 frame =16 ms. and the current block size 1024 is around 23 ms. so it is already a good compromise between frequency domain resolution and time-domain resolution.

OK, then let's drop this resolution change, and I'll look deeper into what we have already.