capital-G / musikinformatik-sose2021

Course materials for Musikinformatik course SoSe 2021 at RSH Düsseldorf
https://capital-g.github.io/musikinformatik-sose2021/

Dim reduction to markov chain #50

Closed capital-G closed 3 years ago

capital-G commented 3 years ago

@telephon one remix of the Markov version shown today is based on a PCA of a mel spectrogram. Each spectrogram frame is reduced to a short vector, and the distance from every frame vector to every other one gives an n x n distance matrix. After inverting it so that nearby frames get large weights, it can be used as the transition matrix of a Markov chain whose states are the frames. The hand-over from Python to SuperCollider is done via a CSV file.
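The idea can be sketched on made-up data before touching any audio. The following is a minimal NumPy illustration, not the code used in this issue: the random `frames` array stands in for the PCA output, `walk` is a hypothetical helper, and the per-row inversion of the distances is only one possible way to turn distances into weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the PCA output: 6 frames with 3 components each (made-up data).
frames = rng.normal(size=(6, 3))

# Pairwise Euclidean distances between all frames, shape (n, n).
diff = frames[:, None, :] - frames[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Invert distances per row: the farthest frame gets weight 0,
# the frame itself (distance 0) gets the largest weight.
sim = dist.max(axis=1, keepdims=True) - dist

# Normalise each row so that it is a probability distribution.
transition = sim / sim.sum(axis=1, keepdims=True)


def walk(transition, start, steps, rng):
    """Sample a state sequence from a row-stochastic transition matrix."""
    state = start
    path = [state]
    for _ in range(steps):
        state = int(rng.choice(len(transition), p=transition[state]))
        path.append(state)
    return path


path = walk(transition, start=0, steps=10, rng=rng)
print(path)
```

Each entry of `path` is a frame index, which is what the SuperCollider side later maps to a position in the sound file.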

Python

import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt
import soundfile
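# sr=None keeps the native sample rate of the file
data, sr = librosa.load('chief.wav', sr=None, mono=True)
# non-overlapping analysis frames of 10000 samples each
N_FFT = 10000
WIN_LENGTH = 10000
HOP_LENGTH = 10000
# plain STFT (not used further below)
stft = librosa.stft(data, n_fft=N_FFT, hop_length=HOP_LENGTH, win_length=WIN_LENGTH)
# mel power spectrogram, shape (n_mels, n_frames); recent librosa versions require y= as a keyword
spect = librosa.feature.melspectrogram(y=data, sr=sr, n_fft=N_FFT, hop_length=HOP_LENGTH, win_length=WIN_LENGTH)
plt.figure(figsize=(15, 10))
# melspectrogram returns power, so convert with power_to_db and label the frequency axis as mel
librosa.display.specshow(librosa.power_to_db(spect, ref=np.max), sr=sr, hop_length=HOP_LENGTH, y_axis='mel', x_axis='s')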

[Figure output_6_1: mel spectrogram of chief.wav]

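from sklearn.decomposition import PCA
# reduce each spectrogram frame (one column of spect) to 10 principal components
pca = PCA(n_components=10)
# note: despite its name, spect_2d has shape (n_frames, 10)
spect_2d = pca.fit_transform(spect.T)
# scatter plot of the first two components only
plt.scatter(x=spect_2d[:, 0], y=spect_2d[:, 1])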

[Figure output_10_1: scatter plot of the first two principal components]

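from scipy.spatial import distance_matrix
import pandas as pd
# pairwise Euclidean distances between all frames, shape (n_frames, n_frames)
d = distance_matrix(spect_2d, spect_2d)
# turn distances into non-negative weights (small distance -> large weight);
# d.max(axis=1) has shape (n,), so it broadcasts along columns: entry [i, j] becomes d[j].max() - d[i, j]
d = (-1)*d + d.max(axis=1)
# hand the weight matrix over to SuperCollider as a plain CSV without header or index
pd.DataFrame(d).to_csv('foo.csv', index=False, header=False)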
d.shape
(966, 966)
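The expression `(-1)*d + d.max(axis=1)` relies on NumPy broadcasting, and it may not be obvious which maximum ends up in which entry. The following small check uses three made-up points on a line (not data from this issue) and compares the expression with a `keepdims=True` variant; the names `per_column` and `per_row` are only illustrative.

```python
import numpy as np

# Three made-up points on a line, so the distances are easy to check by hand.
x = np.array([0.0, 1.0, 3.0])
d = np.abs(x[:, None] - x[None, :])  # symmetric distance matrix, shape (3, 3)

# As written in the issue: the (n,) vector of row maxima is broadcast along
# columns, so entry [i, j] becomes d[j].max() - d[i, j].
per_column = (-1) * d + d.max(axis=1)

# Variant with keepdims=True: entry [i, j] becomes d[i].max() - d[i, j].
per_row = d.max(axis=1, keepdims=True) - d

print(per_column)
print(per_row)
```

Because `d` is symmetric, the two results are transposes of each other, and both are non-negative. In the per-row variant the self-transition is always the largest weight in its row; in the per-column variant that need not hold (row 1 of `per_column` is `[2, 2, 1]`). Which of the two is musically preferable is a matter of taste.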

SuperCollider

s.boot;
-> localhost
b = Buffer.read(s, "/Users/scheiba/github/musikinformatik_sose2021/datasets/specto_cluster/expo.flac");
-> Buffer(2, nil, nil, nil, /Users/scheiba/github/musikinformatik_sose2021/datasets/specto_cluster/expo.flac)
b
-> Buffer(2, nil, nil, nil, /Users/scheiba/github/musikinformatik_sose2021/datasets/specto_cluster/expo.flac)
SynthDef(\bplaySection, {|out, bufnum, start, end, rate=1.0, sustain=1.0, amp=0.1, attack=0.001|
    var sig, env;
    // envelope spans the section from start to end (both given in sample frames)
    env = EnvGen.kr(Env.linen(
        attackTime: attack,
        sustainTime: (end-start)/BufSampleRate.kr(bufnum),
        releaseTime: 0.001,
    ), doneAction: Done.freeSelf);
    // read from the buffer passed as bufnum instead of the global variable b
    sig = PlayBuf.ar(
        numChannels: 2,
        bufnum: bufnum,
        rate: BufRateScale.kr(bufnum) * rate,
        startPos: start,
    );
    sig = sig*env*amp;
    Out.ar(out, sig);
}).add;
-> a SynthDef
Synth(\bplaySection, [
    \bufnum, b,
    \start, 2000,
    \end, 40000,
]);
-> Synth('bplaySection' : 1121)
t = CSVFileReader.readInterpret("/Users/scheiba/github/musikinformatik_sose2021/fftkov/foo.csv")
-> [ [ 93228.044750679, 93227.989116289, 60061.208920211, 82305.4775241, 81328.446330002, 56999.69536449, 80908.997353898, 80240.290440599, 85232.945631371, 85866.381053318, 79157.063936866, 87463.297267627, 80287.238484691, 79686.530098955, 87165.471127, 75130.627055807, 82063.928728668, 82161.697344702, 70609.019307724, 83910.493911634, 77055.910753639, 75979.706831109, 80027.464007242, 73534.218585885, 81777.190208451, 71039.294910442, 75133.168903054, 81515.328022714, 52909.621543427, 77970.088271363, 8595...etc...
Tdef(\x, {
    var curState=0;
    var winSize = 10000;
    var hopSize = 10000;
    var sampleRate = 44100;
    loop {
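        // draw the next state index directly from the normalized row of weights
        // ((0..t.shape[0]) would contain one index more than there are states)
        curState = t[curState].normalizeSum.windex;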
        Synth(\bplaySection, [
            \bufnum, b,
            \start, curState*hopSize,
            \end, curState*hopSize + winSize,
            \amp, 0.5,
            \attack, 0.1,
        ]);
        ((winSize/sampleRate)*0.2).wait;
    }
}).play;
-> Tdef('x')
.
-> CmdPeriod
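One detail of the hand-over is how a state index maps back to a position in the sound file. The Tdef uses `curState*hopSize` as the start. The Python code does not override librosa's default `center=True`, under which frame `t` is centered on sample `t * hop_length`, so the analysed window starts half a window earlier. A small sketch of both mappings, where `frame_to_samples` is a hypothetical helper and not part of the code above:

```python
def frame_to_samples(frame, hop_length=10000, win_length=10000, centered=True):
    """Return the (start, end) sample range covered by an analysis frame."""
    start = frame * hop_length
    if centered:
        # With center=True, frame t is centered on sample t * hop_length.
        start -= win_length // 2
    end = start + win_length
    # The first frame partly lies in the padding before the file start; clip it.
    return max(start, 0), end


for frame in (0, 1, 2):
    print(frame, frame_to_samples(frame), frame_to_samples(frame, centered=False))
```

With `centered=False` the result matches the `\start` and `\end` values computed in the Tdef. Whether the half-window offset is audible at this window size is an open question; the sketch only makes the difference explicit.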