chongxi commented 2 years ago

Clusterless decoding requires both a software update and an FPGA update

1. It requires many more VQs for each electrode group

These VQs are generated automatically in clusterless decoding. kmeans can be used to implement clusterless decoding by over-splitting single-units into many feature-units. In this case, the feature-units are the kmeans units.

The software API is kept simple:

from spiketag.base import SPK, FET, CLU
spk = SPK()
spk.load_spkwav('./spk_wav.bin') 
spk_df = spk.sort(method='kmeans', n_comp=20)
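To make the over-splitting idea concrete, here is a minimal NumPy sketch of plain Lloyd's k-means applied to spike features. It is not spiketag's implementation: `kmeans_labels`, `n_units` and the synthetic `fet` array are hypothetical names made up for illustration, and the value 20 only echoes `n_comp=20` in the call above (whether spiketag interprets `n_comp` as the number of k-means units is an assumption).

```python
import numpy as np

def kmeans_labels(features, n_units, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from n_units distinct spikes chosen at random.
    centers = features[rng.choice(len(features), n_units, replace=False)]

    def assign(c):
        # Index of the nearest center for every spike.
        dist = np.linalg.norm(features[:, None, :] - c[None, :, :], axis=2)
        return dist.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        # Move each center to the mean of its spikes; leave empty units in place.
        for k in range(n_units):
            members = features[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    return assign(centers), centers

# Synthetic stand-in for two well-separated single-units in a 4-D feature space.
rng = np.random.default_rng(1)
fet = np.vstack([rng.normal(-3, 1, (500, 4)), rng.normal(3, 1, (500, 4))])

# Over-split the two single-units into 20 feature-units.
labels, centers = kmeans_labels(fet, n_units=20)
```

Each spike ends up with one of 20 unit labels, so each of the two original single-units is represented by several smaller feature-units rather than by one cluster.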

chongxi commented 2 years ago

clusterless_vq_pass_V2.zip was compiled using Vivado 2021.2 and Vitis HLS 2021.2. This V2 version improves FPGA timing performance; the FPGA-NSP outputs spk_wav.bin and fet.bin should now match each other exactly. The corresponding commits are: https://github.com/chongxi/xike_hls_module/commit/fa8a63bd63e74377011810719c393af1c6837888 and https://github.com/chongxi/xillybus_spi/commit/f77a8b798a69fb15f1a1eeac5e9c8424093a742e

2. It requires more precision in feature computation (via PCA)

First change:

This version of FPGA-NSP uses the following fixed-point formats for each number:

13/32 bits (binpoint=13) for output (mua.bin, spk_wav.bin and fet.bin)
19/32 bits (binpoint=19) for fpga.scale and fpga.shift in PCA transformation inside the FPGA
7/8 bits (binpoint=7) for fpga.pca in PCA transformation

Note: binpoint=x means x bits are used to encode the fractional part of a fixed-point number.
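As a rough illustration of what these formats mean, the sketch below converts between floats and fixed-point integer codes. The helper names (`to_fixed`, `from_fixed`, `resolution`) are hypothetical, and two details are assumptions rather than facts about the FPGA: that the words are signed two's complement, and that values are rounded to nearest (the hardware may truncate instead).

```python
def to_fixed(x, bits, binpoint):
    """Encode x as a signed integer code with `binpoint` fractional bits."""
    code = round(x * (1 << binpoint))
    # Saturate to the signed range of a `bits`-wide word (assumed two's complement).
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, code))

def from_fixed(code, binpoint):
    """Decode an integer code back to a float."""
    return code / (1 << binpoint)

def resolution(binpoint):
    """Smallest step representable with `binpoint` fractional bits."""
    return 2.0 ** -binpoint

# 7/8 format: 0.5 is stored as 64, and 1.0 saturates to 127/128.
half_code = to_fixed(0.5, bits=8, binpoint=7)
one_decoded = from_fixed(to_fixed(1.0, bits=8, binpoint=7), 7)

# 13/32 format: the output grid has a step of 2**-13.
out_step = resolution(13)
```

Under these assumptions, a 7/8 number covers roughly [-1, 1) in steps of 1/128, while a 13/32 output has a step of 2**-13.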

Second change:

When constructing the PCA transformation in the FPGA, y = (np.dot(X,P) + shift)/scale changes to y = (np.dot(X,P) + shift)*scale. That is why 19 bits are now needed to encode the fractional part of fpga.scale: the scaling factor is now applied by multiplication rather than division, so it can be as small as 0.00001 and needs more precision.

In the FPGA, this change saves both area and computation time (the transformation step now costs 930 nanoseconds per spike).
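The following sketch illustrates the second change in floating point, then shows why a small multiplicative scale needs more fractional bits. The array sizes, the random data, and the `quantize` helper are made up for illustration. It also assumes that the new scale is the reciprocal of the old one, and that quantization rounds to nearest.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 76))    # 5 spikes x 76 waveform samples (made-up sizes)
P = rng.normal(size=(76, 4))    # projection onto 4 features
shift = rng.normal(size=4)

old_scale = 1.0e5               # divisor used by the old formula
new_scale = 1.0 / old_scale     # multiplier used by the new formula

y_div = (np.dot(X, P) + shift) / old_scale
y_mul = (np.dot(X, P) + shift) * new_scale
same = np.allclose(y_div, y_mul)    # both forms give the same features

def quantize(x, binpoint):
    """Round x to the nearest multiple of 2**-binpoint."""
    return np.round(x * 2.0 ** binpoint) / 2.0 ** binpoint

scale_q13 = quantize(new_scale, 13)  # step 2**-13 is larger than 0.00001
scale_q19 = quantize(new_scale, 19)  # step 2**-19 is smaller than 0.00001

print(same, scale_q13, scale_q19)
```

With 13 fractional bits a scale of 0.00001 rounds to zero, which would wipe out the features, whereas with 19 fractional bits it is stored as a non-zero code.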

These two changes are in both the spiketag and Vitis HLS code.

chongxi commented 2 years ago

spiketag: https://github.com/chongxi/spiketag/commit/23adcdc5c8508ecd32f3563bdf919cbd26f73cb4

hls: https://github.com/chongxi/xike_hls_module/commit/6ac182200fa797d763791c769401ad5169ed4fb3

chongxi commented 2 years ago

Transformation Test Result:

The feature error is at most 0.000123, which is about 2**(-13). That is essentially the quantization error from using 13 bits to encode the fractional part of the feature output.
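The bound can be reproduced in isolation with a small sketch. This is not the attached notebook: `ref` is a random stand-in for floating-point reference features, and truncation to the output grid is an assumption chosen because it gives the worst case of one full step.

```python
import numpy as np

BINPOINT = 13
step = 2.0 ** -BINPOINT                    # about 0.000122

rng = np.random.default_rng(0)
ref = rng.uniform(-1, 1, size=(1000, 4))   # stand-in for float reference features

# Truncate every feature to a multiple of 2**-13, as a 13-bit binpoint output would.
fpga_like = np.floor(ref / step) * step

max_err = np.abs(ref - fpga_like).max()
print(max_err, step)
```

The maximum error stays below one step of the output grid, which is the same order of magnitude as the error reported above.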

Testing notebook: test_transformer.zip

chongxi / spiketag

Clusterless decoding (FPGA bit file attached) #71
