Length of Encoding Vector

Hi! I'm trying to tie the codebase back to the paper and am a little confused by the length of the segmentation encoding vector. According to Fig. 1 of your paper (attached here), the segmentation encoding vector has length n, where n is the number of frames. Does this mean that n (and therefore the length of the encoding vector) differs depending on the duration of each audio sample?

When I was trying to reproduce your results, t-SNE's fit_transform required all input vectors to be the same size, so I just truncated the longer vectors to match the shortest one. That works.
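To make the question concrete, here is a minimal sketch of the kind of truncation I mean. The function name `truncate_to_common_length` and the toy vectors are hypothetical and are not taken from the repo or the paper; it only assumes each encoding vector is a plain sequence with one value per frame.

```python
def truncate_to_common_length(vectors, target_len=None):
    """Cut every vector to the same length (the shortest one by default)."""
    if not vectors:
        return []
    shortest = min(len(v) for v in vectors)
    if target_len is None:
        target_len = shortest
    if target_len > shortest:
        # Going beyond the shortest vector would require padding, not truncation.
        raise ValueError("target_len exceeds the shortest vector")
    return [list(v[:target_len]) for v in vectors]


# Toy encoding vectors for three clips of different durations (made-up values).
encodings = [[0, 1, 1, 0, 1], [1, 0, 1], [0, 0, 1, 1]]
fixed = truncate_to_common_length(encodings)
print(fixed)  # [[0, 1, 1], [1, 0, 1], [0, 0, 1]]
```

The equal-length rows in `fixed` can then be passed to fit_transform as a 2D array. The obvious drawback is that everything past the shortest clip's frame count is discarded, which is why I suspect there is a better approach.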

But in your demo ipynb file, there doesn't seem to be any chopping performed on the encoding vectors. So perhaps there's a better solution that I may have missed?

Thank you in advance! Great work!
