It seems that neither pad=True nor pad=False is zero-centered. With pad=True, the first frame starts at -(winsz-hopsz)//2 instead of -winsz//2.
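
To make the offset concrete, here is a minimal sketch comparing the two start positions. The values winsz=1024 and hopsz=256 and the helper names are only assumptions for illustration; they are not taken from the model's code.

```python
# Hypothetical helpers: these mirror the two formulas from the report,
# not the model's actual implementation.
def first_frame_start_observed(winsz: int, hopsz: int) -> int:
    """Start sample of frame 0 as observed with pad=True."""
    return -((winsz - hopsz) // 2)


def first_frame_start_centered(winsz: int) -> int:
    """Start sample of frame 0 if frames were zero-centered."""
    return -(winsz // 2)


winsz, hopsz = 1024, 256  # assumed example values

observed_start = first_frame_start_observed(winsz, hopsz)
centered_start = first_frame_start_centered(winsz)

# The center of frame 0 is its start plus half a window.
observed_center = observed_start + winsz // 2
centered_center = centered_start + winsz // 2

print("observed start:", observed_start, "center:", observed_center)
print("centered start:", centered_start, "center:", centered_center)
```

With these example values the observed first frame is centered half a hop after sample 0, whereas a zero-centered first frame would be centered exactly on sample 0.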
When using this model on audio with a sample rate of 22.05 kHz and a hop size of 256, the rounding in time_to_samples makes the audio hop size inaccurate, so the number of frames ends up larger or smaller than what the hopsize field indicates.
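
One way such a mismatch can arise is sketched below. This is only a guess at the mechanism: it assumes the hop is converted to seconds and then rounded to whole samples at a different internal rate. The 16 kHz internal rate, the 10 s clip length, and the function time_to_samples_rounded are all hypothetical stand-ins, not the model's real values or API.

```python
AUDIO_SR = 22050      # sample rate from the report
HOP_SAMPLES = 256     # hop size from the report
INTERNAL_SR = 16000   # ASSUMPTION: hypothetical internal model rate
DURATION_S = 10       # ASSUMPTION: example clip length in seconds


def time_to_samples_rounded(seconds: float, sr: int) -> int:
    """Hypothetical stand-in: convert a time to a whole number of samples."""
    return int(round(seconds * sr))


# Hop expressed as a time, then rounded to samples at the internal rate.
hop_seconds = HOP_SAMPLES / AUDIO_SR
internal_hop = time_to_samples_rounded(hop_seconds, INTERNAL_SR)

# The hop that is effectively applied, measured in original audio samples.
effective_hop = internal_hop * AUDIO_SR / INTERNAL_SR

# Frame counts (ignoring padding) for the nominal and the rounded hop.
expected_frames = (DURATION_S * AUDIO_SR) // HOP_SAMPLES
actual_frames = (DURATION_S * INTERNAL_SR) // internal_hop

print("internal hop:", internal_hop)
print("effective hop at 22.05 kHz:", effective_hop)
print("expected frames:", expected_frames, "actual frames:", actual_frames)
```

Under these assumptions the effective hop is slightly longer than 256 samples, so the frame count drifts below what the nominal hop size predicts; a different internal rate could push it the other way.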