question about the semantic process

lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

MIT License


question about the semantic process #186

Closed asr-pub closed 1 year ago

asr-pub commented 1 year ago

https://github.com/lucidrains/audiolm-pytorch/blob/9ec98366e1c376088f44269f10c615b1ec640b09/audiolm_pytorch/audiolm_pytorch.py#L143-L145C5

Hello, why are adjacent duplicate elements removed?
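
For readers unfamiliar with the operation in question, here is a minimal sketch in plain Python of collapsing adjacent repeated token ids. The function name `collapse_adjacent_duplicates` and the example ids are hypothetical; this is an illustration of the idea, not the code at the linked lines.

```python
from itertools import groupby


def collapse_adjacent_duplicates(token_ids):
    """Keep a single copy of each run of identical neighbouring ids.

    Hypothetical helper for illustration only.
    """
    # groupby yields one (key, run) pair per run of equal consecutive items.
    return [token for token, _run in groupby(token_ids)]


# Made-up semantic token ids: id 7 repeats for three frames, and so on.
semantic_ids = [7, 7, 7, 3, 3, 7, 9, 9]
collapsed = collapse_adjacent_duplicates(semantic_ids)
print(collapsed)  # [7, 3, 7, 9]
```

Only adjacent repeats are dropped, so the second 7 survives. The sequence gets shorter, but how long each token lasted is no longer recoverable from the output.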

lucidrains commented 1 year ago

this was actually recommended by @eonglints, who follows the field closely

lucidrains commented 1 year ago

@asr-pub maybe the durations should be preserved on a separate dimension for the network (like run-length encoding?) i'm actually confused myself why it is standard practice to do this
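
The run-length encoding idea mentioned above could look roughly like the sketch below, in plain Python. The names `run_length_encode` and `run_length_decode` and the example ids are assumptions for illustration; nothing here is taken from the repository, and how the durations would actually be fed to a network is left open.

```python
from itertools import groupby


def run_length_encode(token_ids):
    """Split a sequence into (tokens, durations), one entry per run.

    Hypothetical helper: durations[i] counts how many consecutive
    times tokens[i] appeared in the input.
    """
    tokens, durations = [], []
    for token, run in groupby(token_ids):
        tokens.append(token)
        durations.append(sum(1 for _ in run))
    return tokens, durations


def run_length_decode(tokens, durations):
    """Invert run_length_encode by repeating each token by its duration."""
    return [token for token, count in zip(tokens, durations) for _ in range(count)]


# Made-up semantic token ids.
semantic_ids = [7, 7, 7, 3, 3, 7, 9, 9]
tokens, durations = run_length_encode(semantic_ids)
print(tokens)     # [7, 3, 7, 9]
print(durations)  # [3, 2, 1, 2]
```

Here `tokens` is the same deduplicated sequence, while `durations` carries the information that plain deduplication throws away, so `run_length_decode(tokens, durations)` reproduces the original ids.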

asr-pub commented 1 year ago

> @asr-pub maybe the durations should be preserved on a separate dimension for the network (like run-length encoding?) i'm actually confused myself why it is standard practice to do this

Thank you for replying