YatingMusic / remi

"Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions", ACM Multimedia 2020
GNU General Public License v3.0

questions about your data preprocessing #14

Closed: theseventhflow closed this issue 4 years ago

theseventhflow commented 4 years ago

Hi, your work is really amazing! I have some questions about the `prepare_data()` function at line 252:

```python
# abandon the last
for i in np.arange(0, len(pairs)-self.group_size, self.group_size*2):
    data = pairs[i:i+self.group_size]
    if len(data) == self.group_size:
        segments.append(data)
```

  1. Why do you abandon the last elements of `pairs` in one MIDI file? Will it improve the final result?
  2. In the for loop, the third argument is `self.group_size*2`, which means you skip 5 pairs on every iteration within one MIDI file. For example, if a `pairs` variable has shape [30, 2, 512], only pairs[0:5], pairs[10:15] and pairs[20:25] will be added to the final `segments` variable (as illustrated in the sketch below). Could you tell me why you don't feed the whole `pairs` array into `segments`?
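A minimal sketch of point 2, using a dummy `pairs` array of shape [30, 2, 512] and `group_size = 5` (values assumed from this thread, not read from the repo's config):

```python
import numpy as np

pairs = np.zeros((30, 2, 512))   # dummy stand-in for one MIDI file's pairs
group_size = 5
segments = []

# same loop as quoted above: the step of group_size*2 skips every other group
for i in np.arange(0, len(pairs) - group_size, group_size * 2):
    data = pairs[i:i + group_size]
    if len(data) == group_size:
        segments.append(data)

# only pairs[0:5], pairs[10:15] and pairs[20:25] are kept
print(len(segments))  # -> 3 groups, i.e. 15 of the 30 pairs
```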

Thank you!

remyhuang commented 4 years ago

Hi,

Sure. It is feasible to keep all the possible pairs/groups in the training data. Generally speaking (in my personal opinion), using more training data gives better final results. Although I think the data I did not use is only a small fraction of the training data, you can still try it yourself!
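A self-contained sketch of what "keep all the possible pairs/groups" could look like (dummy `pairs` array and `group_size = 5` assumed; this is one possible variant, not the author's code): change the step of `np.arange` from `self.group_size*2` to `self.group_size`, and extend the stop so the final complete group is not dropped.

```python
import numpy as np

pairs = np.zeros((30, 2, 512))   # dummy stand-in for one MIDI file's pairs
group_size = 5
segments = []

# step of group_size keeps every non-overlapping group; the "+ 1" in the stop
# also keeps the last complete group instead of abandoning it
for i in np.arange(0, len(pairs) - group_size + 1, group_size):
    data = pairs[i:i + group_size]
    if len(data) == group_size:
        segments.append(data)

print(len(segments))  # -> 6 groups, i.e. all 30 pairs
```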

Thanks, Remy

theseventhflow commented 4 years ago

Hi, thanks for your reply. I will try to feed more data.

DenisSKK commented 5 months ago

Sorry for asking this years later, but do you know which parameter should be changed to feed more data? Would changing `group_size` from 5 to, for example, 4 feed more data?
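A hedged sketch (not a maintainer's answer) comparing the two knobs in the loop quoted at the top of the thread: `group_size` sets how many pairs go into one segment, while the step passed to `np.arange` (`self.group_size*2` in the original code) controls how many groups get skipped, so lowering the step is what feeds noticeably more data. The helper name `count_groups` and the dummy array are purely illustrative.

```python
import numpy as np

def count_groups(n_pairs, group_size, step):
    """Count how many groups the quoted prepare_data() loop keeps for one file."""
    pairs = np.zeros((n_pairs, 2, 512))   # dummy stand-in
    segments = []
    for i in np.arange(0, len(pairs) - group_size, step):
        data = pairs[i:i + group_size]
        if len(data) == group_size:
            segments.append(data)
    return len(segments)

print(count_groups(30, 5, 10))  # original: group_size=5, step=5*2 -> 3 groups (15 pairs)
print(count_groups(30, 4, 8))   # group_size=4, step=4*2          -> 4 groups (16 pairs)
print(count_groups(30, 5, 5))   # same group_size, smaller step   -> 5 groups (25 pairs)
```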