hamzag95 / voice-classification


Input .png size #1

Closed toshikwa closed 5 years ago

toshikwa commented 6 years ago

Hello, I'm trying to train on my own dataset of Japanese speakers. As I'm a beginner at programming, I don't understand your source code very well. What does "if str(x.shape) == '(513, 800, 3)': " mean? Is this an output size determined by the sox operation?

Anyway, thank you very much!!
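
For readers with the same question, here is a minimal sketch of what that comparison does, assuming `x` is a NumPy image array whose `.shape` is a tuple of (height, width, channels). The names `EXPECTED_SHAPE` and `matches_expected` are hypothetical and not taken from the repository, and a plain tuple stands in for `x.shape`. Whether the 513 x 800 size actually comes from the sox step is not confirmed here.

```python
# Hypothetical stand-in for x.shape: NumPy reports an array's shape as a tuple.
# For an image loaded from a .png this is typically (height, width, channels).
EXPECTED_SHAPE = (513, 800, 3)  # 513 rows, 800 columns, 3 colour channels


def matches_expected(shape):
    """Return True when shape equals the expected (height, width, channels)."""
    return tuple(shape) == EXPECTED_SHAPE


shape = (513, 800, 3)  # what x.shape would look like for a matching image

# The check from the question: turn the tuple into text and compare strings.
string_check = str(shape) == '(513, 800, 3)'

# An equivalent check that compares the tuples directly.
tuple_check = matches_expected(shape)

# An image with a different width fails the check and would be skipped.
other_check = matches_expected((513, 400, 3))

print(string_check, tuple_check, other_check)
```

Read this way, the condition acts as a filter: only images whose array has exactly that shape pass, and anything else is ignored. Comparing the tuples directly avoids depending on how the tuple is formatted as text.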