huggingface / audio-transformers-course

The Hugging Face Course on Transformers for Audio
Apache License 2.0

Suggestion to add shape info in preprocessing #99

Open mishig25 opened 1 year ago

mishig25 commented 1 year ago

In the section about preprocessing, it would be useful to add type/shape information for the data produced by preprocessing.

Specifically, at https://github.com/huggingface/audio-transformers-course/blob/ac81306fb8822fa8c4e2a43748be8ba31d8bb043/chapters/en/chapter1/preprocessing.mdx#L186 it would be very useful to add a comment stating the type and shape of input_features. Is it a 3D array of floats like [time, freq, ampl]?
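For illustration, here is a minimal sketch of the kind of inspection cell this could be. The helper name describe and the dummy data are hypothetical and not taken from the course; the dummy dimensions are placeholders, not the real output of the feature extractor. As a general point, a spectrogram is usually stored as a 2D grid whose axes are frequency and time and whose values are the amplitudes, often with an extra leading batch axis, rather than as [time, freq, ampl] triples.

```python
def describe(value):
    """Return (container type, shape, element type) for an array-like or nested list.

    Hypothetical helper for illustration; not part of the course or of any library.
    """
    # Array-likes such as NumPy arrays or tensors expose .shape; use it when present.
    if hasattr(value, "shape"):
        dtype = getattr(value, "dtype", None)
        return type(value).__name__, tuple(value.shape), str(dtype)
    # Otherwise walk down the first element at each nesting level.
    shape = []
    leaf = value
    while isinstance(leaf, (list, tuple)):
        shape.append(len(leaf))
        if not leaf:
            break
        leaf = leaf[0]
    return type(value).__name__, tuple(shape), type(leaf).__name__


# Placeholder standing in for one preprocessed example: 1 item, 4 rows, 6 columns.
dummy_features = [[[0.0] * 6 for _ in range(4)]]
print(describe(dummy_features))
```

In a notebook, the same call on the real input_features would show whether it is a list or an array and how many axes it has.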

snehilsanyal commented 1 year ago

+1 to this. Either a comment or a cell with output showing the entries and the type/shape of input_features would work. I ran into this myself and added a cell to my notebooks for better visualization. I also think that introducing this in the initial lessons would be a great help for the later ones. I found a very useful page in the Transformers docs on data preprocessing here, which explains how padding and truncation change the shape of input_features so that every audio sample ends up the same size. See the image below:

image
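To make the padding and truncation point concrete, here is a rough pure-Python sketch. The function pad_or_truncate and the sample values are made up for illustration; the actual feature extractor handles this internally and its parameters may differ.

```python
def pad_or_truncate(samples, max_length, pad_value=0.0):
    """Cut a sequence down to max_length, or right-pad it with pad_value.

    Hypothetical helper for illustration; not the library's implementation.
    """
    samples = list(samples)[:max_length]  # truncation: drop anything past max_length
    return samples + [pad_value] * (max_length - len(samples))  # padding: fill the tail


# Three made-up "audio" examples of different lengths.
batch = [[0.1, 0.2, 0.3], [0.4], [0.5, 0.6, 0.7, 0.8, 0.9]]
fixed = [pad_or_truncate(x, max_length=4) for x in batch]

print([len(x) for x in batch])  # lengths before: 3, 1 and 5
print([len(x) for x in fixed])  # lengths after: all 4
```

Once every example has the same length, the batch can be stacked into a single rectangular array, which is why the reported shape depends on the padding and truncation settings.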

mishig25 commented 1 year ago

cc: @sanchit-gandhi