robmsmt / KerasDeepSpeech

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation
GNU Affero General Public License v3.0

Could you give some examples of the shapes below? #5

Open moses1994 opened 6 years ago

moses1994 commented 6 years ago

    # 3. input_length (required for CTC loss)
    # this is the time dimension of CTC (batch x time x mfcc)
    #input_length = np.array([get_xsize(mfcc) for mfcc in X_data])
    input_length = np.array(x_val)
    # print("3. input_length shape:", input_length.shape)   
    # print("3. input_length =", input_length)
    assert(input_length.shape == (self.batch_size,))

    # 4. label_length (required for CTC loss)
    # this is the length of the number of label of a sequence
    #label_length = np.array([len(l) for l in labels])
    label_length = np.array(y_val)
    # print("4. label_length shape:", label_length.shape)
    # print("4. label_length =", label_length)
    assert(label_length.shape == (self.batch_size,))

Hi, I want to build a CTC demo, but I don't understand `label_length.shape` and `input_length.shape`. How are they calculated, and what do they mean? Thank you.

revive commented 6 years ago

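@moses1994 `shape` is an attribute of a `numpy.ndarray`: a tuple giving the size of the array along each dimension. A shape of (2, 3) means a 2-dimensional 2x3 matrix. In this code, `label_length` is a 1-dimensional array with one element per sample in the batch, and each element is the length of that sample's transcript, so its shape is (batch_size,). `input_length` has the same shape; going by the comment in the code, each element is the number of time steps for that sample. You don't need to calculate the shape yourself; it follows from how the array is built.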
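
To make the shapes concrete, here is a minimal sketch that builds both arrays for a made-up batch. The MFCC sizes, the transcripts, and the variable names `mfccs`, `transcripts`, and `n_mfcc` are assumptions for illustration only and are not taken from the repository.

```python
import numpy as np

# Hypothetical batch of 3 utterances (illustrative values, not from the repo).
# Each MFCC matrix has shape (time_steps, n_mfcc); time_steps varies per utterance.
n_mfcc = 26
mfccs = [np.zeros((t, n_mfcc)) for t in (120, 95, 143)]
transcripts = ["hello world", "ctc loss", "deep speech"]

batch_size = len(mfccs)

# input_length: number of time steps of each utterance, one entry per sample.
input_length = np.array([m.shape[0] for m in mfccs])

# label_length: number of characters in each transcript, one entry per sample.
label_length = np.array([len(t) for t in transcripts])

# Both arrays are 1-dimensional with one element per sample in the batch.
assert input_length.shape == (batch_size,)
assert label_length.shape == (batch_size,)

print("input_length:", input_length.shape, input_length.tolist())
print("label_length:", label_length.shape, label_length.tolist())
```

With a batch of 3, both shapes are `(3,)`. The values inside the arrays differ from sample to sample, but the shape depends only on the batch size.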