JaeungHyun commented 2 years ago

🚀 Feature request

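Inference is quite slow when I build a dataset first.

Is there any code that extracts the signal from an audio file directly with librosa, runs inference on it, and just returns the result?

I tried following the inference code in kospeech, but the model does not work properly.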

Motivation

Your contribution

def transform_input(signal):
    melspectrogram = librosa.feature.melspectrogram(
        y=signal,
        sr=configs['audio']['sample_rate'],
        n_mels=configs['audio']['num_mels'],
        n_fft=n_fft,
        hop_length=hop_length,
    )
    melspectrogram = librosa.power_to_db(melspectrogram, ref=np.max)
    return melspectrogram

def parse_audio(filepath: str) -> Tensor:
    signal, sr = librosa.load(filepath, sr=None)
    signal = librosa.resample(signal, orig_sr=sr, target_sr=16000)
    feature = transform_input(signal)

    feature -= feature.mean()
    feature /= np.std(feature)

    feature = torch.FloatTensor(feature).transpose(0, 1)
    print(feature.shape)

    return feature

def inference(feature):
    with torch.no_grad():
        outputs = model(feature.unsqueeze(0), torch.Tensor([1]).to('cuda'))
    print(outputs)
    prediction = tokenizer.decode(outputs["predictions"].cpu().detach().numpy())
    print(prediction)

    return prediction

@app.post("/upload")
async def upload(file: UploadFile = File(...)):    
    filepath = save_data(file)

    # load file
    feature = parse_audio(filepath)

    feature = feature.to('cuda')

    prediction = inference(feature)
    os.remove(filepath)

    return {'prediction': prediction}

JaeungHyun commented 2 years ago

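I have confirmed that the input produced by the data loader and the input produced by my own preprocessing are identical, but the results differ once they are fed into the model.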
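When the features match but the outputs do not, one thing worth checking is whether every argument passed to the model matches, not only the feature tensor. The fix posted later in this thread changes the length argument, so a check along these lines might have surfaced it. This is only a sketch with toy data: `first_mismatch` and the argument names `inputs` and `input_lengths` are hypothetical and not part of openspeech.

```python
from typing import Any, Callable, Optional, Sequence


def first_mismatch(
    names: Sequence[str],
    args_a: Sequence[Any],
    args_b: Sequence[Any],
    equal: Callable[[Any, Any], bool] = lambda a, b: a == b,
) -> Optional[str]:
    """Return the name of the first argument that differs, or None if all match."""
    for name, a, b in zip(names, args_a, args_b):
        if not equal(a, b):
            return name
    return None


# Toy stand-ins: identical features, different declared lengths.
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 frames x 2 mel bins
from_loader = (features, [3])  # what a data loader might pass (assumed)
from_manual = (features, [1])  # what the hand-written call passes

print(first_mismatch(["inputs", "input_lengths"], from_loader, from_manual))
# prints: input_lengths
```

With real tensors, the default `==` comparison is not suitable; a tensor-aware comparator such as `torch.equal` could be passed as `equal` instead.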

JaeungHyun commented 2 years ago

Solved it :)

rkskekzzz commented 2 years ago

Hello! Which trained model did you use? I am writing inference code with the rnn_transducer model, but the results are not coming out well, so I wanted to ask!

JaeungHyun commented 2 years ago

@rkskekzzz

outputs = model(feature.unsqueeze(0), torch.Tensor([feature.shape[0]]).to('cuda'))

Doing it this way gave the correct results.
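A possible reason the length argument matters, sketched in plain Python without the real model: many sequence encoders use the length tensor to decide which frames are real and which are padding. Assuming the openspeech model behaves this way (the thread does not confirm the internals), a hard-coded length of 1 would mark everything after the first frame as padding, while `feature.shape[0]` marks every frame as valid. The helper `valid_frame_mask` is hypothetical and exists only for illustration.

```python
from typing import List


def valid_frame_mask(num_frames: int, declared_length: int) -> List[bool]:
    """True for frames an encoder would treat as real, False for padding."""
    return [t < declared_length for t in range(num_frames)]


num_frames = 6  # stands in for feature.shape[0]

# Length hard-coded to 1, as in the original call: only frame 0 counts.
print(valid_frame_mask(num_frames, 1))

# Length taken from the feature, as in the fix: every frame counts.
print(valid_frame_mask(num_frames, num_frames))
```

The first mask has a single `True` followed by `False` values, and the second is all `True`, which matches the observation that passing the actual frame count produced correct results.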

openspeech-team / openspeech

inference code #162
