NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0
12.2k stars 2.54k forks

Sortformer Integration Release Inquiry #10491

Open caliber1313 opened 2 months ago

caliber1313 commented 2 months ago

Hello NeMo Team,

I’m just a student highly inspired by your work on speaker diarization. I’m particularly excited about Sortformer and its novel use of Sort Loss to address the permutation problem in speaker diarization.

I’m eager to use Sortformer in my at-home research and would like to ask if there is an expected release date for it within the NVIDIA NeMo framework.

Thank you for your time and amazing contributions.

tango4j commented 2 months ago

Hi, thank you for showing interest in our work. The model needs to go through an approval process and PR reviews. I am planning to release everything by the end of October 2024, but this could be delayed. I will reply to this thread as soon as we release the model. Thank you.

PhamDangNguyen commented 1 month ago

hi

1998karen commented 1 week ago

Hello! Thank you for your work! Are there any updates? Will the Sortformer code be published?

KeystoneScience commented 1 day ago

Hey, checking in on this as well. Are there any updates on the code release for experimentation?