pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
http://pyannote.github.io
MIT License

Paper at ICASSP? #219

Closed hbredin closed 5 years ago

hbredin commented 5 years ago

Dear pyannote.audio contributors (@yinruiqing @pkorshunov @V-assim @Mymoza @GregGovit @hadware @diego-fustes @MarvinLvn),

I am planning to write a paper describing pyannote.audio and submit it to ICASSP 2020.

As contributors to the project, it seems logical that you all appear as co-authors of the paper. Could you please let me know if that is OK with you?

The paper will be written on Overleaf and is currently living here: https://www.overleaf.com/read/vmgxwphkrsnh (read-only link)

If you would like to contribute to the paper in some way, please let me know. Any help will be appreciated. Submission deadline is October 21st, 2019.

Thanks again for your contribution to the project!

Hervé.

diego-fustes commented 5 years ago

Hi Hervé,

It would be a pleasure. A huge reward for my tiny contribution :)

My details are:

Name: Diego Fustes
Affiliation: Sutherland Global Services
Email address: diegofustesfic@gmail.com

Let me know when you get a first version of the paper and I'll review it and help as much as I can.

Kind regards

MarvinLvn commented 5 years ago

Hi everyone,

Awesome! Really glad that pyannote will have its own paper!

I think it might be a good idea to take this opportunity to improve the pyannote documentation. In particular, I think we should provide or improve the following points:

1) A high-level overview of the pipeline: how everything works, how each module is articulated with the others, etc.
2) More details about some mechanisms: I am thinking about user-defined callbacks, since I've been working on that, but I'm pretty sure many of you will have other ideas 🙂
3) More explanations about some tricks that have been implemented: sliding window at test time (see the sketch right after this list), learning-rate scheduler, data augmentation...
4) Some tips for users who might want to integrate their own code into pyannote (but that point goes well with the first one).
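As a generic illustration of the sliding-window point above (a sketch of the general idea, not pyannote.audio's actual implementation): score overlapping chunks, then average the per-frame scores wherever chunks overlap.

```python
# Generic sketch of sliding-window scoring at test time (not pyannote.audio's
# actual implementation): score overlapping chunks, then average the
# per-frame scores wherever chunks overlap.
import numpy as np

def sliding_window_scores(frames, model, window=200, step=100):
    """frames: (n_frames, n_features) array; model(chunk) returns one score per frame."""
    n_frames = len(frames)
    scores = np.zeros(n_frames)
    counts = np.zeros(n_frames)
    for start in range(0, max(n_frames - window, 0) + 1, step):
        chunk = frames[start:start + window]
        scores[start:start + len(chunk)] += model(chunk)
        counts[start:start + len(chunk)] += 1
    counts[counts == 0] = 1  # guard against division by zero on uncovered frames
    return scores / counts
```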

If each of us participates, we can write extensive documentation that maximizes users' understanding of the pipeline and provides a breeding ground for future pull requests.

I am aware that this does represent an extra workload, but if each of us makes a small effort, I think we can greatly improve this aspect of pyannote.

I'd be glad to hear (read) your thoughts about that.

Marvin

hadware commented 5 years ago

I'd be delighted (if I had known that building a yml file would get me a paper contribution, I'd do it more often :P).

I'd also be willing to contribute a bit more, namely doing a bit of "code-strengthening" and removing some quirks that I saw while working on other things for JSALT (mostly those flagged by PyCharm, let's be honest), for instance replacing the "old" properties with their decorated counterparts, which is more Pythonic.
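For illustration, this is the kind of change meant here; the class and attribute names below are made up, not actual pyannote.audio code:

```python
# Illustrative sketch only: names are hypothetical, not taken from pyannote.audio.

# "Old" style: wiring a getter through an explicit property() call
class OldExtractor:
    def __init__(self, sample_rate=16000):
        self._sample_rate = sample_rate

    def _get_sample_rate(self):
        return self._sample_rate

    sample_rate = property(_get_sample_rate)


# More Pythonic: the @property decorator does the same job
class Extractor:
    def __init__(self, sample_rate=16000):
        self._sample_rate = sample_rate

    @property
    def sample_rate(self):
        return self._sample_rate
```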

My references:
Name: Hadrien Titeux
Lab: CoML team, DEC, Université PSL (I actually don't know which one is the right one, I'll ask around)
Email: hadrien.titeux@ens.fr

wesbz commented 5 years ago

Hi,
Thank you for your email and for counting me in. I would be glad to help more if you need.
Regards,
Wassim B.

hbredin commented 5 years ago

Thanks for your enthusiastic replies!

@MarvinLvn, I agree that documentation really needs some love. I just opened issue #220 to continue the discussion.

@hadware Feel free to clean my code :)

@diego-fustes I will ping this thread once a good enough draft is ready.

pkorshunov commented 5 years ago

Hi, sorry for being a bit late to the party. I was thinking of creating a first version of CI. What do you think if I try to connect Travis to pyannote and make the build run there? Coveralls would also be good and would encourage writing some unit tests :) I think this would be a good start.
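As a rough sketch of the kind of unit test such a setup would run; the Segment API below is assumed from pyannote.core and may differ from the actual code:

```python
# Minimal pytest sketch of a unit test that Travis + coveralls could run.
# Assumes pyannote.core provides Segment(start, end) with a `duration` attribute
# and an `intersects()` method; treat these names as assumptions.
from pyannote.core import Segment


def test_segment_duration():
    segment = Segment(start=1.0, end=3.5)
    assert segment.duration == 2.5


def test_disjoint_segments_do_not_intersect():
    assert not Segment(0.0, 1.0).intersects(Segment(2.0, 3.0))
```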

hbredin commented 5 years ago

That would be great. Thanks @pkorshunov.

Do you have any idea how easy/difficult it would be to make CI update the online documentation for every new commit and/or new release?

pkorshunov commented 5 years ago

Do you mean the documentation generated by Sphinx from the source code? This would not be very difficult. The problem we had with our toolbox Bob was finding a place to host it. We used to host it on pythonhosted, but now the recommended way is to use https://readthedocs.org/

hadware commented 5 years ago

I concur. readthedocs (IIRC) automatically rebuilds the Sphinx docs each time you push to master, so there is no need to push the "rendered" docs each time you update them.
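For reference, this is roughly what a minimal Sphinx conf.py for a Read the Docs build looks like; the paths, extensions, and theme below are generic assumptions, not the project's actual configuration:

```python
# docs/conf.py -- minimal Sphinx configuration sketch for a Read the Docs build.
# Project name, paths, extensions, and theme are illustrative assumptions.
import os
import sys

# make the package importable so autodoc can find the source code
sys.path.insert(0, os.path.abspath(".."))

project = "pyannote.audio"
author = "pyannote.audio contributors"

extensions = [
    "sphinx.ext.autodoc",   # pull API documentation from docstrings
    "sphinx.ext.napoleon",  # support NumPy/Google-style docstrings
    "sphinx.ext.viewcode",  # link documentation pages to highlighted source
]

html_theme = "sphinx_rtd_theme"
```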

Mymoza commented 5 years ago

Oh, yes, I just saw this by browsing the repo.

Would be honored... :)

\address{$^{\star}$\'{E}cole de Technologie Sup\'{e}rieure, Universit\'{e} du Qu\'{e}bec, H3C 1K3, Montreal, QC, Canada \\
marie-philippe.gill.1@etsmtl.net}

hbredin commented 5 years ago

(adding @juanmc2005 to the thread, as a last-minute contributor)

The paper is now almost in its final version: https://www.overleaf.com/read/vmgxwphkrsnh

Please provide me with feedback (including typos, improvement suggestions, etc.) before October 15th as I plan to submit the paper on October 16th.

Regarding authorship, I plan to put the names of all past contributors in the ICASSP submission system. However, on the paper itself, I will only put the generic mention "pyannote.audio contributors" so that future contributors can still claim authorship (in their Google Scholar profiles, for instance).

(screenshot attached)

Also, if you have ideas for a better title, I'd be happy to hear about it.

hbredin commented 5 years ago

To get ready for the ICASSP paper submission, could you please fill in this form? (These are the details requested by the ICASSP submission system for all authors.) Thanks in advance.

hadware commented 5 years ago

Done! Thanks a lot.

As a side note: I'll do my part on pyannote once my head is a bit more above water, which is to say, when I'm more relaxed about the upcoming LREC deadline. (And I doubt what I'll do will influence the quality of the submission in any way, so...)

hbredin commented 5 years ago

I just submitted this paper.

hbredin commented 4 years ago

Accepted!

Here are the reviews

---- Comments from the Reviewers: ----
Importance/Relevance: Of sufficient interest
Comment on Importance/Relevance: 
This paper provides an analysis of an open-source tool for speaker diarization. Such a tool can be of interest to researchers who need to perform speaker diarization or obtain a competitive benchmark.

Novelty/Originality: Minor originality
Comment on Novelty/Originality: 
The tool described leverages existing algorithms for the building blocks of the implemented speaker diarization pipeline. Granted, the algorithms used represent the state of the art, but they are of minor originality.

Technical Correctness: Probably correct

Experimental Validation: Sufficient validation/theoretical paper
Comment on Experimental Validation: 
The experimentation is performed with state-of-the-art datasets and does include results for different configurations, which is good.

Clarity of Presentation: Very clear
Comment on Clarity of Presentation: 
Paper is very clear.

Reference to Prior Work: References adequate

-----
Importance/Relevance: Of limited interest
Comment on Importance/Relevance: 
This paper describes a toolkit for the various detection tasks required in a speaker diarization pipeline. The toolkit's implementation is based on neural networks.

Novelty/Originality: Moderately original
Comment on Novelty/Originality: 
The purpose of the paper is to present the toolkit and highlight some results that are comparable to the state of the art.

Technical Correctness: Probably correct

Experimental Validation: Limited but convincing

Clarity of Presentation: Clear enough

Reference to Prior Work: References adequate

-----
Importance/Relevance: Of broad interest

Novelty/Originality: Very original

Technical Correctness: Probably correct

Experimental Validation: Sufficient validation/theoretical paper

Clarity of Presentation: Very clear

Reference to Prior Work: References adequate

General Comments to Authors: 
The paper is well written, with solid numbers in most of the tasks. Can you also comment on / give details about the RTF (real-time factor) of the major steps? This can help readers better understand the pros and cons of the system.
BTW, do not forget to update the number for Table 4.
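For reference, the real-time factor the reviewer asks about is simply processing time divided by audio duration (RTF < 1 means faster than real time); a minimal sketch, where process and audio are placeholders for any pipeline step and its input:

```python
import time

def real_time_factor(process, audio, audio_duration):
    """Return processing time divided by audio duration (RTF < 1 is faster than real time)."""
    start = time.perf_counter()
    process(audio)  # run one pipeline step (placeholder callable) on the audio
    elapsed = time.perf_counter() - start
    return elapsed / audio_duration
```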