google / uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
https://arxiv.org/abs/1810.04719
Apache License 2.0
1.55k stars, 320 forks

add crp_alpha support #70

Closed. aluminumbox closed this 4 years ago.

aluminumbox commented 4 years ago

I've noticed that the training arguments accept a fixed value of crp_alpha, and that there have been issues requesting support for estimating crp_alpha.

I've added a script that takes the train_sequence and train_cluster_id loaded from './data/toy_training_data.npz', iterates over a search range, and returns the best crp_alpha value estimated from the training data.

The script is actually pretty simple. The steps are as follows:

  1. Iterate over a range of candidate alpha values
  2. Iterate over all training samples
  3. Calculate p(y|z) for each sample
  4. Return the alpha with the highest p(y|z)
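
For readers following the thread, the four steps above can be sketched roughly as below. This is not the code from this PR: the function names, the default grid bounds, and the exact form of p(y|z) are assumptions made for illustration. The likelihood here is a simplified CRP-style prior in which only speaker-change points contribute, a previously seen speaker is weighted by its number of contiguous blocks, and a new speaker is weighted by alpha.

```python
import math


def crp_log_likelihood(cluster_ids, alpha):
    """Log p(y | z) for one label sequence under a simplified CRP prior.

    Assumption (illustrative, not taken from the PR): only speaker-change
    points contribute. At a change, a previously seen speaker k is chosen
    with weight N_k (its number of contiguous blocks so far) and a new
    speaker with weight alpha; the speaker being left is excluded from
    the normalizer.
    """
    block_counts = {}  # speaker label -> number of contiguous blocks so far
    log_prob = 0.0
    prev = None
    for label in cluster_ids:
        if label == prev:
            continue  # same speaker continues; no change event
        if prev is not None:
            # Total block count of every speaker except the one being left.
            others = sum(n for spk, n in block_counts.items() if spk != prev)
            weight = block_counts.get(label, alpha)  # alpha if speaker is new
            log_prob += math.log(weight / (others + alpha))
        block_counts[label] = block_counts.get(label, 0) + 1
        prev = label
    return log_prob


def search_crp_alpha(train_cluster_ids, low=0.01, high=5.0, step=0.01):
    """Return the grid value of alpha with the highest total log-likelihood."""
    best_alpha, best_score = None, -math.inf
    num_steps = int(round((high - low) / step)) + 1
    for i in range(num_steps):
        alpha = low + i * step
        # Sum log-likelihoods over all training label sequences.
        score = sum(crp_log_likelihood(ids, alpha) for ids in train_cluster_ids)
        if score > best_score:
            best_alpha, best_score = alpha, score
    return best_alpha
```

As a quick check, calling search_crp_alpha([['A', 'B', 'A', 'B'], ['A', 'B', 'C', 'D']]) returns a grid value near 1.19, since under this simplified likelihood the analytic optimum for those two sequences is about 1.186. A brute-force grid like this is easy to verify but only as precise as the step size.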

I hope this script will help some people.

googlebot commented 4 years ago

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

:memo: Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


aluminumbox commented 4 years ago

@googlebot I signed it!

googlebot commented 4 years ago

CLAs look good, thanks!

wq2012 commented 4 years ago

@AnzCol Please take a look to see if it is correct. This is trying to address #4.

However, this PR is trying to do it via a brute-force search.

Also, this PR does not have any docstring or unit test for now.

wq2012 commented 4 years ago

@aluminumbox Thanks for your contribution.

Please see our updated community contribution guidelines here: https://github.com/google/uis-rnn/blob/master/CONTRIBUTING.md

Specifically, please:

  1. Move your submitted file to uisrnn/contrib.
  2. Add your contributor information to the submitted files.
  3. Add unit tests in tests/contrib.
  4. Make sure travis-ci passes.

Thanks.

codecov-io commented 4 years ago

Codecov Report

Merging #70 into master will increase coverage by 0.99%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master      #70      +/-   ##
==========================================
+ Coverage   90.42%   91.41%   +0.99%     
==========================================
  Files           7        7              
  Lines         449      501      +52     
==========================================
+ Hits          406      458      +52     
  Misses         43       43

Impacted Files                              Coverage Δ
uisrnn/contrib/range_search_crp_alpha.py    100% <100%> (ø)
uisrnn/__init__.py

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 822e9dd...26d0fbc.

wq2012 commented 4 years ago

@aluminumbox Thanks for your new changes. I left some more comments, but I think we are close.

Also, please be aware that if in the future someone wants to change the code you authored, I will add you as reviewer.

aluminumbox commented 4 years ago

> @aluminumbox Thanks for your new changes. I left some more comments, but I think we are close.
>
> Also, please be aware that if in the future someone wants to change the code you authored, I will add you as reviewer.

Thanks for the review. It really improved the code readability.