Open chadlagore opened 6 years ago
Hmm, for some reason the PR build is passing but the actual commit build is failing? 🤔
Yeah. The failure means that the model build is not deterministic, which may not be a problem in itself, but it affects the way the tests have to be written. I can run that test 100 times and view the output to be sure.
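For reference, a quick sketch like this hammers the test (the `test_phrases` node id is taken from the pytest output below):

```python
import subprocess

# Run the suspect test 100 times and count failures; any non-zero
# count confirms the non-determinism.
failures = sum(
    subprocess.run(["pytest", "test/test_minutes.py::test_phrases", "-q"]).returncode != 0
    for _ in range(100)
)
print(f"{failures}/100 runs failed")
```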
Posting a fix for this today!
```
~/git/minutes chad/126-prediction-output*
minutes-c2ZPuskd ❯ pytest test/test_minutes.py
=============================================================================================== test session starts ===============================================================================================
platform darwin -- Python 3.6.4, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
rootdir: /Users/chadlagore/git/minutes, inifile:
plugins: cov-2.5.1
collected 3 items

test/test_minutes.py ..F                                                                                                                                                                                    [100%]

==================================================================================================== FAILURES =====================================================================================================
__________________________________________________________________________________________________ test_phrases ___________________________________________________________________________________________________

    def test_phrases():
        for model_name in Minutes.parents:
            minutes = Minutes(parent=model_name)
            minutes.add_speaker(c.SPEAKER1)
            minutes.add_speaker(c.SPEAKER2)
            minutes.fit()

            # Predict new phrases (make sure we ony predict once per obs)
            conversation = Conversation(c.CONVERSATION_AUDIO, minutes)
            raw, _ = conversation.get_observations(**minutes.preprocessing_params)
            assert len(conversation.phrases) == len(raw)
            print(conversation.phrases)

            # Make sure we ony predicted on speaker 1 and 2.
            names = [p.speaker.name for p in conversation.phrases]
>           assert sorted(list(np.unique(names))) == ['speaker1', 'speaker2']
E           AssertionError: assert ['speaker2'] == ['speaker1', 'speaker2']
E             At index 0 diff: 'speaker2' != 'speaker1'
E             Right contains more items, first extra item: 'speaker2'
E             Use -v to get the full diff

test/test_minutes.py:37: AssertionError
---------------------------------------------------------------------------------------------- Captured stdout call -----------------------------------------------------------------------------------------------
[<minutes.conversation.Phrase object at 0x1201b7278>, <minutes.conversation.Phrase object at 0x1201b71d0>, <minutes.conversation.Phrase object at 0x1201b7978>, <minutes.conversation.Phrase object at 0x1201b7a58>, <minutes.conversation.Phrase object at 0x1201b7240>, <minutes.conversation.Phrase object at 0x1201b7940>, <minutes.conversation.Phrase object at 0x1201b7780>, <minutes.conversation.Phrase object at 0x1201b79e8>, <minutes.conversation.Phrase object at 0x1201b7a20>, <minutes.conversation.Phrase object at 0x1201b7898>]
======================================================================================= 1 failed, 2 passed in 7.35 seconds ========================================================================================

~/git/minutes chad/126-prediction-output* 10s
minutes-c2ZPuskd ❯ pytest test/test_minutes.py
=============================================================================================== test session starts ===============================================================================================
platform darwin -- Python 3.6.4, pytest-3.5.1, py-1.5.3, pluggy-0.6.0
rootdir: /Users/chadlagore/git/minutes, inifile:
plugins: cov-2.5.1
collected 3 items

test/test_minutes.py ...                                                                                                                                                                                    [100%]

============================================================================================ 3 passed in 5.44 seconds =============================================================================================
```
I've replicated this locally. It's as we expected: the test is non-deterministic because we train the model each time!
We provide our model with a `random_state` parameter; `_generate_training_data` uses it to do the train-test split.
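For context, the split presumably looks something like this (that it delegates to sklearn's `train_test_split` is my assumption, and the arrays here are stand-ins):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(100, 20)         # stand-in observations
y = np.random.randint(2, size=100)  # stand-in labels

# Only this split is pinned by random_state; Keras weight
# initialization is not affected by it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```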
Ideally it would lock down the Keras model as well, but it's not that simple: Keras draws on the Numpy seed and the Tensorflow seed, both of which are set globally:
```python
from numpy.random import seed
from tensorflow import set_random_seed

seed(random_state)             # Numpy seed, which Keras weight initializers draw from
set_random_seed(random_state)  # TensorFlow's graph-level seed
```
I think we keep using `random_state` to generate training data, then let the user set the global Numpy state if they want. This will cause some confusion if someone sets the random state of the `Minutes` model and expects some sort of stability in the answer, but setting the global seeds ourselves would make `Minutes` models non-thread safe 🤔. Instead, users can set `np.random.seed` and the `tf.set_random_seed` themselves if they want reproducibility. This seems like the simplest solution for now!
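Under that scheme, reproducible training on the user's side would look something like this sketch (`Minutes` and the `c.SPEAKER*` fixtures are as used in `test_minutes.py` above; the seed value is arbitrary, and indexing `Minutes.parents` assumes it is a sequence):

```python
import numpy as np
from tensorflow import set_random_seed

np.random.seed(42)    # locks Numpy randomness (Keras weight init)
set_random_seed(42)   # locks TensorFlow's graph-level randomness

minutes = Minutes(parent=Minutes.parents[0], random_state=42)  # seeded split
minutes.add_speaker(c.SPEAKER1)
minutes.add_speaker(c.SPEAKER2)
minutes.fit()  # repeated runs should now yield identical models
```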
## :construction_worker: Changes

## :flashlight: Testing Instructions
Follow README instructions.