Which generates a different seed at runtime when no default seed is provided.
I think I understand the rationality, it tries to achieve two things.
1) The results of the generator should be the same after saving it and loading (@samuelgarcia, thanks for adding tests for this!).
2) Every call to a generator should generate new traces if the seed is not specified. Let's call this variability by default.
I don't agree completly with 2. When less experienced users utilize let's stay generate_ground_truth_recording with default parameters I want the function to have deterministic behavior. This is better for debugging and error finding as when I run the function on my system it wiill be as similar as possible as their run. With the current behavior given by ensure_seed I will not be able to reproduce their exact results unless I have their specific seed that was generated at runtime. This can be extracted but it will be difficult for new users and it will create one more round of questions and requests when discussing an issue. A more severe version of this problem is presented if users find an error when they run generate_ground_truth_recording (because the specific seed was problematic), but then they run the function again and the error dissappear. This undermines their trust in their algorithm (they just know something is wrong but can't reproduce) and makes it hard for them to report the problem to us.
I think that @samuelgarcia made this decision from a power user point of views as this is more convenient and it makes sense semantically: NoiseGeneratorRecording should generate different noise everytime you call it (just like np.random.rand). But I think error reporting and debugging is an imporant use case of these functions which I think is better served by having a specific seed by default. This is better for less experienced users. Power users that want to actually generate variability are in a better position to know what a seed is, change it themselves and roll a function like ensure_seed by themselves if needed.
Two other minor advantages of having a seed by default instead of generating at runtime:
1) It is better for provenance, kwargs saves exactly what the user inputed instead of a value generated on the fly.
2) Less code in our library and easier to document the behavior.
I haved changed my view somehow and I think that most of the functions that interact with the user should actually have the behavior that @samuelgarcia implemented.
I did not have time to participate in the review of https://github.com/SpikeInterface/spikeinterface/pull/1948.
We added the following auxiliary function to generate:
https://github.com/SpikeInterface/spikeinterface/blob/3359647eb7a75a4f258590a46b3f665b7aac9d1d/src/spikeinterface/core/generate.py#L16-L24
Which generates a different seed at runtime when no default seed is provided.
I think I understand the rationality, it tries to achieve two things. 1) The results of the generator should be the same after saving it and loading (@samuelgarcia, thanks for adding tests for this!). 2) Every call to a generator should generate new traces if the seed is not specified. Let's call this variability by default.
I don't agree completly with 2. When less experienced users utilize let's stay
generate_ground_truth_recording
with default parameters I want the function to have deterministic behavior. This is better for debugging and error finding as when I run the function on my system it wiill be as similar as possible as their run. With the current behavior given byensure_seed
I will not be able to reproduce their exact results unless I have their specific seed that was generated at runtime. This can be extracted but it will be difficult for new users and it will create one more round of questions and requests when discussing an issue. A more severe version of this problem is presented if users find an error when they rungenerate_ground_truth_recording
(because the specific seed was problematic), but then they run the function again and the error dissappear. This undermines their trust in their algorithm (they just know something is wrong but can't reproduce) and makes it hard for them to report the problem to us.I think that @samuelgarcia made this decision from a power user point of views as this is more convenient and it makes sense semantically:
NoiseGeneratorRecording
should generate different noise everytime you call it (just likenp.random.rand
). But I think error reporting and debugging is an imporant use case of these functions which I think is better served by having a specific seed by default. This is better for less experienced users. Power users that want to actually generate variability are in a better position to know what a seed is, change it themselves and roll a function likeensure_seed
by themselves if needed.Two other minor advantages of having a seed by default instead of generating at runtime: 1) It is better for provenance,
kwargs
saves exactly what the user inputed instead of a value generated on the fly. 2) Less code in our library and easier to document the behavior.