Closed marcobortolami closed 1 year ago
Up to now I updated simulations.py
and noise.py
. The only difference with respect to what discussed in #255, I think, is that I removed the default also in Simulation.add_noise
in addition to the functions in noise.py
. I made this choice for a better consistency among the function/method calls that generate noise: in this way, whenever generating noise, the parameter random=...
must always be specified. Otherwise, only for Simulation.add_noise()
there would be the possibility not to specify a RNG. Feel free to change this:)
Now, these things remain to be done:
random_seed
value in the automatically generated section in the report. In my opinion it's better to keep track of the seed;None
) an integer random_seed
;I can take care of these modifications. If you want, in the meanwhile, comment the previous commits:)
I fixed most of the tests. Only the ones taking parameters from a toml file or from a dictionary are still failing, because random_seed
is a parameter that is needed in the constructor call even if it's present in the toml file or in the dictionary.
One way around is to pass it as None
in the constructor call, so it does not raise an error and it gets it from the toml file/dictionary later, but this may be confusing or shady.
One better solution is giving a default value to random_seed
that is "wrong", like an empty string, and check in the constructor just before init_random
whether random_seed
still has that value, after collecting parameters directly passed to the constructor, parameters in a dictionary or in a toml file. If this is the case, it means that random_seed
has not been passed and a message saying set random_seed to None or int
should appear.
What do you think?
Since the default random number generator of numpy
is actually PCG64 and since SeedSequence
accepts None
, I uniformed and simplified sim.init_random
. I did some tests and verified that:
None
the random numbers are different for two different ranks and for two different runs of the same scriptint
the random numbers are different for two different ranks but the same for two different runs of the same scriptWith the last commit I:
random_seed
instead of self.random_seed
);random_seed
that is ""
and just before init_random
I added a check to ensure that random_seed
is not still ""
. This was needed as, without a default value, it was not possible to pass random_seed
with a parameter file or with parameters
only;Time to pass to the examples in the tutorials and documentation:)
Documentation, benchmarks, tutorials and teaching material has been fixed
I modified test_simulation_random
: now it also checks that by passing an integer number as seed the results are reproducible, instead by passing None
they are not.
For me, this is basically ready. I still have few questions / changes that I list here. Once we answer these, for me we can merge.
litebird_sim
version. This is a breaking change as all the scripts and notebooks instantiating a Simulation object (99% of them?) will fail. I'm not sure how to change the frameworks' version (like 0.10.0 --> 0.11.0). Is it something that is done at the PR level, for example with a commit?random_seed
appears. How can I check the documentation as it appears from a browser before the PR is merged? Just to verify that the size of the pictures is ok.seed
to random_seed
. Although the usual name for a RNG seed is seed
, I made this change as, in my opinion, its usage is more clear. To be more precise, when someone uses a seed is usually when creating a RNG, like rng = np.random.random(seed)
. However, most of the times people will pass this parameter to the constructor of Simulation
, like sim = lbs.Simulation(random_seed=42)
. Here's my point: with random_
it's more clear what that mandatory parameter is used for (random numbers). I'm aware that it's not customary, so if you don't like this I can change this name everywhere (easy task with a grep
, just takes some time). I would also rename sim.random
in sim.rng
or sim.random_number_generator
, but maybe it's too much;)Hi @marcobortolami , awesome work! I checked this PR and everything looks very nice.
Here are a few answers to your questions:
Change litebird_sim version. This is a breaking change as all the scripts and notebooks instantiating a Simulation object (99% of them?) will fail. I'm not sure how to change the frameworks' version (like 0.10.0 --> 0.11.0). Is it something that is done at the PR level, for example with a commit?
No, this is usually tackled by a separate commit on the master
branch. (It's better to keep version bumps separated from PR doing actual work.) We'll release a new version once this PR is merged.
We follow semantic versioning, which means that until we hit version 1.0 we just bump the minor version every time we do a new release, without differentiating between breaking/non-breaking changes.
Check figures that has changed. Two pictures showing examples of reports in the documentation have been changed as now a new line writing the random_seed appears. How can I check the documentation as it appears from a browser before the PR is merged? Just to verify that the size of the pictures is ok.
Just type
make -C docs html
from the main directory. The -C docs
flag tells Make to run within the docs/
directory, and html
is the name of the target. Beware that you will likely need to install a few dependencies, most notably sphinx
. (If you're curious, instead of html
you can specify epub
, latexpdf
, …) Once the command completes, just run
firefox docs/build/html/index.html
(assuming you're using Firefox), and you'll be able to browse a local copy of the documentation. Every time you update the RST files, just re-run the make
command and refresh the page.
Names. When implementing this PR, I changed the name of the seed for the RNG from seed to random_seed. Although the usual name for a RNG seed is seed, I made this change as, in my opinion, its usage is more clear. To be more precise, when someone uses a seed is usually when creating a RNG, like rng = np.random.random(seed). However, most of the times people will pass this parameter to the constructor of Simulation, like sim = lbs.Simulation(randomseed=42). Here's my point: with random it's more clear what that mandatory parameter is used for (random numbers). I'm aware that it's not customary, so if you don't like this I can change this name everywhere (easy task with a grep, just takes some time). I would also rename sim.random in sim.rng or sim.random_number_generator, but maybe it's too much;)
I fully agree, random_seed
is much clearer to read and clearly states the intent of the parameter. I like it!
Thank you for your hard work!
Thanks for the semantic versioning reference:) we'll release a new version after the PR is merged, understood. I checked the documentation, thanks! Ok for the names:) I updated the changelog!
I'll wait for a thumb up to this comment to know when I can merge!
For me it's perfect! You can merge it whenever you want.
Thanks!
This PR has been created to fix the inconsistency of the noise generation pointed out in #255.