litebird / litebird_sim

Simulation tools for LiteBIRD
GNU General Public License v3.0
18 stars 13 forks source link

Fix noise seed inconsistency #256

Closed marcobortolami closed 1 year ago

marcobortolami commented 1 year ago

This PR has been created to fix the inconsistency of the noise generation pointed out in #255.

marcobortolami commented 1 year ago

Up to now I updated simulations.py and noise.py. The only difference with respect to what discussed in #255, I think, is that I removed the default also in Simulation.add_noise in addition to the functions in noise.py. I made this choice for a better consistency among the function/method calls that generate noise: in this way, whenever generating noise, the parameter random=... must always be specified. Otherwise, only for Simulation.add_noise() there would be the possibility not to specify a RNG. Feel free to change this:)

marcobortolami commented 1 year ago

Now, these things remain to be done:

I can take care of these modifications. If you want, in the meanwhile, comment the previous commits:)

marcobortolami commented 1 year ago

I fixed most of the tests. Only the ones taking parameters from a toml file or from a dictionary are still failing, because random_seed is a parameter that is needed in the constructor call even if it's present in the toml file or in the dictionary. One way around is to pass it as None in the constructor call, so it does not raise an error and it gets it from the toml file/dictionary later, but this may be confusing or shady. One better solution is giving a default value to random_seed that is "wrong", like an empty string, and check in the constructor just before init_random whether random_seed still has that value, after collecting parameters directly passed to the constructor, parameters in a dictionary or in a toml file. If this is the case, it means that random_seed has not been passed and a message saying set random_seed to None or int should appear. What do you think?

marcobortolami commented 1 year ago

Since the default random number generator of numpy is actually PCG64 and since SeedSequence accepts None, I uniformed and simplified sim.init_random. I did some tests and verified that:

marcobortolami commented 1 year ago

With the last commit I:

Time to pass to the examples in the tutorials and documentation:)

marcobortolami commented 1 year ago

Documentation, benchmarks, tutorials and teaching material has been fixed

marcobortolami commented 1 year ago

I modified test_simulation_random: now it also checks that by passing an integer number as seed the results are reproducible, instead by passing None they are not.

marcobortolami commented 1 year ago

For me, this is basically ready. I still have few questions / changes that I list here. Once we answer these, for me we can merge.

ziotom78 commented 1 year ago

Hi @marcobortolami , awesome work! I checked this PR and everything looks very nice.

Here are a few answers to your questions:

Change litebird_sim version. This is a breaking change as all the scripts and notebooks instantiating a Simulation object (99% of them?) will fail. I'm not sure how to change the frameworks' version (like 0.10.0 --> 0.11.0). Is it something that is done at the PR level, for example with a commit?

No, this is usually tackled by a separate commit on the master branch. (It's better to keep version bumps separated from PR doing actual work.) We'll release a new version once this PR is merged.

We follow semantic versioning, which means that until we hit version 1.0 we just bump the minor version every time we do a new release, without differentiating between breaking/non-breaking changes.

Check figures that has changed. Two pictures showing examples of reports in the documentation have been changed as now a new line writing the random_seed appears. How can I check the documentation as it appears from a browser before the PR is merged? Just to verify that the size of the pictures is ok.

Just type

make -C docs html

from the main directory. The -C docs flag tells Make to run within the docs/ directory, and html is the name of the target. Beware that you will likely need to install a few dependencies, most notably sphinx. (If you're curious, instead of html you can specify epub, latexpdf, …) Once the command completes, just run

firefox docs/build/html/index.html

(assuming you're using Firefox), and you'll be able to browse a local copy of the documentation. Every time you update the RST files, just re-run the make command and refresh the page.

Names. When implementing this PR, I changed the name of the seed for the RNG from seed to random_seed. Although the usual name for a RNG seed is seed, I made this change as, in my opinion, its usage is more clear. To be more precise, when someone uses a seed is usually when creating a RNG, like rng = np.random.random(seed). However, most of the times people will pass this parameter to the constructor of Simulation, like sim = lbs.Simulation(randomseed=42). Here's my point: with random it's more clear what that mandatory parameter is used for (random numbers). I'm aware that it's not customary, so if you don't like this I can change this name everywhere (easy task with a grep, just takes some time). I would also rename sim.random in sim.rng or sim.random_number_generator, but maybe it's too much;)

I fully agree, random_seed is much clearer to read and clearly states the intent of the parameter. I like it!

Thank you for your hard work!

marcobortolami commented 1 year ago

Thanks for the semantic versioning reference:) we'll release a new version after the PR is merged, understood. I checked the documentation, thanks! Ok for the names:) I updated the changelog!

I'll wait for a thumb up to this comment to know when I can merge!

ziotom78 commented 1 year ago

For me it's perfect! You can merge it whenever you want.

Thanks!