Open giuspugl opened 1 year ago
Hi @giuspugl,
it seems that so far we cannot produce serially a simulation with n_blocks_time>1 observations and few detectors .
I think that when we do not use MPI, so we run serially, we should use n_blocks_time=1
and n_blocks_det=1
.
i think for the sake of making the framework user friendly we should allow the user to simulate more than one observation serially
Do you mean that we should add this to the notebook and show people how to do that or do you mean that there is not such a possibility in LB sim? Maybe the code blocks below show what you are looking for (cell numbers refer to the ones here).
Cell 6: create two observations
# creating TWO observations
sim.create_observations(
num_of_obs_per_detector=2,
detectors=dets,
n_blocks_det=1,
n_blocks_time=1, # blocks different from one if parallelizing
)
Cell 10: select which observation to fill using scan_map_in_observations
instead of fill_tods
(*)
lbs.scan_map_in_observations(sim.observations[0],
maps,
)
Cell 11: verification that only the first simulation's TODs were filled
times_min = (sim.observations[0].get_times()-sim.observations[0].start_time.cxcsec)/60.
plt.figure(figsize=(14, 8))
plt.plot(times_min,sim.observations[0].tod[0])
plt.plot(times_min,sim.observations[0].tod[1])
plt.plot(times_min,sim.observations[0].tod[2])
plt.plot(times_min,sim.observations[0].tod[3])
times_min = (sim.observations[1].get_times()-sim.observations[1].start_time.cxcsec)/60.
plt.figure(figsize=(14, 8))
plt.plot(times_min,sim.observations[1].tod[0])
plt.plot(times_min,sim.observations[1].tod[1])
plt.plot(times_min,sim.observations[1].tod[2])
plt.plot(times_min,sim.observations[1].tod[3])
Then, you can add other components also to specific tod fields, e.g.
sim.observations[1].tod_dip = np.zeros_like(sim.observations[1].tod)
orbit = lbs.SpacecraftOrbit(sim.observations[1].start_time)
#spacecraft position and velocity
pos_vel = lbs.spacecraft_pos_and_vel(orbit,
sim.observations[1],
delta_time_s=86400.0)
#add dipole to obs_multitod; dipole type is TOTAL_FROM_LIN_T, read the doc for more info
lbs.add_dipole_to_observations(obs=sim.observations[1],
pos_and_vel=pos_vel,
dipole_type=lbs.DipoleType.TOTAL_FROM_LIN_T,
component='tod_dip')
print(sim.observations[1].tod)
print(sim.observations[1].tod_dip)
OUT:
[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]
[[ 0.0014382 0.00143838 0.00143857 ... -0.00277029 -0.00276977
-0.00276924]
[ 0.00121285 0.00121308 0.00121331 ... -0.00284975 -0.00284929
-0.00284883]
[ 0.00118937 0.00118961 0.00118986 ... -0.00288474 -0.00288429
-0.00288384]
[ 0.00116545 0.00116571 0.00116597 ... -0.00291877 -0.00291833
-0.0029179 ]]
Please let me know if you were looking for something like this and if you would like to add a similar example to the notebook. I run the code on my pc without using MPI.
(*) by using fill_tods
both the observations' TODs are filled, it seems in a different way... I'm not sure about what is happening...
Thanks a lot @marcobortolami ! this is what i really needed to clarify! no need to update the notebooks !
Hi @giuspugl.
We chatted a bit about this and decided to leave the issue open until we read and maybe improve the documentation regarding the topics connected to it. I think we should have a look here and here (so, also here and here(*)).
As far as I understood, if you set n_blocks_det
and n_blocks_time
to 1, all the TODs will be processed by 1 MPI rank and you go for a serial application. If you set n_blocks_det
and/or n_blocks_time
to a value larger than 1, you divide the detectors and/or samples by the number that you set, having a grid where the different parts of the grid are handled by different tasks, e.g.:
num_of_obs_per_detector=1
n_blocks_det=2
n_blocks_time=3
OBS 1 (all samples)
time -->
________________________
d | | | |
e | rank0 | rank1 | rank2 |
t |_______|_______|_______|
| | | | |
| | rank3 | rank4 | rank5 |
V |_______|_______|_______|
If you do this while having num_of_obs_per_detector=1
, you have 1 observation only taking care of all the detectors and samples. However, if you set num_of_obs_per_detector
to a value larger than 1, you will have more than 1 observation, each one taking care of num_of_samples / num_of_obs_per_detector
samples, right? And each observation will have tha same grid structure, right? e.g.:
num_of_obs_per_detector=2
n_blocks_det=2
n_blocks_time=3
OBS 1 (first half of samples)
time -->
________________________
d | | | |
e | rank0 | rank1 | rank2 |
t |_______|_______|_______|
| | | | |
| | rank3 | rank4 | rank5 |
V |_______|_______|_______|
OBS 2 (second half of samples)
time -->
________________________
d | | | |
e | rank6 | rank7 | rank8 |
t |_______|_______|_______|
| | | | |
| | rank9 | rank10| rank11|
V |_______|_______|_______|
Is all of this correct? In particular the rank distribution in the last example?
(*) I think the upper right part of the first picture is wrong as it should display n_blocks_det=1
and n_blocks_time=2
, right?
Hi @marcobortolami , thanks for the detailed posting! Let's not forget the documentation here: https://litebird-sim.readthedocs.io/en/master/observations.html
The following plot is used in the documentation to explain the meaning of a few parameters:
I believe that a useful addition to the code would be to include a schematic diagram of the way the TOD has been distributed among the MPI processes. It's one of my oldest dreams!
Hi Maurizio. Of course, that is the plot I was referring to with the star :)
(*) I think the upper right part of the first picture is wrong as it should display n_blocks_det=1 and n_blocks_time=2, right?
The thing I am unsure about is the rank distribution when num_of_obs_per_detector
is larger than 1. Should you double the number of ranks as each observation, taking care of half of the samples, has its own ranks distributed like in the picture above? (see my previous comment)
I believe that a useful addition to the code would be to include a schematic diagram of the way the TOD has been distributed among the MPI processes. It's one of my oldest dreams!
Yes, I think a schematic diagram abuot TOD distribution would be a nice addition as parallel applications will be necessary for producing simulations involving an adequate number of detectors. Do you mean improving the documentation or also adding that to the code? Hasn't this alredy been added here?
Hello, it seems that so far we cannot produce serially a simulation with
n_blocks_time>1
observations and few detectors . Please have a look at cell 5 and 6 of https://github.com/litebird/litebird_sim/blob/master/notebooks/litebird_sim_quick_simulation.ipynbi think for the sake of making the framework user friendly we should allow the user to simulate more than one observation serially and leave the parallelization as an user keyword that can be set for production runs.
what do you think ?