hpc4cmb / toast

Time Ordered Astrophysics Scalable Tools

Noise simulations will have discontinuities at data distribution boundaries. #7

Closed tskisner closed 6 years ago

tskisner commented 8 years ago

Because the current noise sims have each process simulate its piece of the timestream independently, there will be discontinuities at the process boundaries. The solution is one of: redistribute the data so that one process has the whole timestream, have every process simulate the whole dataset up to its local start sample, or use a Python wrapper around fftw-mpi.
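To make the second option concrete, here is a minimal sketch. It does not use the TOAST noise code; it swaps in a simple AR(1) recursion as a stand-in for correlated noise, and the names `ar1_noise`, `local_noise`, `phi`, and `sigma` are hypothetical. The idea is that every process replays the generator from sample 0 with a shared seed and keeps only its own range, so the pieces join without a discontinuity.

```python
import random


def ar1_noise(n_samples, seed, phi=0.95, sigma=1.0):
    """Generate AR(1) noise: x[i] = phi * x[i-1] + gaussian draw.

    Stand-in for a correlated noise stream (assumption, not the TOAST model).
    """
    rng = random.Random(seed)
    samples = []
    prev = 0.0
    for _ in range(n_samples):
        prev = phi * prev + rng.gauss(0.0, sigma)
        samples.append(prev)
    return samples


def local_noise(first, n_local, seed, phi=0.95, sigma=1.0):
    """Simulate from sample 0 through the end of the local range using the
    shared seed, then keep only the local samples."""
    return ar1_noise(first + n_local, seed, phi, sigma)[first:]


# Four hypothetical "processes", each owning 256 samples of a 1024-sample stream.
n_total, n_procs, seed = 1024, 4, 1234
n_local = n_total // n_procs
pieces = [local_noise(p * n_local, n_local, seed) for p in range(n_procs)]

# The stitched pieces reproduce the single-process stream exactly.
stitched = [x for piece in pieces for x in piece]
assert stitched == ar1_noise(n_total, seed)
```

The cost is redundant work: the last process simulates the entire stream, which is why this only makes sense when the generation is cheap relative to everything else.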

jdborrill commented 8 years ago

This ties back to the other note, I assume.

Aren't we distributing data in chunks of stationary intervals to avoid this problem?

Julian

tskisner commented 8 years ago

We can chat more in person, but for Planck we use one observation for the whole mission and chop it into tiny pieces, so discontinuities at those boundaries might not be a problem. For other experiments we might have multiple observations, each with a single stationary interval. To simulate a contiguous piece of noise data for each channel in an observation, we have to use all the processes in our communicator effectively. That means either using all processes for each channel in turn (with fftw-mpi, for example), or redistributing the data so that each process has only a couple of channels and does the whole time span.
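A rough sketch of the second approach (redistributing by channel), purely for illustration. `distribute_detectors` is a hypothetical helper, not a TOAST function: it splits the channel list into contiguous, nearly equal blocks, one per process, so that each process can simulate the full time span for its own channels.

```python
def distribute_detectors(detectors, n_procs):
    """Split a detector list into n_procs contiguous, nearly equal blocks.

    The first (len(detectors) % n_procs) processes get one extra detector.
    """
    base, extra = divmod(len(detectors), n_procs)
    blocks = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        blocks.append(detectors[start:start + size])
        start += size
    return blocks


# Five hypothetical channels over two processes.
blocks = distribute_detectors(["d0", "d1", "d2", "d3", "d4"], 2)
assert blocks == [["d0", "d1", "d2"], ["d3", "d4"]]
```

With this layout no process boundary falls inside a channel's timestream, at the price of each process needing enough memory for the whole time span of its channels.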

tskisner commented 7 years ago

For short noise simulations, we have worked around this by distributing over detector. Simultaneous distribution by both time and detector was added in 5568fa457af4efcdc649b167db4405040e255f11. The longer-term solution to support very long noise simulations will likely be one of:

tskisner commented 6 years ago

This is no longer an issue in practice. Closing now, and we can open new issues to address specific types of hybrid noise sims in the future.