RadioAstronomySoftwareGroup / pyuvsim

An ultra-high precision package for simulating radio interferometers in Python on compute clusters.
https://pyuvsim.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Slower pyuvsim with the newest pyuvdata #372

Closed. JianrongTan closed this issue 2 years ago

JianrongTan commented 2 years ago

I found that pyuvsim ran slower with the newest version of pyuvdata. I ran two simulations with different versions of pyuvdata (2.2.1 vs 2.2.4).

For the one with 2.2.1, I have the output:

UVData initialization took 0.056 min
Skymodel setup took 0.000 min
Nbls: 1378
Ntimes: 60
Nfreqs: 101
Nsrcs: 15272
Tasks:  8350680.0
1.00% completed. 0:02:44.880095  elapsed. 4:32:02.695038 remaining. 

2.00% completed. 0:05:24.837237  elapsed. 4:25:16.402239 remaining. 

3.00% completed. 0:08:14.631924  elapsed. 4:26:32.532908 remaining. 

4.00% completed. 0:10:55.782791  elapsed. 4:22:18.109679 remaining. 

5.00% completed. 0:13:39.862219  elapsed. 4:19:36.832374 remaining. 

6.00% completed. 0:16:21.782866  elapsed. 4:16:20.441954 remaining. 

7.00% completed. 0:19:10.757582  elapsed. 4:14:47.894025 remaining. 

8.00% completed. 0:22:00.462444  elapsed. 4:13:04.586802 remaining. 

Here we see that the whole simulation needs about 4.5 hours to finish.
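For reference, the numbers in the log above are consistent with the task count being Nbls × Ntimes × Nfreqs, and with the "remaining" column being a linear extrapolation of the elapsed time. The sketch below checks both. The extrapolation formula is an assumption inferred from the printed values, not taken from the pyuvsim source, and `estimate_remaining` is a hypothetical helper name.

```python
from datetime import timedelta

# Values copied from the log output above.
nbls, ntimes, nfreqs = 1378, 60, 101
tasks = nbls * ntimes * nfreqs
print(tasks)  # 8350680, matching the "Tasks" line


def estimate_remaining(elapsed: timedelta, fraction_done: float) -> timedelta:
    """Assumed model: remaining work proceeds at the average rate seen so far."""
    projected_total = elapsed / fraction_done
    return projected_total - elapsed


# Elapsed time at the 1.00% mark of the pyuvdata 2.2.1 run.
elapsed = timedelta(minutes=2, seconds=44.880095)
remaining = estimate_remaining(elapsed, 0.01)
print(remaining)  # roughly four and a half hours
```

The estimate lands within a second of the logged value, which suggests the progress report is a simple extrapolation and will swing with any early fluctuation in the task rate.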

For the one with 2.2.4, I have the output:

UVData initialization took 0.061 min
Skymodel setup took 0.000 min
Nbls: 1378
Ntimes: 60
Nfreqs: 101
Nsrcs: 15003
Tasks:  8350680.0
1.00% completed. 0:04:32.463676  elapsed. 7:29:33.512418 remaining. 

2.00% completed. 0:08:59.811762  elapsed. 7:20:50.065267 remaining. 

3.00% completed. 0:13:26.699730  elapsed. 7:14:41.831585 remaining. 

4.00% completed. 0:17:58.761335  elapsed. 7:11:28.673494 remaining. 

5.00% completed. 0:22:14.718190  elapsed. 7:02:38.047375 remaining. 

It takes ~7.5 hours!

I am not sure why there is such a large difference in running time, since the only difference is the version of pyuvdata I used when generating the conda environments. I wonder if anyone can help with this.
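One way to confirm that two environments really differ only in pyuvdata is to record the installed versions in each and compare them. This is a minimal sketch using the standard-library `importlib.metadata`; the package list is an assumption and should be adjusted to whatever the environments actually contain, and `installed_versions` and `diff_versions` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def diff_versions(env_a, env_b):
    """Return {package: (version_in_a, version_in_b)} for packages that differ."""
    names = sorted(set(env_a) | set(env_b))
    return {
        name: (env_a.get(name), env_b.get(name))
        for name in names
        if env_a.get(name) != env_b.get(name)
    }


# Run this in each environment and save the printed dictionary.
# The package list here is illustrative, not exhaustive.
print(installed_versions(["pyuvsim", "pyuvdata", "numpy", "astropy", "mpi4py"]))
```

Passing the two saved dictionaries to `diff_versions` should then list only `pyuvdata` if nothing else changed when the environments were solved.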

mkolopanis commented 2 years ago

Naively I would expect the specific version of pyuvdata to not matter for timings since the only thing that happens during the simulation itself is an index into the data_array. But obviously you felt that too or you wouldn't make this issue :laughing:

I think with some more information about the run environment we might be able to help more. Profiling and timing can be tricky. Are you running these on a machine you control? Were there similar background loads both times? Or was this on a cluster with identical configurations? Also, can you consistently reproduce this discrepancy between the pyuvdata versions?

JianrongTan commented 2 years ago

I am running both on the HERA cluster. The settings are the same:

#PBS -q hera
#PBS -l nodes=4:ppn=4
#PBS -l pmem=2gb,pvmem=12gb

mkolopanis commented 2 years ago

A note: interestingly, these are not identical simulations. The source counts differ: 15272 in the first and 15003 in the second.

JianrongTan commented 2 years ago

I think they are very close (15272 vs 15003), and I wouldn't expect that to make the running time differ by 67%.
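To put a number on the slowdown without relying on the rounded totals, one can compare the elapsed times at the same progress mark in the two logs. A minimal sketch follows, with the two 5% lines copied from the outputs above; the regular expression and the helper names `parse_progress` and `slowdown` are assumptions made for illustration.

```python
import re
from datetime import timedelta

# Matches lines such as "5.00% completed. 0:13:39.862219  elapsed. ..."
PROGRESS = re.compile(
    r"(?P<pct>\d+(?:\.\d+)?)% completed\. "
    r"(?P<h>\d+):(?P<m>\d+):(?P<s>\d+(?:\.\d+)?)\s+elapsed"
)


def parse_progress(line):
    """Return (fraction_done, elapsed) parsed from one progress line."""
    match = PROGRESS.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    elapsed = timedelta(
        hours=int(match["h"]), minutes=int(match["m"]), seconds=float(match["s"])
    )
    return float(match["pct"]) / 100.0, elapsed


def slowdown(line_fast, line_slow):
    """Ratio of projected total run times (slow / fast)."""
    frac_fast, t_fast = parse_progress(line_fast)
    frac_slow, t_slow = parse_progress(line_slow)
    # Normalise by the fraction done so lines at different marks stay comparable.
    return (t_slow / frac_slow) / (t_fast / frac_fast)


run_221 = "5.00% completed. 0:13:39.862219  elapsed. 4:19:36.832374 remaining."
run_224 = "5.00% completed. 0:22:14.718190  elapsed. 7:02:38.047375 remaining."
ratio = slowdown(run_221, run_224)
print(f"{ratio:.2f}")  # about 1.63
```

By this measure the 2.2.4 run was about 63% slower at the 5% mark, close to the 67% implied by the rounded 4.5 and 7.5 hour estimates, while the source counts differ by less than 2%.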

mkolopanis commented 2 years ago

Yes, they are ridiculously close; I'm just being picky. Would you mind providing your setup yaml and links to your beam/config/source list?

JianrongTan commented 2 years ago

I ran more simulations with exactly the same parameters in both environments and found I couldn't reliably reproduce the issue.

UVData initialization took 0.050 min
Skymodel setup took 0.000 min
Nbls: 1378
Ntimes: 60
Nfreqs: 101
Nsrcs: 15003
Tasks:  8350680.0
1.00% completed. 0:01:50.626471  elapsed. 3:02:31.994091 remaining. 

2.00% completed. 0:03:40.069162  elapsed. 2:59:43.099041 remaining. 

3.00% completed. 0:05:29.836409  elapsed. 2:57:44.289263 remaining.

Here it takes just 3 hours instead of the 7.5 hours above!

So I think it may just be a timing problem that is too tricky to pin down. The running times of most simulations I tested scale as expected. I am almost fine with it now :slightly_smiling_face:. Thanks @mkolopanis for looking into this!

bhazelton commented 2 years ago

@JianrongTan sounds like we should close this issue for now, but if you see simulations slow way down again please let us know!