RadioAstronomySoftwareGroup / pyuvsim

A ultra-high precision package for simulating radio interferometers in python on compute clusters.
https://pyuvsim.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
43 stars 7 forks source link

Fix uvdata ordering #371

Closed mkolopanis closed 2 years ago

mkolopanis commented 3 years ago

Forces time-baseline ordering before a simulation.

Motivation and Context

time-baseline ordering was assumed in the task index creation, this way we can enforce it. Attempts to re-order back if the ordering was well defined before the simulation. closes #370

Types of changes

Checklist:

For all pull requests:

Bug fix checklist:

mkolopanis commented 3 years ago

@bhazelton @jpober I mentioned this in slack but wanted to get it tagged to the PR too.

This ordering issue is bringing up something interesting too. Comparing with a reference simulation file is part of our unit tests. Which should be fine, but if you move through this output you'll notice the reference uvdata object is ordered in (time, antenna1) format. The reference is on the RHS of this comparison.

The "problem" is line with this line on the task index making blti = bl_i + time_i * Nbls # baseline is the fast axis it's assuming that the blt index is time, baseline ordering.

The task does then go lookup what the exact baseline is with this index and computes it, so I think that the output data shouldn't matter right? Just happens to be in a different order than what it thinks. I guess this is interesting because I actually cannot tell if there is an issue at the heart of this or not.

pytest output ```bash (py38_sim) matthew@bahamut [9:23:18] [~/src/pyuvsim] [fix_uvdata_ordering *] -> % pytest -x --ff ========================================== test session starts ========================================== platform linux -- Python 3.8.8, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 rootdir: /home/matthew/work/src/pyuvsim, configfile: setup.cfg plugins: cov-2.11.1 collected 168 items run-last-failure: rerun previous 1 failure first ​ pyuvsim/tests/test_run.py F ​ =============================================== FAILURES ================================================ ________________________ test_run_paramfile_uvsim[param_1time_1src_testcat.yaml] ________________________ ​ — — — — — — — — — — — — — — — — — — — — — — MPI Rank = 0 — — — — — — — — — — — — — — — — — — — — — — — ​ goto_tempdir = '/tmp/pytest-of-matthew/pytest-18/test_run_paramfile_uvsim_param0' paramfile = 'param_1time_1src_testcat.yaml' ​ @pytest.mark.parametrize('paramfile', ['param_1time_1src_testcat.yaml', 'param_1time_1src_testvot.yaml']) @pytest.mark.parallel(2) def test_run_paramfile_uvsim(goto_tempdir, paramfile): # Test vot and txt catalogs for parameter simulation # Compare to reference files. uv_ref = UVData() uv_ref.read_uvfits(os.path.join(SIM_DATA_PATH, 'testfile_singlesource.uvfits')) uv_ref.unphase_to_drift(use_ant_pos=True) # set the x_orientation uv_ref.x_orientation = "east" param_filename = os.path.join(SIM_DATA_PATH, 'test_config', paramfile) # This test obsparam file has "single_source.txt" as its catalog. pyuvsim.uvsim.run_uvsim(param_filename) # Loading the file and comparing is only done on rank 0. if pyuvsim.mpi.rank != 0: return path = goto_tempdir ofilepath = os.path.join(path, 'tempfile.uvfits') uv_new = UVData() with uvtest.check_warnings(UserWarning, match='antenna_diameters is not set'): uv_new.read_uvfits(ofilepath) uv_new.unphase_to_drift(use_ant_pos=True) # Reset parts that will deviate uv_new.history = uv_ref.history uv_new.object_name = uv_ref.object_name # remove filename attribute to ensure equality uv_new.filename = None uv_ref.filename = None > assert uv_new == uv_ref E assert == E Use -v to get the full diff ​ /home/matthew/work/src/pyuvsim/pyuvsim/tests/test_run.py:75: AssertionError ----------------------------------------------- MPI Info ------------------------------------------------ Run with 2 processes. Failed with message: AssertionError: Parallelized test test_run_paramfile_uvsim[param_1time_1src_testcat.yaml] failed. ----------------------------------------- Captured stdout call ------------------------------------------ UVData initialization took 0.001 min Skymodel setup took 0.000 min Nbls: 10 Ntimes: 1 Nfreqs: 1 Nsrcs: 2 Tasks: 10.0 10.00% completed. 0:00:00.060977 elapsed. 0:00:00.548789 remaining. ​ 20.00% completed. 0:00:00.066826 elapsed. 0:00:00.267305 remaining. ​ 30.00% completed. 0:00:00.068160 elapsed. 0:00:00.159040 remaining. ​ 40.00% completed. 0:00:00.072642 elapsed. 0:00:00.108963 remaining. ​ 50.00% completed. 0:00:00.074091 elapsed. 0:00:00.074091 remaining. ​ 60.00% completed. 0:00:00.074175 elapsed. 0:00:00.049450 remaining. ​ 70.00% completed. 0:00:00.114533 elapsed. 0:00:00.049086 remaining. ​ 80.00% completed. 0:00:00.115941 elapsed. 0:00:00.028985 remaining. ​ 90.00% completed. 0:00:00.117237 elapsed. 0:00:00.013026 remaining. ​ 100.00% completed. 0:00:00.118417 elapsed. 0:00:00 remaining. ​ Calculations Complete. Run uvdata uvsim took 0.004 min Outfile path: ./tempfile.uvfits The data are in drift mode and do not have a defined phase center. Phasing to zenith of the first timestamp. ant_1_array parameter value is array, values are not close parameter _ant_1_array does not match. Left is [0 1 1 2 2 2 3 3 3 3], right is [0 1 2 3 1 2 3 2 3 3]. ant_2_array parameter value is array, values are not close parameter _ant_2_array does not match. Left is [0 0 1 0 1 2 0 1 2 3], right is [0 0 0 0 1 1 1 2 2 3]. baseline_array parameter value is array, values are not close parameter _baseline_array does not match. Left is [67585 69633 69634 71681 71682 71683 73729 73730 73731 73732], right is [67585 69633 71681 73729 69634 71682 73730 71683 73731 73732]. data_array parameter value is array, values are not close parameter _data_array does not match. Left is [[[[ 5.4355542e-04+0.0000000e+00j 1.1365576e-04+0.0000000e+00j 2.0422145e-04-1.4448342e-05j 2.0422145e-04+1.4448342e-05j]]] ​ ​ [[[-1.6006181e-02+3.0242475e-03j -1.9316647e-03-2.8275780e-03j -6.5693995e-03+1.3943985e-03j 2.3462791e-03+9.6373906e-04j]]] ​ ​ [[[ 5.0000000e-01+0.0000000e+00j 5.0000000e-01+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]] ​ ​ [[[ 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]] ​ ​ [[[ 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]] ​ ​ [[[ 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]] ​ ​ [[[ 3.2403736e-05+5.2813739e-05j -7.4395084e-06+1.0692345e-05j 1.3843940e-05+2.1469103e-05j 2.5237992e-07-9.6451031e-06j]]] ​ ​ [[[-6.7636435e-04-1.7775858e-03j -6.7636435e-04-1.7775858e-03j 0.0000000e+00-0.0000000e+00j 0.0000000e+00-0.0000000e+00j]]] ​ ​ [[[ 0.0000000e+00-0.0000000e+00j 0.0000000e+00-0.0000000e+00j 0.0000000e+00-0.0000000e+00j 0.0000000e+00-0.0000000e+00j]]] ​ ​ [[[ 7.2345611e-06+0.0000000e+00j 7.2345611e-06+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]]], right is [[[[ 5.4355542e-04+0.0000000e+00j 1.1365576e-04+0.0000000e+00j 2.0422145e-04-1.4448342e-05j 2.0422145e-04+1.4448342e-05j]]] ​ ​ [[[-1.6006181e-02+3.0242472e-03j -1.9316645e-03-2.8275782e-03j -6.5693995e-03+1.3943984e-03j 2.3462791e-03+9.6373918e-04j]]] ​ ​ [[[ 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]] ​ ​ [[[ 3.2403615e-05+5.2813812e-05j -7.4395334e-06+1.0692330e-05j 1.3843892e-05+2.1469134e-05j 2.5240138e-07-9.6451022e-06j]]] ​ ​ [[[ 5.0000000e-01+0.0000000e+00j 5.0000000e-01+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]] ​ ​ [[[ 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]] ​ ​ [[[-6.7637756e-04-1.7775811e-03j -6.7637756e-04-1.7775811e-03j 0.0000000e+00-0.0000000e+00j 0.0000000e+00-0.0000000e+00j]]] ​ ​ [[[ 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]] ​ ​ [[[ 0.0000000e+00-0.0000000e+00j 0.0000000e+00-0.0000000e+00j 0.0000000e+00-0.0000000e+00j 0.0000000e+00-0.0000000e+00j]]] ​ ​ [[[ 7.2345615e-06+0.0000000e+00j 7.2345615e-06+0.0000000e+00j 0.0000000e+00+0.0000000e+00j 0.0000000e+00+0.0000000e+00j]]]]. uvw_array parameter value is array, values are not close parameter _uvw_array does not match. Left is [[ 0.00000000e+00 0.00000000e+00 0.00000000e+00] [-2.80000000e+01 -8.06308353e-11 -1.35682576e-10] [ 0.00000000e+00 0.00000000e+00 0.00000000e+00] [-8.20110000e+00 -1.97989000e+01 9.04459378e-11] [ 1.97989000e+01 -1.97989000e+01 2.26128514e-10] [ 0.00000000e+00 0.00000000e+00 0.00000000e+00] [-8.20110000e+00 1.24192415e-11 1.97989000e+01] [ 1.97989000e+01 9.30500768e-11 1.97989000e+01] [ 1.68975944e-11 1.97989000e+01 1.97989000e+01] [ 0.00000000e+00 0.00000000e+00 0.00000000e+00]], right is [[ 0.00000000e+00 0.00000000e+00 0.00000000e+00] [-2.80000000e+01 -8.06303913e-11 -1.35681688e-10] [-8.20110000e+00 -1.97989000e+01 9.04459378e-11] [-8.20110000e+00 1.24192415e-11 1.97989000e+01] [ 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 1.97989000e+01 -1.97989000e+01 2.26127626e-10] [ 1.97989000e+01 9.30496327e-11 1.97989000e+01] [ 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 1.68980385e-11 1.97989000e+01 1.97989000e+01] [ 0.00000000e+00 0.00000000e+00 0.00000000e+00]]. blt_order is None on right, but not left parameter _blt_order does not match. Left is ('time', 'baseline'), right is None. ======================================== short test summary info ======================================== FAILED pyuvsim/tests/test_run.py::test_run_paramfile_uvsim[param_1time_1src_testcat.yaml] - assert
mkolopanis commented 3 years ago

I think I'm coming to decide there is no actual issue with the simulation since we look up the baseline for each index, just the data was not in the same order it the code thought it might be in.

Something else has come up though, in order to be able to call reorder_blts I needed to have metadata only uvdata objects on all the non rank-0 nodes. I changed the init_uvdata_from_? functions to return metadata only objects. This does add a memory overhead for each node though now.

codecov[bot] commented 3 years ago

Codecov Report

Merging #371 (26fc3e8) into main (1ef8cdb) will increase coverage by 0.00%. The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #371   +/-   ##
=======================================
  Coverage   99.08%   99.09%           
=======================================
  Files          13       13           
  Lines        1963     1980   +17     
=======================================
+ Hits         1945     1962   +17     
  Misses         18       18           
Impacted Files Coverage Δ
pyuvsim/simsetup.py 99.88% <100.00%> (+<0.01%) :arrow_up:
pyuvsim/uvsim.py 98.53% <100.00%> (+0.07%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1ef8cdb...26fc3e8. Read the comment docs.

bhazelton commented 3 years ago

Ok, if the data are correct then I'm not fussed about the ordering. But I thought @steven-murray had seen instances when the data were not correct.

mkolopanis commented 3 years ago

Oh maybe I misread it, but then if they are incorrect that's an issue :thinking: I would think they would be correct now either way since we're forcing the time, baselines ordering right?

steven-murray commented 3 years ago

What I found in the hera_sim wrapper was that for certain input orderings, the output was not correct. Internally, the simulation might be correct, but when you do eg. uvdata.get_data((i,j,'xx')) it doesn't match what you know to be true for that baseline (unless the input was ordered by time).

mkolopanis commented 3 years ago

okay so maybe I know what was happening. it was ~partially~ probably assigning data to the wrong baseline before. Inside the task creation this happens. where tasks_shape = (Nfreqs, Ntimes, Nbls) this is where it assumed (time, baseline) ordering. https://github.com/RadioAstronomySoftwareGroup/pyuvsim/blob/1ef8cdbfd98fc6f49b598f75d759852407601ebf/pyuvsim/uvsim.py#L318-L338

then as it iterates over the blti index it is assuming that bl_i associated with the baseline is moving fastest to the next baseline, when in reality it might not be. So it would get to a different bl_i and look up the ant_1 and ant_2 for the blti = bl_i * time_i * Ntimes which actually might not be associated with with bl_i because we're assuming this index moves the fastest when really it might not be.

but as I write this I'm not sure it is making sense.

bhazelton commented 3 years ago

I think that makes sense. I guess the question is: do we want to just force that ordering or do we want to fix the logic to be more sensible. I guess I'm worried that if a uvdata object doesn't have all the times for all the baselines this might break as well, even if we force the ordering. Maybe we could use some of the newer uvdata functionality to make this smarter? Minimally we should force the ordering and error if this assumption breaks (i.e. if Nblts != Nbls * Ntimes).

mkolopanis commented 3 years ago

I feel like I'd prefer if we just fix up the logic in the baseline time creation to be order agnostic but still would prefer to have the input_uv objects be metadata only (mostly because it makes me feel better inside).

bhazelton commented 3 years ago

I agree with both of those!

mkolopanis commented 3 years ago

@steven-murray any thoughts?

steven-murray commented 3 years ago

I'm not sure I'm qualified to have thoughts here, but I do have a couple.

  1. We just found with VisCPU that ensuring ordering is correct after the fact can be relatively time consuming (for a large number of times/baselines). We may not have it written the optimal way, but this should be kept in mind. Probably, this means that we should force the ordering to be something to begin with, and assume that order within the rest of the functions, and only reorder at the end if the initial input order had to be changed to match these assumptions.
  2. This would all be much easier if the uvdata object had an attribute ordering or something which just specified which order the object was in to begin with. Then a uvdata.reorder_blts could be short-circuited if it already had the right order, saving quite a bit of time.

Not sure if they are useful opinions, but there you go.

bhazelton commented 3 years ago

So it turns out there is an attribute, blt_order that records this, but it's optional and only set if the pyuvdata method is called. So you can't count on it being there unless you call the method yourself.

steven-murray commented 3 years ago

@bhazelton that makes sense, because someone could just have data in a file in any old order and read it in. But, we could ensure that when a file is created with the pyuvsim simsetup, it sets that attribute. At least then users have a way of doing a faster simulation if they follow the guidelines.

mkolopanis commented 3 years ago

right as it stands I don't even think pyuvsim is generating uvdata objects in the "optimal" configuration. I think it is doing (time, ant1) not (time, baseline) but we should be able to verify/change that. And like bryna and I would like make the simulation agnostic to the input blt ordering anyway

mkolopanis commented 3 years ago

we can just solve this by forcing blt_order to be required :tada: but those pyuvdata devs are so hard to get in contact with

steven-murray commented 3 years ago

I guess what I'm saying is that making the simulation itself agnostic to the ordering may actually incur non-negligible (though not alarming) slow downs. It's usually possible to be faster when you know the ordering of your data and can iterate through it in the optimal way, rather than having to determine which indices you need to place the simulated data in. Thus I would advocate for the simulator wrapper to force the ordering before going into the detailed simulator. Then that forced ordering could be skipped if it was already the right order.

mkolopanis commented 3 years ago

This is basically what I have implemented in the current scheme. :thinking: we could allow arbitrary ordering and warn that a slow down may occur if not (time, baseline). You know me, can never just agree with you.

mkolopanis commented 3 years ago

also i think these lines confirm we use (time, ant1) ordering. We make (i,j) pairs where j is faster moving, then tile Ntimes times, so time moves slower than ant1. https://github.com/RadioAstronomySoftwareGroup/pyuvsim/blob/1ef8cdbfd98fc6f49b598f75d759852407601ebf/pyuvsim/simsetup.py#L1385-L1393

steven-murray commented 3 years ago

that looks right, so just also need to set blt_ordering when you create the object as well, so you know it's that ordering later. Then pyuvsim should have something like

if not getattr(input_uv, 'blt_order', None) == (bl, time):
    original_order = get_order(input_uv)
    input_uv.reorder_blts('baseline', 'time')  # or however you do this

output_uv = simulate(input_uv)

if original_order != (bl, time):
    output_uv.reorder_blts(original_order)

I guess the thing is that we don't have a function like get_order(), right?

mkolopanis commented 3 years ago

Yeah your latest suggestion is how the code behaves right now.

steven-murray commented 3 years ago

Then I accept.

mkolopanis commented 3 years ago

okay, I'm digging into something else here now so let's not push this through yet.

mkolopanis commented 3 years ago

Okay I think it is ready for a look now. I do have a question now though. In this implementation, tasks are unraveled as (nblts, Nfreqs) so for each blt tasks will iterate over all the frequencies. Is this what we want or should it go (nfreqs, nblts) so for each frequency we iterate over blts?

mkolopanis commented 3 years ago

I suppose it would be worthwhile to have a test now where nblts != nbls * ntimes. nominally it should be able to handle it.

bhazelton commented 3 years ago

@aelanman do you have a sense for whether it matters which changes fastest, the frequency axis or the blt axis?

aelanman commented 3 years ago

@bhazelton Ideally, the time axis should be the slowest. Within the task loop, the local coherency, Jones matrices, and source positions are updated whenever the time changes. The local coherency and source positions are reused otherwise. If time becomes the fastest axis, this advantage is lost. (See the function UVEngine.set_task)

mkolopanis commented 3 years ago

Yeah that makes sense. So unraveling as (nblts, nfreqs) would still allow time to be the slowest for us since this implementation forces (time, baseline) ordering. It basically switches the frequency and baseline axis for an array where nblts = ntimes * nbls, but also allows for the non-square case.

mkolopanis commented 2 years ago

Maybe? You would get Nbtlf float64s for time and freq, and Nbltf int64 for baseline and order. So 256 x Nbltf bits of information. I think that is 1GiB of ram per 32768 tasks. Right would have to switch them in the sort. But the idea is the same.

On Tue, Nov 2, 2021, 09:08 Adam Lanman @.***> wrote:

@.**** commented on this pull request.

In pyuvsim/uvsim.py https://github.com/RadioAstronomySoftwareGroup/pyuvsim/pull/371#discussion_r741248509 :

  • tasks_shape = (Ntimes, Nfreqs, Nbls)
  • time_ax, freq_ax, bl_ax = range(3)
  • tasks_shape = (Nblts, Nfreqs)

There's potentially a lot of memory overhead in the first few lines too, right? Also, the baseline axis should be fastest, not freq.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/RadioAstronomySoftwareGroup/pyuvsim/pull/371#discussion_r741248509, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACB7BLOVWIFFEXXP2MIO2YLUKALHTANCNFSM5GVW6RWQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

aelanman commented 2 years ago

If it's happening in the uvdata_to_task_iter function, those arrays will be generated on each MPI process.

mkolopanis commented 2 years ago

Yeah I understand this.

On Tue, Nov 2, 2021, 10:50 Adam Lanman @.***> wrote:

If it's happening in the uvdata_to_task_iter function, those arrays will be generated on each MPI process.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/RadioAstronomySoftwareGroup/pyuvsim/pull/371#issuecomment-957987572, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACB7BLPA6N33DWXPV437OCTUKAXF7ANCNFSM5GVW6RWQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

aelanman commented 2 years ago

So wouldn't that be 1 GiB Npus_per_node of memory used on each node? Once it downselects, it'll be less, but up front it will make an array of the size Nblts Nfreqs on each process.

mkolopanis commented 2 years ago

Yes, in the current prototype implementation that will happen. There may be more sly ways to do it too. I was interested to see how the idea would be received but it seems like we've begun focusing on details instead. I'll post something else once I've found a well optimized solution.

On Tue, Nov 2, 2021, 11:45 Adam Lanman @.***> wrote:

So wouldn't that be 1 GiB Npus_per_node of memory used on each node? Once it downselects, it'll be less, but up front it will make an array of the size Nblts Nfreqs on each process.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/RadioAstronomySoftwareGroup/pyuvsim/pull/371#issuecomment-958040256, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACB7BLLY3SPGG2RSQQIKRLDUKA5WPANCNFSM5GVW6RWQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

mkolopanis commented 2 years ago

Looks like I was off by an order of magnitude. it is 1MiB per 30k tasks. had a factor of 10 problem.

aelanman commented 2 years ago

Your Ntasks is for 32 baselines?

mkolopanis commented 2 years ago

I'm not sure I understand the question. I had divided 2^20 by my estimation of the size per task instead of 2^30.

aelanman commented 2 years ago

I don't understand where "32768" tasks came from earlier. If that's Nblts * Nfreqs, then for 1024 freqs that's Nblts = 32.

mkolopanis commented 2 years ago

This is what I did. tasks to make 1 GiB of ram = 1 GiB / ( number of arrays * element size of array)

or what I plugged into a python terminal for 4 arrays with 64 bit elements.

tasks_for_1GIB = 2 ** 30 / ((4 * 64) / 8)

and the problem I had the first time around was i used 2 ** 20 in the numerator, so I reported the number of tasks to make 1 MiB ( the ~30k number)

mkolopanis commented 2 years ago

I've pushed up a new version with views, broadcasting, iterators, and slices where possible. I've empirically seen a 1e-5 to 3e-5 MiB increase ram usage per local_tasks per PU.

aelanman commented 2 years ago

Excellent! This looks great. Thanks for figuring that out!

bhazelton commented 2 years ago

@steven-murray Do you understand the herasim test failures on this PR? Also, I think it's getting close to completion if you want to have another look.

steven-murray commented 2 years ago

@bhazelton I don't think the breaking tests have anything to do with pyuvsim per se, and they will likely be fixed soon with a major PR in hera_sim. I think you can merge without those tests passing.

mkolopanis commented 2 years ago
Here's an update on the benchmark from main and from the current version of this branch. Should I just add it to the base file even though it is untracked? Date/Time uvsim_version HostName SettingsFile cpus-per-task Ntasks Nnodes MemLimit Ntimes Nbls Nfreqs Nsrcs Nsrcs_part MaxRSS [GiB] Runtime
2020-03-06-21:40:47.584612 1.1.2. node474.oscar.ccv.brown.edu settings.yaml 2 10 1 50G 21 500 256 3072 1 13.071 4:32:16.373965
2020-03-26-17:54:43.959585 1.1.2. node474.oscar.ccv.brown.edu settings.yaml 2 10 1 50G 21 500 256 3072 1 12.187 3:24:00.635442
2020-04-02-15:50:16.314578 1.1.2. node473.oscar.ccv.brown.edu settings.yaml 2 10 1 50G 21 500 256 3072 1 11.475 0:10:30.647003
2021-11-08-13:29:18.779374 1.2.2.dev1+g23f68a4.main herapost-master.aoc.nrao.edu settings.yaml 2 10 1 50G 21 500 256 3072 1 2.053 0:11:56.878661
2021-11-08-11:38:08.304418 1.2.2.dev25+gef5c800.fix_uvdata_ordering herapost-master.aoc.nrao.edu settings.yaml 2 10 1 50G 21 500 256 3072 1 2.061 0:11:46.983807
mkolopanis commented 2 years ago

I'm also running reference sim 2.1 on the NRAO cluster to check timings. I'll post an update when I have one.

mkolopanis commented 2 years ago

with an N of 1, reference sim 2.1 execution time on the NRAO cluster differed by 6 minutes (longer on this branch) out of 516 minutes (8.6 hours). I don't think that's really an appreciable difference.

steven-murray commented 2 years ago

Really, that's better. Gives you time to make coffee.