radiocosmology / draco

A pipeline for the analysis and simulation of drift scan radio data
MIT License
9 stars 7 forks source link

dtype issue in util.random #288

Closed sjforeman closed 2 months ago

sjforeman commented 2 months ago

When I try to run GaussianNoiseDataset with a fresh virtual environment on cedar, I get a strange error (after fixing a different error with https://github.com/radiocosmology/draco/pull/287). Here are some relevant lines from the log (they might be out of order, since they were printed by each of the many threads there were running, but hopefully they convey the rough message):

    24.7s [MPI  0/64] - INFO     draco.core.io.LoadFilesFromParams: Starting next for task LoadFilesFromParams
    24.9s [MPI  0/64] - INFO     draco.core.io.LoadFilesFromParams: Loading file /project/rpp-chime/jrs65/test_processed/stacks/rev_07/all/sstack.h5 [ 1/1]
    24.9s [MPI  0/64] - DEBUG    draco.core.io.LoadFilesFromParams: Reading with selections: {'freq_sel': slice(280, 480, None)}
    49.4s [MPI  0/64] - INFO     draco.core.io.LoadFilesFromParams: Leaving next for task LoadFilesFromParams
    49.4s [MPI  0/64] - DEBUG    caput.pipeline: LoadFilesFromParams.next produced output data product with keys ('sstream',)
    49.4s [MPI  0/64] - DEBUG    caput.pipeline: GaussianNoiseDataset.next stowing data product with key sstream for `in`.
    49.4s [MPI  0/64] - DEBUG    caput.pipeline: Task GaussianNoiseDataset.next calling 'next()'.
    49.4s [MPI  0/64] - INFO     draco.synthesis.noise.GaussianNoiseDataset: Starting next for task GaussianNoiseDataset
    49.6s [MPI  0/64] - INFO     draco.synthesis.noise.GaussianNoiseDataset: Using random seed: 252079471955908021289907459966155907780
Traceback (most recent call last):

  File "/cvmfs/soft.computecanada.ca/easybuild/software/2023/x86-64-v3/Compiler/gcccore/python/3.11.5/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/sforeman/repos/code/draco/draco/util/random.py", line 422, in _fill
    method(gen, *args, **kwargs, out=local_array)

  File "_generator.pyx", line 1117, in numpy.random._generator.Generator.standard_normal

  File "_common.pyx", line 304, in numpy.random._common.double_fill

TypeError: Supplied output array has the wrong type. Expected float64, got float32

The above exception was the direct cause of the following exception:
    method(gen, *args, **kwargs, out=local_array)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/project/rpp-chime/chime/chime_env/modules/chime/python/2024.04/lib/python3.11/site-packages/click/core.py", line 1157, in __call__

    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "_common.pyx", line 280, in numpy.random._common.check_output

    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^

    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/project/rpp-chime/chime/chime_env/modules/chime/python/2024.04/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return ctx.invoke(self.callback, **ctx.params)
    rv = self.invoke(ctx)

  File "/home/sforeman/repos/code/draco/draco/synthesis/noise.py", line 108, in process
    random.complex_normal(

  File "/home/sforeman/repos/code/draco/draco/util/random.py", line 103, in complex_normal
    rng.standard_normal(rsize, dtype=rtype, out=out.view(rtype))

  File "/home/sforeman/repos/code/draco/draco/util/random.py", line 446, in _call
    raise RuntimeError(
RuntimeError: An exception occurred in thread 0 (and maybe others).

Here is what I think is going on: numpy.random.Generator.standard_normal (https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.standard_normal.html) is being called with an out argument, and the dtype of the out argument is np.float32, but the routine expects np.float64. However, in tracing through where this is called in draco, it seems like a dtype=np.float32 is also being supplied, so I'm not sure why the error is occurring.

Tracing through the code:

Some help in finishing this detective work would be much appreciated!

ketiltrout commented 2 months ago

dtype is popped from kwargs here: https://github.com/radiocosmology/draco/blob/65808dd4ad3f627c8c2e6127bce32b218cffe6ac/draco/util/random.py#L385

And then the kwargs, without dtype is passed to the numpy method here: https://github.com/radiocosmology/draco/blob/65808dd4ad3f627c8c2e6127bce32b218cffe6ac/draco/util/random.py#L422

ljgray commented 2 months ago

I've played around with it a bit. It appears that rng.standard_normal does not try to use the dtype of out, so if dtype isn't provided as an argument (which it isn't, as Don points out) it will default to np.float64 regardless of the dtype of out. To fix this, we just need to provide the dtype argument - I'll make a quick PR