CEA-COSMIC / pysap

Python Sparse data Analysis Package

[BUG] Parallel usage of Wavelets results in errors #101

Open chaithyagr opened 5 years ago

chaithyagr commented 5 years ago

System setup
OS: macOS v10.14.1
Python version: v3.6.7
Python environment (if any): conda v4.5.11

Describe the bug
While using pysap with joblib's Parallel to carry out wavelet transforms on several channels, we run into errors: the nb_band_per_scale attribute is not being populated.

To Reproduce

coeffs, coeffs_shape = \
                zip(*Parallel(n_jobs=self.n_cpu)
                    (delayed(self._op)
                    (data[i], self.transform[i])
                    for i in numpy.arange(self.num_channels)))

with the _op function defined as:

def _op(self, data, transform):
        if isinstance(data, numpy.ndarray):
            data = pysap.Image(data=data)
        transform.data = data
        transform.analysis()
        coeffs, coeffs_shape = flatten(transform.analysis_data)
        return coeffs, coeffs_shape

Expected behavior
We expect the analysis and adjoint operations to work. Instead we get sporadic errors, in particular that nb_band_per_scale is None.

Module and lines involved
When n_cpu=1 everything works smoothly; the issue appears only when we use more cores.

Are you planning to submit a Pull Request?

zaccharieramzi commented 5 years ago

Can you format the code in this issue to make it more readable?

chaithyagr commented 5 years ago

Updated the code.

zaccharieramzi commented 5 years ago

Cool, can you also provide a minimal failing example so that we can directly copy-paste and investigate easily (it will also potentially be the base for a future unit test)? Also don't forget to include the error traceback in the issue.

Finally, remember to format the code when it appears in text as well.

chaithyagr commented 5 years ago

Well, it is not completely minimal, but this is the smallest example I could get.

import pysap
from pysap.base.utils import flatten
from pysap.base.utils import unflatten
from joblib import Parallel, delayed
import numpy as np

num_channels = 32
n_cpu = 8
N = 64

def op(data, transform):
    if isinstance(data, np.ndarray):
        data = pysap.Image(data=data)
    transform.data = data
    transform.analysis()
    coeffs, coeffs_shape = flatten(transform.analysis_data)
    return coeffs, coeffs_shape

def adj_op(coeffs, coeffs_shape, transform):
    transform.analysis_data = unflatten(coeffs, coeffs_shape)
    image = transform.synthesis()
    return image.data

transform_klass = pysap.load_transform("db4")
transform = np.asarray([transform_klass(nb_scale=4) for i in np.arange(num_channels)])

data = (np.random.randn(num_channels, N, N) +
        1j * np.random.randn(num_channels, N, N))

coeffs, coeffs_shape =\
    zip(*Parallel(n_jobs=n_cpu)
    (delayed(op)
     (data[i], transform[i])
     for i in np.arange(num_channels)))
coeffs_shape = np.asarray(coeffs_shape)

image = Parallel(n_jobs=n_cpu)(
    delayed(adj_op)
    (coeffs[i], coeffs_shape[i], transform[i])
    for i in np.arange(num_channels))

Note that the test fails when n_cpu > 1 with the following traceback (I did not include it earlier because it does not make much sense to me):

    transform.analysis_data = unflatten(coeffs, coeffs_shape)
  File "/home/cg260486/cgr_venv/lib/python3.5/site-packages/python_pySAP-0.0.3-py3.5-linux-x86_64.egg/pysap/base/transform.py", line 264, in _set_analysis_data
    if len(analysis_data) != sum(self.nb_band_per_scale):
TypeError: 'NoneType' object is not iterable

For some reason, when we run in parallel, nb_band_per_scale is not initialized.

However, the above code works fine with n_cpu=1, in which case everything runs sequentially.

chaithyagr commented 5 years ago

It looks like this is related to pysap being imported multiple times, since the backend is loky. Moving the backend to threading solves this issue. I don't think there's much left to address here. Closing.
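Concretely, the workaround is a one-line change on the Parallel call: passing joblib's documented backend="threading" keyword keeps all workers in the calling process. A minimal sketch (square is just an illustrative placeholder for the per-channel wavelet call):

```python
from joblib import Parallel, delayed

def square(x):
    # Placeholder for the per-channel op(data[i], transform[i]) call.
    return x * x

# backend="threading" runs the workers as threads in the current
# process, so objects mutated by a worker (e.g. a transform whose
# analysis() fills nb_band_per_scale) stay visible to the caller.
results = Parallel(n_jobs=2, backend="threading")(
    delayed(square)(i) for i in range(4))
print(results)  # [0, 1, 4, 9]
```

The trade-off is that threads share the GIL, so this only helps when the underlying transform releases it (e.g. in compiled code).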

zaccharieramzi commented 5 years ago

I don't think we can close this. Indeed, if at some point we want to do multi-processing (and not simply multi-threading), we will potentially need to use other backends.

Can you explain what you mean by the multiple imports of pysap?

chaithyagr commented 5 years ago

Can you explain what you mean by the multiple imports of pysap?

For each process, a new pysap is loaded. First, this adds a lot of overhead. Beyond that, I think the initialization and communication across the worker processes is not happening correctly in the multi-process case. We may have to dig deeper; this could be an issue in joblib (most likely not), or here in pysap.
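A plausible mechanism, sketched below with only the standard library (FakeTransform and op are hypothetical stand-ins, not pysap API): with a process-based backend, joblib pickles the transform object into each worker, analysis() mutates that copy, and only the return value travels back, so the parent's transform never gets its nb_band_per_scale filled in.

```python
import pickle

class FakeTransform:
    """Hypothetical stand-in for a pysap transform object."""
    def __init__(self):
        self.nb_band_per_scale = None  # populated only by analysis()

    def analysis(self):
        self.nb_band_per_scale = [1, 3, 3, 3]

def op(transform):
    transform.analysis()  # fills nb_band_per_scale as a side effect
    return "coeffs"

# threading backend: the worker operates on the very same object,
# so the side effect survives in the caller.
t_threading = FakeTransform()
op(t_threading)
assert t_threading.nb_band_per_scale == [1, 3, 3, 3]

# loky/multiprocessing backend: the worker receives a pickled copy;
# mutating the copy does not touch the parent's object.
t_loky = FakeTransform()
worker_copy = pickle.loads(pickle.dumps(t_loky))
op(worker_copy)
assert t_loky.nb_band_per_scale is None  # later sum(None) -> TypeError
```

This would explain both observations: the sequential and threading cases share one object per channel, while the loky case leaves the parent's transforms un-analyzed before adj_op runs.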

I am fine with keeping it open; I just felt that, at this point, it would merely mean more debugging.

zaccharieramzi commented 5 years ago

Well, yes, there should only be an overhead, not the error you were mentioning. Let's keep it open for further investigation.