mhostetter / galois

A performant NumPy extension for Galois fields and their applications
https://mhostetter.github.io/galois/
MIT License

Attribute Error when Using Multiprocessing #497

Open ZachPence opened 11 months ago

ZachPence commented 11 months ago

Hello, the following error is thrown when I use the multiprocessing module with a function that involves Galois fields:

Exception in thread Thread-8: <-- Author's Note: Number varies. This is just an example
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\threading.py", line 973, in _bootstrap_inner
    self.run()
  File "C:\ProgramData\Anaconda3\lib\threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 576, in _handle_results
    task = get()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\connection.py", line 256, in recv
    return _ForkingPickler.loads(buf.getbuffer())
AttributeError: Can't get attribute 'FieldArray_3_2' on <module 'galois._fields._factory' from '{directory to Python files}\\Python\\Python39\\site-packages\\galois\\_fields\\_factory.py'>

Below is an example demonstrating the point:

from galois import GF, Poly
from multiprocessing import Pool # Issue also arises with the "multiprocess" module

def f(inputValue, galoisField):
    # Do stuff with field
    return inputValue

def g(inputValue, fieldSize):
    GFp = GF(fieldSize)
    poly = Poly((inputValue,1), GFp)
    return poly

def main():
    p = 2 # Works just fine with binary
    # p = 3 # Switching to a non-binary field causes issues
    inputs = (1,2,3,4,5)

    # Passing in a galois field as an argument
    with Pool(2) as pool:
        results = [pool.apply_async(f, args=(i,GF(p))) for i in inputs]
        outputs = [r.get() for r in results]
    print(f"Output from f: {outputs}")

    # Constructing galois field inside the function
    with Pool(2) as pool:
        results = [pool.apply_async(g, args=(j,p)) for j in range(p)]
        outputs = [r.get() for r in results]
    print(f"Output from g: {outputs}")

if __name__ == "__main__":
    main()

Key Points:

Things I have tried:

Things I have not tried:

I do not know whether this is an issue with Galois, with multiprocessing, with how they interact, or an error on my end. Link to a similar issue: https://github.com/mhostetter/galois/issues/388
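The AttributeError in the traceback above is what pickle raises when it is handed a reference to a class by module and name and cannot find that name in the receiving process. Below is a minimal standard-library sketch of that mechanism. It does not use galois, and `demo_factory`, `make_class`, and `FieldArray_demo` are hypothetical names chosen for illustration. Whether galois creates its field classes in exactly this way is an assumption based on the error message.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a module that builds classes at runtime.
factory = types.ModuleType("demo_factory")
sys.modules["demo_factory"] = factory


def make_class(name):
    """Create a class dynamically and register it on the stand-in module."""
    cls = type(name, (), {})
    cls.__module__ = "demo_factory"
    setattr(factory, name, cls)
    return cls


Field = make_class("FieldArray_demo")

# The pickle stores only the module name and class name, not the class body.
payload = pickle.dumps(Field())

# Simulate a fresh process in which the class was never created.
delattr(factory, "FieldArray_demo")

try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

If that reading is right, the process that unpickles an object needs to have created the same class under the same name before the data arrives.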

raeudigerRaeffi commented 11 months ago

I have encountered the same problem. My current workaround is to construct the field again inside the called function by passing in the degree and characteristic. However, reconstructing a field takes around 0.2 to 0.7 seconds on average, which defeats the purpose of using multiprocessing.
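One way to soften the reconstruction cost described above might be to memoize the field per process, so each worker pays for construction once rather than once per task. The sketch below uses only the standard library. `build_field` is a hypothetical stand-in for the expensive call (in real code it would presumably be `galois.GF(characteristic ** degree)`), and `get_field` and `BUILD_CALLS` are likewise illustrative names. This thread does not establish whether galois already caches fields internally, so treat the benefit as an assumption to measure.

```python
from functools import lru_cache

BUILD_CALLS = []  # records each real construction, for illustration only


def build_field(order):
    """Hypothetical stand-in for an expensive constructor such as galois.GF(order)."""
    BUILD_CALLS.append(order)
    return ("field", order)


@lru_cache(maxsize=None)
def get_field(characteristic, degree):
    """Build the field once per process and reuse it on later calls."""
    return build_field(characteristic ** degree)


def g(input_value, characteristic, degree=1):
    """Task function: receives plain integers and looks the field up locally."""
    field = get_field(characteristic, degree)
    return input_value, field


first = g(1, 3)
second = g(2, 3)
print(len(BUILD_CALLS))
```

Because the cache lives in module state, every worker process in a pool would hold its own copy. Only integers cross the process boundary on the way in.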

mhostetter commented 8 months ago

@ZachPence sorry for the delayed reply.

Creating GFp = GF(p) once and then using GFp inside with Pool(2) as pool: seems to resolve the issue. Is that satisfactory to you?

from multiprocessing import Pool  # Issue also arises with the "multiprocess" module

from galois import GF, Poly

def f(inputValue, galoisField):
    # Do stuff with field
    return inputValue

def g(inputValue, fieldSize):
    GFp = GF(fieldSize)
    poly = Poly((inputValue, 1), GFp)
    return poly

def main():
    # p = 2  # Works just fine with binary
    p = 3  # Switching to a non-binary field causes issues
    inputs = (1, 2, 3, 4, 5)

    # Passing in a galois field as an argument
    GFp = GF(p)
    with Pool(2) as pool:
        results = [pool.apply_async(f, args=(i, GFp)) for i in inputs]
        outputs = [r.get() for r in results]
    print(f"Output from f: {outputs}")

    # Constructing galois field inside the function
    with Pool(2) as pool:
        results = [pool.apply_async(g, args=(j, p)) for j in range(p)]
        outputs = [r.get() for r in results]
    print(f"Output from g: {outputs}")

if __name__ == "__main__":
    main()

$ python3 issue_497.py
Output from f: [1, 2, 3, 4, 5]
Output from g: [Poly(1, GF(3)), Poly(x + 1, GF(3)), Poly(2x + 1, GF(3))]