Closed ariymarkowitz closed 10 months ago
Nothing should be commented out; the code in the repo accidentally had some lines commented out. I just pushed a fix.
The flow is the following
Then,
So in theory you can run the first bit on a mainframe or whatever that doesn't support the rendering, and then do the (comparatively easy) rendering on a local PC. It uses datashader now, so there is no massive memory usage, and using dask instead of pandas means it loads the CSV files lazily, one at a time, instead of all at once.
I commented the first step out since I already have 2000 CSV files and just wanted to render; I didn't mean to commit in that state.
In general all of this work is not necessary, but since we have 10000 generators and not 2 or 3, we need to compute a lot more limit points (several orders of magnitude more), so this example is massively more hungry than anything else.
I'm still getting the following error:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/Users/amar630/anaconda3/envs/py3-11/lib/python3.11/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "/Users/amar630/anaconda3/envs/py3-11/lib/python3.11/multiprocessing/pool.py", line 51, in starmapstar
return list(itertools.starmap(args[0], args[1]))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/amar630/Downloads/bella-main/examples/atom.py", line 50, in one_limit_set
df = G.coloured_limit_set_fast(points_per_walk, seed=seed)
^
NameError: name 'G' is not defined
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/amar630/Downloads/bella-main/examples/atom.py", line 69, in <module>
_ = pool.starmap(one_limit_set, [[n+1, number_of_walks, points_per_walk, seed] for n in range(number_of_walks)], chunksize=1 )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/amar630/anaconda3/envs/py3-11/lib/python3.11/multiprocessing/pool.py", line 375, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/amar630/anaconda3/envs/py3-11/lib/python3.11/multiprocessing/pool.py", line 774, in get
raise self._value
NameError: name 'G' is not defined
G should be defined on line 65, something like
G = AtomGroup(generators, 1.1)
Oh I see, G is not being passed into one_limit_set
... it works on my machine no idea why, let me think for a minute
What is your OS? Is it Linux or something else?
On Linux, multiprocessing uses fork by default, so the child process gets every variable of the parent process (it's an exact copy). On Windows and macOS (since Python 3.8), it uses spawn by default, so the child is a fresh interpreter that re-imports the module. I think this is why my subprocess sees G and yours does not.
(Usually I set it to spawn explicitly, but in this particular case I did not. Soon, I think in Python 3.14, fork will stop being the Linux default anyway.)
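To check which case applies on a given machine, something like the following minimal sketch might help. It only uses the standard library; the helper name start_method_report is made up for illustration and is not part of the repo.

```python
import multiprocessing as mp


def start_method_report():
    """Return the default start method and the methods this platform supports.

    Hypothetical helper for illustration only. Note that calling
    get_start_method() without allow_none=True fixes the default for the
    rest of the process.
    """
    return {
        "default": mp.get_start_method(),
        "available": mp.get_all_start_methods(),
    }


if __name__ == "__main__":
    # Typically reports 'fork' or 'forkserver' on Linux, 'spawn' on Windows and macOS.
    print(start_method_report())
```

If the default turns out to be spawn, any object created only in the parent's main block will not exist in the workers.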
Even after adding multiprocessing.set_start_method('spawn') I can't repro, but the commit that will show up in 2 minutes should fix the problem anyway by passing G explicitly. No idea why it works on my machine, since the spawn method doesn't seem to be the problem.
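For reference, passing the object explicitly could look roughly like the sketch below. This is not the actual code from the commit: ToyGroup is a made-up stand-in for the AtomGroup object G, its limit_points method is a placeholder computation, and the signature of one_limit_set is simplified.

```python
import multiprocessing as mp


class ToyGroup:
    """Hypothetical stand-in for the object called G in this thread."""

    def __init__(self, scale):
        self.scale = scale

    def limit_points(self, n_points, seed):
        # Placeholder computation, not the real limit set algorithm.
        return [self.scale * (seed + k) for k in range(n_points)]


def one_limit_set(group, n_points, seed):
    # The group arrives as an argument, so the worker does not rely on a
    # module-level global that only exists in the parent process.
    return group.limit_points(n_points, seed)


def run(number_of_walks=3, points_per_walk=2):
    group = ToyGroup(2)
    # Each job carries its own (pickled) copy of the group.
    jobs = [(group, points_per_walk, seed) for seed in range(number_of_walks)]
    # Request spawn explicitly so the behaviour matches on every OS.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.starmap(one_limit_set, jobs, chunksize=1)


if __name__ == "__main__":
    print(run())
```

The object has to be picklable for this to work, and it gets sent once per job, so for a large group a Pool initializer that builds it once per worker may be cheaper.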
@ariymarkowitz I was missing a "global" directive. I am not sure whether forking is available on Mac OS X; it is UNIX in theory, so it might be, and if it is then this will fix an error there. There is also a more sophisticated algorithm for producing the generators, which now give reflections in tangent circles, not just circles that are almost tangent. Anyway, I just pushed these changes.
Looks like it's working now!
I tried uncommenting the commented code, but then I get an error that G is not defined.
I am running Python 3.11.