isilber closed this issue 3 years ago
What you have to do is place the code running EMC2 into an if/main statement like this:
if __name__ == '__main__':
    ds = emc2.simulator (bla bla)
Otherwise, if you try to run in parallel, it will give you the error above.
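For illustration, here is a minimal sketch of the same guard around parallel work. It uses the standard library's process pool as a stand-in for EMC2's dask-based parallel mode, which is an assumption made only to keep the example self-contained; process_chunk and run_parallel are hypothetical names, not EMC2 functions.

```python
from concurrent.futures import ProcessPoolExecutor


def process_chunk(chunk):
    """Hypothetical stand-in for the per-chunk work (not an EMC2 function)."""
    return sum(x * x for x in chunk)


def run_parallel(chunks, workers=2):
    """Map process_chunk over the chunks using separate worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    # Only the script that Python was started with reaches this block.
    # A worker process that re-imports this file skips it, so it does not
    # try to start a pool of its own.
    chunks = [[1, 2, 3], [4, 5], [6]]
    print(run_parallel(chunks))
```

Without the guard, a worker that re-imports the script would hit the pool creation again at import time, which is the kind of failure the guard is meant to prevent.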
Thanks! __name__ and '__main__' being? Shouldn't we implement it as the default when calling dask?
__name__ is the name of the module that Python is currently executing. It is set to '__main__' only in the script that Python was started with; in any imported module it is that module's own name. So the code under the if statement does not execute unless the file is being run as the main script. Sadly, because EMC2 is always imported as a module, its own __name__ is never '__main__', so there is no way to implement this line by default inside the library. The only option is to warn users that, when starting a parallel task, this if statement must surround the top-level code of their script. A lot of the time, a good way to design a script with this in mind is:
import numpy as np
def my_program():
    pass  # stuff
if __name__ == "__main__":
    my_program()
If we put all of our code in my_program(), this will ensure that this error never pops up.
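To see the behaviour of __name__ concretely, the sketch below writes a tiny script to a temporary file and loads it twice: once as the top-level script and once as an imported module. The file name my_script and the helper load_both_ways are hypothetical and exist only for this demonstration.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# A tiny script that records its own __name__ and whether the guard fired.
SCRIPT = '''
seen_name = __name__
ran_guarded_code = False
if __name__ == "__main__":
    ran_guarded_code = True
'''


def load_both_ways():
    """Return (name, guard_fired) for a script run and for a module import."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "my_script.py"
        path.write_text(SCRIPT)

        # 1) Execute the file the way "python my_script.py" would.
        as_script = runpy.run_path(str(path), run_name="__main__")

        # 2) Import the same file as a regular module named "my_script".
        spec = importlib.util.spec_from_file_location("my_script", str(path))
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        return (
            (as_script["seen_name"], as_script["ran_guarded_code"]),
            (module.seen_name, module.ran_guarded_code),
        )


if __name__ == "__main__":
    script_result, module_result = load_both_ways()
    print("run as script:", script_result)
    print("imported:     ", module_result)
```

The guarded block fires only in the first case, which is why code placed under the guard is not repeated when the file is imported rather than run directly.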
Bobby
Got it!
When I try to process a large dataset (~1500 time steps; 10 subcolumns) in parallel, I receive the following error, which causes the simulator to crash (it enters what seems to be an endless loop), even if the 'chunks' option is used with rather small chunks. There are no issues if parallel is False.