sdannels / synthetic_time_series_forecasting

Generating synthetic time-series from DoppelGANger model and testing forecasting performance

Need to fix following error: multiprocessing_context = 'spawn' #1

Open noblenav68 opened 1 year ago

noblenav68 commented 1 year ago

Hello, I am trying to do an initial run of the code and am getting the following error at the bottom of the traceback below:

ValueError: multiprocessing_context option should specify a valid start method in ['spawn'], but got multiprocessing_context='fork'

Can you please advise how and where I should set multiprocessing_context='spawn' so this will run? Thanks! Jon


ValueError                                Traceback (most recent call last)
Cell In[9], line 19
      2 model = DGAN(DGANConfig(
      3     # length of training examples and of generated synthetic data
      4     max_sequence_len=features.shape[1],
   (...)
     15     epochs=2000,
     16 ))
     18 # train model
---> 19 model.train_numpy(
     20     attributes = attributes,
     21     features = features,
     22     feature_types = [OutputType.CONTINUOUS] * features.shape[2],
     23     attribute_types = [OutputType.DISCRETE] * attributes.shape[1],
     24 )

File c:\Users\noble\anaconda3\envs\py3_9\lib\site-packages\gretel_synthetics\timeseries_dgan\dgan.py:343, in DGAN.train_numpy(self, features, feature_types, attributes, attribute_types, progress_callback)
    335     raise ValueError(f"NaN found in internal attributes. {NAN_ERROR_MESSAGE}")
    337 dataset = TensorDataset(
    338     torch.Tensor(internal_attributes),
    339     torch.Tensor(internal_additional_attributes),
    340     torch.Tensor(internal_features),
    341 )
--> 343 self._train(dataset, progress_callback=progress_callback)
...
    404     'multiprocessing_context={!r}').format(valid_start_methods, multiprocessing_context))
    405 multiprocessing_context = multiprocessing.get_context(multiprocessing_context)
    407 if not isinstance(multiprocessing_context, python_multiprocessing.context.BaseContext):

ValueError: multiprocessing_context option should specify a valid start method in ['spawn'], but got multiprocessing_context='fork'

sdannels commented 1 year ago

Hi Jon,

I ran this in Google Colab notebooks, which run on Ubuntu. The multiprocessing start method "fork" is supported on Linux but not on Windows, according to this article, which gives a lot more detail: https://superfastpython.com/multiprocessing-pool-context/.

That article also describes ways to override the default start method. You could try using one of them to switch the start method to "spawn", but I don't fully know what effect that would have on the rest of the code in terms of replication.
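As a rough sketch of that idea (not taken from this repository or the linked article), the standard-library calls below list the start methods the current platform supports and switch the default to "spawn" when "fork" is missing. The helper name `pick_start_method` is hypothetical. Note that the error message shows the string 'fork' being passed explicitly as `multiprocessing_context`, so changing the global default may not be enough on its own; the place where that value is set may also need to change.

```python
import multiprocessing as mp


def pick_start_method(preferred="fork"):
    """Return `preferred` if this platform supports it, else fall back to 'spawn'.

    Hypothetical helper for illustration; 'spawn' is available on
    Windows, macOS, and Linux.
    """
    available = mp.get_all_start_methods()
    return preferred if preferred in available else "spawn"


if __name__ == "__main__":
    # On Windows this typically prints ['spawn']; on Linux 'fork' is listed too.
    print("Available start methods:", mp.get_all_start_methods())

    method = pick_start_method("fork")
    # force=True allows overriding a start method that was already set,
    # which can happen in a notebook session.
    mp.set_start_method(method, force=True)
    print("Default start method is now:", mp.get_start_method())
```

Run this at the very start of the script or notebook, before any worker processes are created. With "spawn", child processes re-import the main module, so training code in a plain script should sit under an `if __name__ == "__main__":` guard.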

An easier solution may just be to run the code in a Colab notebook. They are free with a Google account, and they offer free GPU run times, which were necessary for some of the training tasks.