McMasterAI / CoviDash

Create multiprocessing for training models #43

Closed cczarnuch closed 4 years ago

cczarnuch commented 4 years ago

Add multiprocessing support to train models for multiple locations at a time.

Ideally, calculate the number of processes as multiprocessing.cpu_count() // 2 (integer division, since the process count must be a whole number). Lines 178 to 210 can be extracted into a function, and that function can then be run in parallel across locations.

Several dictionaries store information that is needed for training:

  1. inout_locations - change line 182 to use the dictionary entry for inout_seq instead of the variable, for readability.
  2. normalized_data - normalized case data per date.
  3. scaler_dict - contains the scaler info.
  4. locations_dict - index 0 contains the unique dates and index 1 contains the counts of cases on those dates.

Each of these dicts uses the location name as the key for the data.

If you think there is a better way to organize the data, feel free to change it.
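The approach described above could be sketched roughly as follows. This is a minimal illustration, not the project's code: train_location is a hypothetical stand-in for whatever lines 178 to 210 actually do, the location names are made up, and the assumption is that inout_locations and scaler_dict are both keyed by location name as described.

```python
import multiprocessing


def default_process_count():
    """Half the available cores, but never fewer than one process."""
    return max(1, multiprocessing.cpu_count() // 2)


def train_location(job):
    """Hypothetical stand-in for the per-location training code.

    `job` is a (location, inout_seq, scaler) tuple so that a single
    picklable argument can be handed to Pool.map.
    """
    location, inout_seq, scaler = job
    # Placeholder for real training: just report how many sequences were seen.
    return location, len(inout_seq)


def train_all(inout_locations, scaler_dict, processes=None):
    """Run train_location for every location using a process pool."""
    if processes is None:
        processes = default_process_count()
    # Bundle each location's data up front so workers need no shared globals.
    jobs = [(loc, inout_locations[loc], scaler_dict[loc]) for loc in inout_locations]
    with multiprocessing.Pool(processes=processes) as pool:
        results = pool.map(train_location, jobs)
    return dict(results)


if __name__ == "__main__":
    # Made-up example data; the real dictionaries come from preprocessing.
    inout_locations = {
        "Hamilton": [([0.1, 0.2], [0.3])],
        "Toronto": [([0.2, 0.3], [0.4]), ([0.3, 0.4], [0.5])],
    }
    scaler_dict = {"Hamilton": None, "Toronto": None}
    print(train_all(inout_locations, scaler_dict))
```

Passing each location's data as a single tuple keeps the worker function free of module-level state, which matters on platforms where worker processes do not inherit the parent's memory.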

cczarnuch commented 4 years ago

@jaymody Updated the description of the story

jaymody commented 4 years ago

@cczarnuch The current version of the CR1-Predictions branch gives me this error when I try to run predictions.py (which was also flagged by pylint):

Traceback (most recent call last):
  File "predictions.py", line 226, in <module>
    main()
  File "predictions.py", line 125, in main
    predictions = pp.denormalize_data(normalized_preds)
TypeError: denormalize_data() missing 1 required positional argument: 'scaler'

Wondering if this is fixed in one of your other branches. If so, we should merge that before I continue, to avoid conflicts and duplicated work.
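For reference, the traceback says only that denormalize_data() requires a scaler argument that the call on line 125 does not pass. One possible shape of a fix, assuming predictions and scalers are both keyed by location name, is sketched below. Everything here is hypothetical: MinMaxScalerStub stands in for whatever fitted scaler scaler_dict really holds, the body of denormalize_data is a guess, and denormalize_all is an invented helper.

```python
class MinMaxScalerStub:
    """Tiny stand-in for a fitted scaler; only inverse_transform is modelled."""

    def __init__(self, data_min, data_max):
        self.data_min = data_min
        self.data_max = data_max

    def inverse_transform(self, values):
        # Map values from the [0, 1] range back to [data_min, data_max].
        span = self.data_max - self.data_min
        return [v * span + self.data_min for v in values]


def denormalize_data(normalized, scaler):
    """Assumed shape of the real function: it needs the scaler explicitly."""
    return scaler.inverse_transform(normalized)


def denormalize_all(normalized_preds, scaler_dict):
    """Hypothetical helper: look up each location's scaler and pass it along."""
    return {
        location: denormalize_data(preds, scaler_dict[location])
        for location, preds in normalized_preds.items()
    }


if __name__ == "__main__":
    scaler_dict = {"Hamilton": MinMaxScalerStub(0.0, 100.0)}
    normalized_preds = {"Hamilton": [0.0, 0.5, 1.0]}
    print(denormalize_all(normalized_preds, scaler_dict))
```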