georgian-io-archive / foreshadow

An automatic machine learning system
https://foreshadow.readthedocs.io
Apache License 2.0

Make ColumnSharer and Config Process Safe (low priority) #130

Closed by cchoquette 4 years ago

cchoquette commented 5 years ago

Description

As we all know and love, threads have shared memory. Unfortunately, processes don't, and this pipeline leverages joblib to run separate columns in separate processes (see parallelprocessor). Consequently, the shared dicts behind config and columnsharer are copied into each process's memory space and are not kept in sync: an update made in one worker is not visible to the other workers or to the parent. For the full integration, this needs to be fixed by replacing the builtin dicts with process-safe versions.
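To illustrate the problem, here is a minimal sketch, not the foreshadow code itself. It assumes the standard library's multiprocessing in place of joblib so that it stays self-contained, and the names record_column, run_workers and demo are hypothetical. A plain dict handed to worker processes comes back unchanged in the parent, while a multiprocessing.Manager dict proxy reflects every worker's write. A Manager dict is only one possible process-safe replacement.

```python
import multiprocessing as mp

# Assumption: prefer the POSIX "fork" start method when it exists so the sketch
# also runs from an interactive session; otherwise use the platform default.
if "fork" in mp.get_all_start_methods():
    ctx = mp.get_context("fork")
else:
    ctx = mp.get_context()


def record_column(shared, column):
    # Hypothetical worker: store a per-column result in the mapping it was given.
    shared[column] = f"processed {column}"


def run_workers(shared, columns):
    # One process per column, mirroring the per-column parallelism described above.
    procs = [ctx.Process(target=record_column, args=(shared, c)) for c in columns]
    for p in procs:
        p.start()
    for p in procs:
        p.join()


def demo():
    columns = ["age", "income"]

    # Plain dict: each child process mutates its own copy, so the parent sees nothing.
    plain = {}
    run_workers(plain, columns)

    # Manager dict: every write goes through a proxy to one server process.
    with ctx.Manager() as manager:
        proxy = manager.dict()
        run_workers(proxy, columns)
        managed = dict(proxy)  # take a snapshot before the manager shuts down

    return plain, managed


if __name__ == "__main__":
    plain, managed = demo()
    print("plain dict after workers:  ", plain)
    print("manager dict after workers:", managed)
```

The trade-off is that every read and write on the proxy is a round trip to the manager process, so it is slower than a local dict.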

Estimate: 2 days

cchoquette commented 5 years ago

Related conversation: #111 (comment)
https://github.com/_render_node/MDE3OlB1bGxSZXF1ZXN0UmV2aWV3MjcwMjc5NTg0/pull_request_reviews/more_threads

cchoquette commented 5 years ago

See branch: https://github.com/georgianpartners/foreshadow/tree/manager

You can test by using: python -m foreshadow/config.py

You can debug in PyCharm using the test: foreshadow/tests/test_config.test_get_config_only_sys

jzhang-gp commented 4 years ago

ColumnSharer is process-safe now.