usc-isi-i2 / dsbox-ta2

The DSBox TA2 component
MIT License

CPUs tag in config doesn't seem to work #76

Closed serbanstan closed 6 years ago

serbanstan commented 6 years ago

Running on LL0_198_delta_elevators with cpus set to 1 in the config gives the following (incomplete) output:

(dsbox-devel-710) [stan@dsbox01 python]$ python ta2-search /nas/home/stan/dsbox/runs2/config-ll0/LL0_198_delta_elevators_config.json 
Namespace(configuration_file='/nas/home/stan/dsbox/runs2/config-ll0/LL0_198_delta_elevators_config.json', cpus=-1, debug=False, output_prefix=None, timeout=-1)
Using configuation:
{'cpus': 1,
 'dataset_schema': '/nfs1/dsbox-repo/data/datasets/training_datasets/LL0/LL0_198_delta_elevators/LL0_198_delta_elevators_dataset/datasetDoc.json',
 'executables_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL0_198_delta_elevators/executables',
 'pipeline_logs_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL0_198_delta_elevators/logs',
 'problem_root': '/nfs1/dsbox-repo/data/datasets/training_datasets/LL0/LL0_198_delta_elevators/LL0_198_delta_elevators_problem',
 'problem_schema': '/nfs1/dsbox-repo/data/datasets/training_datasets/LL0/LL0_198_delta_elevators/LL0_198_delta_elevators_problem/problemDoc.json',
 'ram': '10Gi',
 'saved_pipeline_ID': '',
 'saving_folder_loc': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL0_198_delta_elevators',
 'temp_storage_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL0_198_delta_elevators/temp',
 'timeout': 30,
 'training_data_root': '/nfs1/dsbox-repo/data/datasets/training_datasets/LL0/LL0_198_delta_elevators/LL0_198_delta_elevators_dataset'}
[INFO] No test data config found! Will split the data.
[INFO] - dsbox.controller.controller - Top level output directory: /nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL0_198_delta_elevators
[INFO] Succesfully parsed test data
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 7614}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 7614)])>,
 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table',
                    'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'),
 'structural_type': <class 'd3m.container.pandas.DataFrame'>}
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 1903}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 1903)])>,
 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table',
                    'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'),
 'structural_type': <class 'd3m.container.pandas.DataFrame'>}
[INFO] Template choices:
Template ' Default_regression_template ' has been added to template base.
[INFO] Worker started, id: <_MainProcess(MainProcess, started)>
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-07-17 16:28:20.655767: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[INFO] Push@cache: ('d3m.primitives.dsbox.Denormalize', -968740959767147327)
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.datasets.DatasetToDataFrame', -968740959767147327)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 4147298516430694076)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -6004597711758254217)
[INFO] Push@cache: ('d3m.primitives.data.ColumnParser', -935907445485532058)
[INFO] Push@cache: ('d3m.primitives.data.CastToType', 1344066866244125423)
[INFO] Push@cache: ('d3m.primitives.sklearn_wrap.SKImputer', -7400627586935899451)
[INFO] Push@cache: ('d3m.primitives.sklearn_wrap.SKARDRegression', 5617343449666356421)

But taking a look at the system load:

[Screenshot: "top" output, 2018-07-17 4:28:49 PM]

proska commented 6 years ago

We have already implemented every control available to us over the number of workers. The output of the "top" command that @serbanstan posted only shows the number of tasks that have been spawned, not the number of workers, so it does not show that we are using extra CPU resources. I am closing this issue for now, until we have more information on how sklearn primitives spawn tasks/processes.
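For anyone who wants to narrow this down, here is a minimal sketch of one way to cap CPU use from outside the primitives. It rests on an assumption that has not been confirmed in this thread: that the extra tasks come from the threaded numeric backends (OpenMP, OpenBLAS, MKL) underneath sklearn rather than from TA2 workers. The helper names `thread_limit_env` and `apply_cpu_limit` are hypothetical and are not part of dsbox-ta2. The environment variables would need to be set before numpy or sklearn is imported, and `os.sched_setaffinity` is only available on Linux.

```python
import os

# Environment variables read by common threaded numeric backends.
# Assumption: the sklearn primitives in use sit on top of one of these.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def thread_limit_env(cpus):
    """Return the environment settings that cap backend threads at `cpus`."""
    if cpus < 1:
        raise ValueError("cpus must be a positive integer")
    return {name: str(cpus) for name in THREAD_ENV_VARS}


def apply_cpu_limit(cpus):
    """Cap backend threads and, where supported, pin this process to `cpus` cores.

    Call this before importing numpy/sklearn so the backends see the limits.
    Returns the set of CPU ids the process may run on, or None when the
    platform does not expose CPU affinity.
    """
    os.environ.update(thread_limit_env(cpus))
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. macOS or Windows: only the thread caps apply
    allowed = sorted(os.sched_getaffinity(0))[:cpus]
    os.sched_setaffinity(0, allowed)  # child processes inherit this mask
    return set(allowed)
```

With an affinity mask in place, "top" may still list many tasks, but they would all be scheduled on the allowed cores, which makes it easier to tell task count apart from actual CPU use.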