usc-isi-i2 / dsbox-ta2

The DSBox TA2 component
MIT License

Running DefaultRegressionTemplate gives float division by 0 error #78

Closed: serbanstan closed this issue 6 years ago

serbanstan commented 6 years ago

Running python ta2-search /nas/home/stan/dsbox/runs2/config-seed/26_radon_seed_config.json gives the following error:

(dsbox-devel-710) [stan@dsbox01 python]$ python ta2-search /nas/home/stan/dsbox/runs2/config-seed/26_radon_seed_config.json 
Namespace(configuration_file='/nas/home/stan/dsbox/runs2/config-seed/26_radon_seed_config.json', cpus=-1, debug=False, output_prefix=None, timeout=-1)
Using configuation:
{'cpus': '10',
 'dataset_schema': '/nfs1/dsbox-repo/data/datasets/seed_datasets_current/26_radon_seed/26_radon_seed_dataset/datasetDoc.json',
 'executables_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/26_radon_seed/executables',
 'pipeline_logs_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/26_radon_seed/logs',
 'problem_root': '/nfs1/dsbox-repo/data/datasets/seed_datasets_current/26_radon_seed/26_radon_seed_problem',
 'problem_schema': '/nfs1/dsbox-repo/data/datasets/seed_datasets_current/26_radon_seed/26_radon_seed_problem/problemDoc.json',
 'ram': '10Gi',
 'saved_pipeline_ID': '',
 'saving_folder_loc': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/26_radon_seed',
 'temp_storage_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/26_radon_seed/temp',
 'timeout': 9,
 'training_data_root': '/nfs1/dsbox-repo/data/datasets/seed_datasets_current/26_radon_seed/26_radon_seed_dataset'}
[INFO] No test data config found! Will split the data.
[INFO] - dsbox.controller.controller - Top level output directory: /nfs1/dsbox-repo/stan/dsbox-ta2/python/output/26_radon_seed
[INFO] Succesfully parsed test data
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 736}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 736)])>,
 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table',
                    'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'),
 'structural_type': <class 'd3m.container.pandas.DataFrame'>}
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 183}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 183)])>,
 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table',
                    'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'),
 'structural_type': <class 'd3m.container.pandas.DataFrame'>}
[INFO] Template choices:
Template ' Default_regression_template ' has been added to template base.
[INFO] Template 0:Default_regression_template Selected. UCT:[100.0]
[INFO] Worker started, id: <_MainProcess(MainProcess, started)>
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.dsbox.Denormalize', -4257385545094315085)
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.datasets.DatasetToDataFrame', -4257385545094315085)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 6159901662849939285)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 602794132528371102)
[INFO] Push@cache: ('d3m.primitives.dsbox.Profiler', -5041500832329100334)
/nfs1/dsbox-repo/stan/dsbox-profiling/dsbox/datapreprocessing/profiler/dependencies/date_extractor.py:408: UserWarning: DateExtractor: Failed to set timezone as America/Los_Angeles. Catch offset must be a timedelta representing a whole number of minutes, not datetime.timedelta(-1, 58022).
  warn('DateExtractor: Failed to set timezone as ' + str(self.default_tz) + '. Catch ' + str(e))
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/re.py:212: FutureWarning: split() requires a non-empty pattern match.
  return _compile(pattern, flags).split(string, maxsplit)
[INFO] Push@cache: ('d3m.primitives.dsbox.CleaningFeaturizer', -5041500832329100334)
[INFO] Push@cache: ('d3m.primitives.dsbox.CorexText', -3470769510771990258)
[INFO] Push@cache: ('d3m.primitives.dsbox.Encoder', -3470769510771990258)
[INFO] Push@cache: ('d3m.primitives.sklearn_wrap.SKImputer', 4377620918669683860)
[INFO] Push@cache: ('d3m.primitives.sklearn_wrap.SKARDRegression', 3448890783079117633)
******************
[INFO] Writing results
{'fitted_pipeline': <dsbox.pipeline.fitted_pipeline.FittedPipeline object at 0x7f32121b89e8>, 'training_metrics': [{'metric': 'rootMeanSquaredError', 'value': 0.3696254631889981}], 'cross_validation_metrics': [{'metric': 'rootMeanSquaredError', 'value': 0.37622140468531423, 'values': [0.4519165763255966, 0.6912092281686664, 0.5286533519727559, 0.2943245210812523, 0.3448544903188785, 0.27129858126291595, 0.318067342597573, 0.2927429117639273, 0.31950595695903383, 0.24964108640254248], 'targets': []}], 'test_metrics': [{'metric': 'rootMeanSquaredError', 'value': 0.35634406481868686}]}
{'denormalize_step': {'primitive': 'd3m.primitives.dsbox.Denormalize', 'hyperparameters': {}}, 'to_dataframe_step': {'primitive': 'd3m.primitives.datasets.DatasetToDataFrame', 'hyperparameters': {}}, 'extract_attribute_step': {'primitive': 'd3m.primitives.data.ExtractColumnsBySemanticTypes', 'hyperparameters': {'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Attribute',)}}, 'profiler_step': {'primitive': 'd3m.primitives.dsbox.Profiler', 'hyperparameters': {}}, 'clean_step': {'primitive': 'd3m.primitives.dsbox.CleaningFeaturizer', 'hyperparameters': {}}, 'corex_step': {'primitive': 'd3m.primitives.dsbox.CorexText', 'hyperparameters': {}}, 'encoder_step': {'primitive': 'd3m.primitives.dsbox.Encoder', 'hyperparameters': {}}, 'impute_step': {'primitive': 'd3m.primitives.sklearn_wrap.SKImputer', 'hyperparameters': {}}, 'extract_target_step': {'primitive': 'd3m.primitives.data.ExtractColumnsBySemanticTypes', 'hyperparameters': {'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Target', 'https://metadata.datadrivendiscovery.org/types/SuggestedTarget')}}, 'model_step': {'primitive': 'd3m.primitives.sklearn_wrap.SKARDRegression', 'hyperparameters': {}}} 0.35634406481868686
Training rootMeanSquaredError = 0.3696254631889981
CV rootMeanSquaredError = 0.37622140468531423
Test rootMeanSquaredError = 0.35634406481868686
******************
[INFO] Saving training results in /nfs1/dsbox-repo/stan/dsbox-ta2/python/output/26_radon_seed.txt
[INFO] report: 0.35634406481868686
Error in sys.excepthook:
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/theano/gof/link.py", line 79, in thunk_hook
    __excepthook(type, value, trace)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pytypes/util.py", line 821, in _pytypes_excepthook
    traceback.print_exception(exctype, value, tb, _calc_traceback_limit(tb))
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pytypes/util.py", line 765, in _calc_traceback_limit
    if tb2.tb_next.tb_frame.f_code.co_filename.split(os.sep)[-2] == 'pytypes' and not \
IndexError: list index out of range

Original exception was:
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/internals.py", line 1377, in eval
    result = get_result(other)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/internals.py", line 1346, in get_result
    result = func(values, other)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/ops.py", line 1202, in na_op
    result = expressions.evaluate(op, str_rep, x, y, **eval_kwargs)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 204, in evaluate
    return _evaluate(op, op_str, a, b, **eval_kwargs)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 64, in _evaluate_standard
    return op(a, b)
ZeroDivisionError: float division by zero

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ta2-search", line 141, in <module>
    result = main(args)
  File "ta2-search", line 110, in main
    status = controller.train()
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/controller/controller.py", line 532, in train
    self.update_UCT_score(index=idx, report=report)
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/controller/controller.py", line 450, in update_UCT_score
    (self.normalize.max() - self.normalize.min())
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/ops.py", line 1262, in f
    return self._combine_series(other, na_op, fill_value, axis, level)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/frame.py", line 3944, in _combine_series
    try_cast=try_cast)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/frame.py", line 3958, in _combine_series_infer
    try_cast=try_cast)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/frame.py", line 3981, in _combine_match_columns
    try_cast=try_cast)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/internals.py", line 3435, in eval
    return self.apply('eval', **kwargs)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/internals.py", line 3329, in apply
    applied = getattr(b, f)(**kwargs)
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/internals.py", line 1384, in eval
    result = handle_error()
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/pandas/core/internals.py", line 1367, in handle_error
    (repr(other), str(detail)))  # noqa
TypeError: Could not operate array([0., 0., 0.]) with block values float division by zero

Log files can be found in /nas/home/stan/nfs1-stan/dsbox-ta2/python/output/26_radon_seed
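
The final traceback ends in update_UCT_score, on a division by (self.normalize.max() - self.normalize.min()), and the divisor is reported as array([0., 0., 0.]). That matches a min-max normalization where every column has max equal to min, which would happen if only one row of results has been recorded so far (an assumption, since the log does not show the contents of self.normalize). A minimal sketch of a guarded normalization follows. The function name min_max_normalize and the column names are hypothetical, and this is not necessarily what the project does.

```python
import pandas as pd


def min_max_normalize(frame: pd.DataFrame) -> pd.DataFrame:
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    col_min = frame.min()
    col_range = frame.max() - col_min
    # Swap zero ranges for 1.0 so the division is always defined.
    # (x - min) is already 0 in those columns, so the result is 0.0.
    safe_range = col_range.where(col_range != 0, 1.0)
    return (frame - col_min) / safe_range


# Hypothetical score table holding a single row, so max == min in every column.
scores = pd.DataFrame({"reward": [0.356], "exec_time": [12.0], "trials": [1.0]})
normalized = min_max_normalize(scores)
print(normalized)  # every entry is 0.0
```

Mapping constant columns to 0.0 is one choice among several. Skipping normalization until at least two distinct values exist would also avoid the division.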

proska commented 6 years ago

@serbanstan Please let me know if you still have this bug after commit 7d6c284002fa09b706198d0d348d7b730094cc1a

serbanstan commented 6 years ago

@proska The initial error is fixed, but the run now fails with d3m.exceptions.NotSupportedError: [ERROR] Save training results Failed! (sys not breaking). I am closing this issue and opening a new one for the new error.