Supporting files in /dsbox_efs/runs/seed-serban/32_wikiqa
Namespace(configuration_file='/dsbox_efs/config/seed-41/partition-14/32_wikiqa/search_config.json', cpus=10, debug=False, output_prefix=None, timeout=55)
Using configuration:
{'cpus': 10,
'dataset_schema': '/dsbox_efs/dataset/seed_datasets_current/32_wikiqa/32_wikiqa_dataset/datasetDoc.json',
'executables_root': '/dsbox_efs/runs/seed/32_wikiqa/executables',
'pipeline_logs_root': '/dsbox_efs/runs/seed/32_wikiqa/pipelines',
'problem_root': '/dsbox_efs/dataset/seed_datasets_current/32_wikiqa/32_wikiqa_problem',
'problem_schema': '/dsbox_efs/dataset/seed_datasets_current/32_wikiqa/32_wikiqa_problem/problemDoc.json',
'temp_storage_root': '/dsbox_efs/runs/seed/32_wikiqa/supporting_files',
'timeout': 55,
'training_data_root': '/dsbox_efs/dataset/seed_datasets_current/32_wikiqa/32_wikiqa_dataset',
'user_problems_root': '/dsbox_efs/runs/seed/32_wikiqa/user_problems'}
[INFO] No test data config found! Will split the data.
[INFO] - dsbox.controller.controller - Top level output directory: /dsbox_efs/runs/seed/32_wikiqa
[INFO] Template choices:
Template 'SRI_Mean_Baseline_Template' has been added to the template base.
Template 'random_forest_classification_template' has been added to the template base.
Template 'extra_trees_classification_template' has been added to the template base.
Template 'gradient_boosting_classification_template' has been added to the template base.
Template 'svc_classification_template' has been added to the template base.
[INFO] - dsbox.controller.controller - [INFO] Template 0: SRI_Mean_Baseline_Template selected. UCT: [None, None, None, None, None]
[INFO] Worker started, id: <_MainProcess(MainProcess, started)> , True
/usr/local/lib/python3.6/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Will use normal train-test mode (n = 1) to choose the best primitives.
[INFO] Push@cache: ('d3m.primitives.sri.baseline.MeanBaseline', 6822275182001374655)
[INFO] Testing finished.
[INFO] Now in normal mode; will run an extra training pass with train_dataset1.
[INFO] Hit@cache: ('d3m.primitives.sri.baseline.MeanBaseline', 6822275182001374655)
[INFO] Now training the pipeline on the full dataset and saving it.
[INFO] Hit@cache: ('d3m.primitives.sri.baseline.MeanBaseline', 6822275182001374655)
[INFO] push@Candidate: (-9010122329676530035,d7fc3287-2c40-414b-a789-cceb7a7ec85c)
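The Push@cache / Hit@cache lines above show the worker memoizing fitted primitives under a key built from the primitive's Python path plus a hash, so the repeated fits of d3m.primitives.sri.baseline.MeanBaseline are served from the cache rather than recomputed. A minimal sketch of that pattern, assuming a (primitive path, input hash) key; the class and method names are illustrative and not the dsbox-ta2 implementation:

    # Illustrative cache keyed on (primitive path, input hash); a sketch of the
    # Push@cache / Hit@cache behaviour, not the actual dsbox-ta2 cache.
    class PrimitiveCache:
        def __init__(self):
            self._store = {}

        def push(self, primitive_path, input_hash, fitted_primitive):
            print("[INFO] Push@cache:", (primitive_path, input_hash))
            self._store[(primitive_path, input_hash)] = fitted_primitive

        def lookup(self, primitive_path, input_hash):
            fitted = self._store.get((primitive_path, input_hash))
            if fitted is not None:
                print("[INFO] Hit@cache:", (primitive_path, input_hash))
            return fitted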
[INFO] - dsbox.controller.controller - ******************
[INFO] Writing results
{'cross_validation_metrics': [],
'fitted_pipeline': <dsbox.pipeline.fitted_pipeline.FittedPipeline object at 0x7f3cb38c2a58>,
'test_metrics': [{'metric': 'f1', 'value': 0.0}],
'total_runtime': 206.4727053642273,
'training_metrics': [{'metric': 'f1', 'value': 0.0}]}
[INFO] - dsbox.controller.controller - {'fitted_pipeline': <dsbox.pipeline.fitted_pipeline.FittedPipeline object at 0x7f3cb38c2a58>, 'training_metrics': [{'metric': 'f1', 'value': 0.0}], 'cross_validation_metrics': [], 'test_metrics': [{'metric': 'f1', 'value': 0.0}], 'total_runtime': 206.4727053642273} 0.0
[INFO] - dsbox.controller.controller - Training f1 = 0.0
[INFO] - dsbox.controller.controller - Validation f1 = 0.0
[INFO] - dsbox.controller.controller - ******************
[INFO] Saving training results in /dsbox_efs/runs/seed/32_wikiqa.txt
[INFO] - dsbox.controller.controller - [INFO] report: 0.0
[INFO] - dsbox.controller.controller - [INFO] UCT updated: [12.471775159475127, 122.51516650175721, 122.51516650175721, 122.51516650175721, 122.51516650175721]
[INFO] - dsbox.controller.controller - [INFO] cache size: 1, candidates: 1
[INFO] - dsbox.controller.controller - [INFO] Template 1: random_forest_classification_template selected. UCT: [12.471775159475127, 122.51516650175721, 122.51516650175721, 122.51516650175721, 122.51516650175721]
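The UCT vector drives template selection: after the baseline template scores f1 = 0.0 its entry drops to about 12.5, while the four untried templates keep a much larger value, so random_forest_classification_template is picked next. A rough sketch of a UCB1-style score, assuming DSBox uses something close to the standard formula; the exploration constant and the handling of unvisited templates are guesses, not the exact dsbox-ta2 computation:

    import math

    # UCB1-style score per template (sketch only; not the exact dsbox-ta2 formula).
    def uct_score(total_reward, visits, total_visits, c=100.0):
        if visits == 0:
            # dsbox-ta2 evidently seeds untried templates with a large finite
            # value (122.5... in the log); infinity keeps the sketch simple.
            return float("inf")
        exploitation = total_reward / visits
        exploration = c * math.sqrt(math.log(total_visits + 1) / visits)
        return exploitation + exploration

    rewards = [0.0, 0.0, 0.0, 0.0, 0.0]   # cumulative f1 per template
    visits = [1, 0, 0, 0, 0]              # only the baseline has run so far
    scores = [uct_score(r, v, sum(visits)) for r, v in zip(rewards, visits)]
    next_template = max(range(len(scores)), key=scores.__getitem__)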
[INFO] Worker started, id: <_MainProcess(MainProcess, started)> , True
[INFO] Will use cross-validation (n = 10) to choose the best primitives.
[INFO] Push@cache: ('d3m.primitives.dsbox.Denormalize', 6822275182001374655)
[INFO] Push@cache: ('d3m.primitives.datasets.DatasetToDataFrame', 6822275182001374655)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 5562649659548626099)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -5267140046402702466)
[INFO] Push@cache: ('d3m.primitives.dsbox.Profiler', 903219221365482512)
Traceback (most recent call last):
File "/user_opt/dsbox/src/dsbox-datacleaning/dsbox/datapreprocessing/cleaner/dependencies/dtype_detector.py", line 47, in detector
if dtype.loc[True][0] == temp.dropna().shape[0]:
File "/usr/local/lib/python3.6/dist-packages/pandas/core/series.py", line 623, in __getitem__
result = self.index.get_value(self, key)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py", line 2560, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas/_libs/index.pyx", line 83, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 91, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 811, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 817, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/search.py", line 523, in evaluate_pipeline
evaluation_result = self._evaluate(configuration, cache, dump2disk)
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/search.py", line 546, in _evaluate
fitted_pipeline.fit(cache=cache, inputs=[self.train_dataset1])
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/pipeline/fitted_pipeline.py", line 94, in fit
self.runtime.fit(**arguments)
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/runtime.py", line 195, in fit
primitive_arguments
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/runtime.py", line 284, in _primitive_step_fit
produce_result = model.produce(**produce_params)
File "/user_opt/dsbox/src/dsbox-datacleaning/dsbox/datapreprocessing/cleaner/data_profile.py", line 160, in produce
inputs = dtype_detector.detector(inputs)
File "/user_opt/dsbox/src/dsbox-datacleaning/dsbox/datapreprocessing/cleaner/dependencies/dtype_detector.py", line 57, in detector
_logger.error(traceback.print_exc(e))
File "/usr/lib/python3.6/traceback.py", line 159, in print_exc
print_exception(*sys.exc_info(), limit=limit, file=file, chain=chain)
File "/usr/lib/python3.6/traceback.py", line 100, in print_exception
type(value), value, tb, limit=limit).format(chain=chain):
File "/usr/lib/python3.6/traceback.py", line 497, in __init__
capture_locals=capture_locals)
File "/usr/lib/python3.6/traceback.py", line 332, in extract
if limit >= 0:
TypeError: '>=' not supported between instances of 'KeyError' and 'int'
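The final TypeError is a secondary bug that hides the real failure: dtype_detector.py catches the KeyError raised by dtype.loc[True][0] and then calls traceback.print_exc(e). The standard-library signature is traceback.print_exc(limit=None, file=None, chain=True), so the caught exception is passed as limit and the comparison limit >= 0 inside traceback.extract raises the TypeError seen above. A minimal reproduction with the fix, assuming nothing about the surrounding detector code:

    import logging
    import traceback

    _logger = logging.getLogger(__name__)

    try:
        {}[0]   # stand-in for the KeyError raised by dtype.loc[True][0]
    except KeyError:
        # traceback.print_exc(e) passes the exception object as `limit`, which is
        # exactly what produces "'>=' not supported between instances of
        # 'KeyError' and 'int'". Either line below logs the real KeyError instead:
        _logger.error(traceback.format_exc())
        # or: _logger.exception("dtype detection failed")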
[INFO] push@Candidate: (2230929946316391055,None)
Traceback (most recent call last):
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/search.py", line 378, in setup_initial_candidate
candidate.data.update(result)
TypeError: 'NoneType' object is not iterable
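The follow-on TypeError in setup_initial_candidate is a consequence of the crash above: evaluate_pipeline returned None, and candidate.data.update(None) is not valid. A small guard sketch, assuming result is the dict normally returned by evaluate_pipeline; the fallback policy is illustrative, not the dsbox-ta2 behaviour:

    # Refuse to merge a failed evaluation instead of letting dict.update(None)
    # raise; names follow the traceback, the handling policy is assumed.
    def merge_evaluation(candidate_data, result):
        if result is None:
            raise RuntimeError("pipeline evaluation returned no result; "
                               "caller should fall back to another pipeline")
        candidate_data.update(result)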
[ERROR] Initial pipeline failed; trying a random pipeline ...
{'clean_step': {'hyperparameters': {},
'primitive': 'd3m.primitives.dsbox.CleaningFeaturizer'},
'corex_step': {'hyperparameters': {},
'primitive': 'd3m.primitives.dsbox.CorexText'},
'denormalize_step': {'hyperparameters': {},
'primitive': 'd3m.primitives.dsbox.Denormalize'},
'encoder_step': {'hyperparameters': {},
'primitive': 'd3m.primitives.dsbox.Encoder'},
'extract_attribute_step': {'hyperparameters': {'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Attribute',)},
'primitive': 'd3m.primitives.data.ExtractColumnsBySemanticTypes'},
'extract_target_step': {'hyperparameters': {'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Target',
'https://metadata.datadrivendiscovery.org/types/SuggestedTarget')},
'primitive': 'd3m.primitives.data.ExtractColumnsBySemanticTypes'},
'impute_step': {'hyperparameters': {},
'primitive': 'd3m.primitives.sklearn_wrap.SKImputer'},
'model_step': {'hyperparameters': {'bootstrap': True,
'max_depth': 15,
'max_features': 'auto',
'min_samples_leaf': 1,
'min_samples_split': 2,
'n_estimators': 10},
'primitive': 'd3m.primitives.sklearn_wrap.SKRandomForestClassifier'},
'profiler_step': {'hyperparameters': {},
'primitive': 'd3m.primitives.dsbox.Profiler'},
'to_dataframe_step': {'hyperparameters': {},
'primitive': 'd3m.primitives.datasets.DatasetToDataFrame'}}
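The sampled model_step hyperparameters are the usual scikit-learn RandomForestClassifier settings that d3m.primitives.sklearn_wrap.SKRandomForestClassifier wraps. A roughly equivalent plain-sklearn construction, leaving out the D3M metadata handling, so an approximation rather than the wrapper itself:

    from sklearn.ensemble import RandomForestClassifier

    # Approximation of the sampled model_step with plain scikit-learn; the D3M
    # wrapper's column selection and metadata handling are not reproduced here.
    model = RandomForestClassifier(
        bootstrap=True,
        max_depth=15,
        max_features="auto",
        min_samples_leaf=1,
        min_samples_split=2,
        n_estimators=10,
    )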
--------------------
[INFO] Worker started, id: <_MainProcess(MainProcess, started)> , True
[INFO] Will use cross-validation (n = 10) to choose the best primitives.
[INFO] Hit@cache: ('d3m.primitives.dsbox.Denormalize', 6822275182001374655)
[INFO] Hit@cache: ('d3m.primitives.datasets.DatasetToDataFrame', 6822275182001374655)
[INFO] Hit@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 5562649659548626099)
[INFO] Hit@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -5267140046402702466)
[INFO] Push@cache: ('d3m.primitives.dsbox.Profiler', 903219221365482512)
Traceback (most recent call last):
File "/user_opt/dsbox/src/dsbox-datacleaning/dsbox/datapreprocessing/cleaner/dependencies/dtype_detector.py", line 47, in detector
if dtype.loc[True][0] == temp.dropna().shape[0]:
File "/usr/local/lib/python3.6/dist-packages/pandas/core/series.py", line 623, in __getitem__
result = self.index.get_value(self, key)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py", line 2560, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas/_libs/index.pyx", line 83, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 91, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 811, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 817, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/search.py", line 523, in evaluate_pipeline
evaluation_result = self._evaluate(configuration, cache, dump2disk)
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/search.py", line 546, in _evaluate
fitted_pipeline.fit(cache=cache, inputs=[self.train_dataset1])
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/pipeline/fitted_pipeline.py", line 94, in fit
self.runtime.fit(**arguments)
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/runtime.py", line 195, in fit
primitive_arguments
File "/user_opt/dsbox/dsbox-ta2/python/dsbox/template/runtime.py", line 284, in _primitive_step_fit
produce_result = model.produce(**produce_params)
File "/user_opt/dsbox/src/dsbox-datacleaning/dsbox/datapreprocessing/cleaner/data_profile.py", line 160, in produce
inputs = dtype_detector.detector(inputs)
File "/user_opt/dsbox/src/dsbox-datacleaning/dsbox/datapreprocessing/cleaner/dependencies/dtype_detector.py", line 57, in detector
_logger.error(traceback.print_exc(e))
File "/usr/lib/python3.6/traceback.py", line 159, in print_exc
print_exception(*sys.exc_info(), limit=limit, file=file, chain=chain)
File "/usr/lib/python3.6/traceback.py", line 100, in print_exception
type(value), value, tb, limit=limit).format(chain=chain):
File "/usr/lib/python3.6/traceback.py", line 497, in __init__
capture_locals=capture_locals)
File "/usr/lib/python3.6/traceback.py", line 332, in extract
if limit >= 0:
TypeError: '>=' not supported between instances of 'KeyError' and 'int'
Supporting files in /dsbox_efs/runs/seed-serban/32_wikiqa