usc-isi-i2 / dsbox-ta2

The DSBox TA2 component
MIT License

Our choice of template breaks on LL1_twitter_sentiment - July 13 #68

Closed (serbanstan closed this issue 6 years ago)

serbanstan commented 6 years ago
(dsbox-devel-710) [stan@dsbox01 python]$ python ta2-search /nas/home/stan/dsbox/runs2/config-ll1/LL1_airline_twitter_sentiment_config.json
Namespace(configuration_file='/nas/home/stan/dsbox/runs2/config-ll1/LL1_airline_twitter_sentiment_config.json', cpus=-1, debug=False, output_prefix=None, timeout=-1)
Using configuation:
{'cpus': '10',
 'dataset_schema': '/nfs1/dsbox-repo/data/datasets/training_datasets/LL1/LL1_airline_twitter_sentiment/LL1_airline_twitter_sentiment_dataset/datasetDoc.json',
 'executables_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL1_airline_twitter_sentiment/executables',
 'pipeline_logs_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL1_airline_twitter_sentiment/logs',
 'problem_root': '/nfs1/dsbox-repo/data/datasets/training_datasets/LL1/LL1_airline_twitter_sentiment/LL1_airline_twitter_sentiment_problem',
 'problem_schema': '/nfs1/dsbox-repo/data/datasets/training_datasets/LL1/LL1_airline_twitter_sentiment/LL1_airline_twitter_sentiment_problem/problemDoc.json',
 'ram': '10Gi',
 'saved_pipeline_ID': '',
 'saving_folder_loc': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL1_airline_twitter_sentiment',
 'temp_storage_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/LL1_airline_twitter_sentiment/temp',
 'timeout': 9,
 'training_data_root': '/nfs1/dsbox-repo/data/datasets/training_datasets/LL1/LL1_airline_twitter_sentiment/LL1_airline_twitter_sentiment_dataset'}
[INFO] No test data config found! Will split the data.
[INFO] Succesfully parsed test data
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 10248}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 10248)])>,
 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table',
                    'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'),
 'structural_type': <class 'd3m.container.pandas.DataFrame'>}
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 4392}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 4392)])>,
 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table',
                    'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'),
 'structural_type': <class 'd3m.container.pandas.DataFrame'>}
[INFO] Template choices:
Template ' Test_classification_template ' has been added to template base.
[INFO] Worker started, id: <_MainProcess(MainProcess, started)>
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.dsbox.Denormalize', -8538829364442312594)
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.datasets.DatasetToDataFrame', -8538829364442312594)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 8039099151687454923)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -5328526570018433364)
[INFO] Push@cache: ('d3m.primitives.dsbox.Profiler', -7858261183674461713)
[INFO] Push@cache: ('d3m.primitives.dsbox.CleaningFeaturizer', -7858261183674461713)
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/re.py:212: FutureWarning: split() requires a non-empty pattern match.
  return _compile(pattern, flags).split(string, maxsplit)
[INFO] Push@cache: ('d3m.primitives.dsbox.CorexText', -1486110526518434158)
[INFO] Push@cache: ('d3m.primitives.dsbox.Encoder', -4770091043159596021)
[INFO] Push@cache: ('d3m.primitives.sklearn_wrap.SKImputer', -4770091043159596021)
[INFO] Push@cache: ('d3m.primitives.dsbox.IQRScaler', -8726274218306529856)
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/search.py", line 383, in evaluate_pipeline
    evaluation_result = self._evaluate(configuration, cache)
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/search.py", line 400, in _evaluate
    fitted_pipeline.fit(cache=cache, inputs=[self.train_dataset])
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/pipeline/fitted_pipeline.py", line 92, in fit
    self.runtime.fit(**arguments)
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/runtime.py", line 199, in fit
    primitives_outputs[n_step].copy(), model)
  File "<string>", line 2, in __setitem__
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/multiprocessing/managers.py", line 756, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/nfs1/dsbox-repo/stan/d3m/d3m/primitive_interfaces/base.py", line 710, in __getstate__
    'params': self.get_params(),
  File "/nfs1/dsbox-repo/stan/dsbox-cleaning/dsbox/datapreprocessing/cleaner/IQRScaler.py", line 160, in get_params
    raise ValueError("Fit not performed.")
ValueError: Fit not performed.
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/search.py", line 274, in setup_initial_candidate
    candidate.data.update(result)
TypeError: 'NoneType' object is not iterable
--------------------
[ERROR] Initial Pipeline failed, Trying a random pipeline ...
[INFO] Worker started, id: <_MainProcess(MainProcess, started)>
[INFO] Hit@cache: ('d3m.primitives.dsbox.Denormalize', -8538829364442312594)
[INFO] Hit@cache: ('d3m.primitives.datasets.DatasetToDataFrame', -8538829364442312594)
[INFO] Hit@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 8039099151687454923)
[INFO] Hit@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -5328526570018433364)
[INFO] Hit@cache: ('d3m.primitives.dsbox.Profiler', -7858261183674461713)
[INFO] Hit@cache: ('d3m.primitives.dsbox.CleaningFeaturizer', -7858261183674461713)
[INFO] Push@cache: ('d3m.primitives.dsbox.CorexText', -5167848818974117357)
[INFO] Push@cache: ('d3m.primitives.dsbox.DoNothing', -7684377629081729978)
[INFO] Push@cache: ('d3m.primitives.dsbox.IterativeRegressionImputation', -7684377629081729978)
[INFO] Push@cache: ('d3m.primitives.dsbox.IQRScaler', -4600054221552837017)
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/search.py", line 383, in evaluate_pipeline
    evaluation_result = self._evaluate(configuration, cache)
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/search.py", line 400, in _evaluate
    fitted_pipeline.fit(cache=cache, inputs=[self.train_dataset])
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/pipeline/fitted_pipeline.py", line 92, in fit
    self.runtime.fit(**arguments)
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/runtime.py", line 199, in fit
    primitives_outputs[n_step].copy(), model)
  File "<string>", line 2, in __setitem__
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/multiprocessing/managers.py", line 756, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/nfs1/dsbox-repo/stan/d3m/d3m/primitive_interfaces/base.py", line 710, in __getstate__
    'params': self.get_params(),
  File "/nfs1/dsbox-repo/stan/dsbox-cleaning/dsbox/datapreprocessing/cleaner/IQRScaler.py", line 160, in get_params
    raise ValueError("Fit not performed.")
ValueError: Fit not performed.
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/search.py", line 274, in setup_initial_candidate
    candidate.data.update(result)
TypeError: 'NoneType' object is not iterable
--------------------
[ERROR] Initial Pipeline failed, Trying a random pipeline ...
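
Reading the two tracebacks together, the chain seems to be this: storing the unfitted IQRScaler in the shared cache pickles it, `__getstate__` calls `get_params()`, and that raises `ValueError: Fit not performed.` The evaluation then apparently yields `None`, and `candidate.data.update(result)` fails with the `TypeError`. The sketch below reproduces that chain in isolation. It is a minimal illustration, not the dsbox or d3m code: `UnfittedScaler`, `evaluate`, and `safe_update` are hypothetical names, and the assumption that `evaluate_pipeline` returns `None` after the exception is inferred from the log only.

```python
import pickle


class UnfittedScaler:
    """Hypothetical stand-in for a primitive such as IQRScaler."""

    def __init__(self):
        self._fitted = False
        self._center = None

    def fit(self, values):
        # Toy "fit": remember the median of the input values.
        ordered = sorted(values)
        self._center = ordered[len(ordered) // 2]
        self._fitted = True

    def get_params(self):
        if not self._fitted:
            raise ValueError("Fit not performed.")
        return {"center": self._center}

    def __getstate__(self):
        # Mirrors what the traceback shows: pickling asks for the params.
        return {"params": self.get_params()}

    def __setstate__(self, state):
        self._center = state["params"]["center"]
        self._fitted = True


def evaluate(model):
    """Assumed behaviour: swallow the pickling error and return None."""
    try:
        pickle.dumps(model)  # the cache write in the log pickles the model
    except ValueError:
        return None
    return {"status": "ok"}


def safe_update(candidate_data, result):
    """Guard the update so a failed evaluation does not raise TypeError."""
    if result is None:
        return False
    candidate_data.update(result)
    return True


if __name__ == "__main__":
    model = UnfittedScaler()
    result = evaluate(model)          # None: pickling an unfitted model fails
    print(safe_update({}, result))    # False instead of a TypeError

    model.fit([3, 1, 2])
    print(safe_update({}, evaluate(model)))  # True once the model is fitted
```

If this reading is right, there are two independent places to look: why the scaler reaches the cache write without having been fitted, and whether `setup_initial_candidate` should check for a `None` result before updating the candidate.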
kyao commented 6 years ago

Which pipeline should we use for this dataset?

RqS commented 6 years ago

I don't see this error anymore.