Closed: proska closed this issue 6 years ago
@proska, can you please try to fix it? Tanay wrote the code originally, but he is on vacation until next Tuesday.
Pulling the latest will probably work. I think I have already fixed it.
On Thu, Jul 19, 2018 at 11:31 AM Ehsan Qasemi notifications@github.com wrote:
Running command:
python ta2-search ~/dsbox/runs2/config-seed-test/LL0_690_visualizing_galaxy/search_config.json
Error message:
/nfs1/dsbox-repo/qasemi/dsbox-cleaning/dsbox/datapreprocessing/cleaner/dependencies/date_extractor.py:408: UserWarning: DateExtractor: Failed to set timezone as America/Los_Angeles. Catch offset must be a timedelta representing a whole number of minutes, not datetime.timedelta(-1, 58022).
  warn('DateExtractor: Failed to set timezone as ' + str(self.default_tz) + '. Catch ' + str(e))
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/search.py", line 488, in evaluate_pipeline
    evaluation_result = self._evaluate(configuration, cache, dump2disk)
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/search.py", line 508, in _evaluate
    fitted_pipeline.fit(cache=cache, inputs=[self.train_dataset])
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/pipeline/fitted_pipeline.py", line 94, in fit
    self.runtime.fit(arguments)
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/runtime.py", line 194, in fit
    primitive_arguments
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/runtime.py", line 282, in _primitive_step_fit
    produce_result = model.produce(produce_params)
  File "/nfs1/dsbox-repo/qasemi/dsbox-cleaning/dsbox/datapreprocessing/cleaner/data_profile.py", line 169, in produce
    cols = self._DateFeaturizer.detect_date_columns(self._sample_df)
  File "/nfs1/dsbox-repo/qasemi/dsbox-cleaning/dsbox/datapreprocessing/cleaner/dependencies/date_featurizer_org.py", line 100, in detect_date_columns
    if self._parse_column(sampled_df, idx) is not None:
  File "/nfs1/dsbox-repo/qasemi/dsbox-cleaning/dsbox/datapreprocessing/cleaner/dependencies/date_featurizer_org.py", line 303, in _parse_column
    warn("Warning: multiple dates detected in column: " + idx)
TypeError: must be str, not int
Full Output:
(d3m-devel) [qasemi@dsbox02 python]$ python ta2-search ~/dsbox/runs2/config-seed-test/LL0_690_visualizing_galaxy/search_config.json
Namespace(configuration_file='/nas/home/qasemi/dsbox/runs2/config-seed-test/LL0_690_visualizing_galaxy/search_config.json', cpus=-1, debug=False, output_prefix=None, timeout=-1)
Using configuation: {'dataset_schema': '/nfs1/dsbox-repo/data/datasets-v31/training_datasets/LL0/LL0_690_visualizing_galaxy/LL0_690_visualizing_galaxy_dataset/datasetDoc.json', 'executables_root': '/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/output/LL0_690_visualizing_galaxy/executables', 'pipeline_logs_root': '/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/output/LL0_690_visualizing_galaxy/logs', 'problem_root': '/nfs1/dsbox-repo/data/datasets-v31/training_datasets/LL0/LL0_690_visualizing_galaxy/LL0_690_visualizing_galaxy_problem', 'problem_schema': '/nfs1/dsbox-repo/data/datasets-v31/training_datasets/LL0/LL0_690_visualizing_galaxy/LL0_690_visualizing_galaxy_problem/problemDoc.json', 'saved_pipeline_ID': '', 'saving_folder_loc': '/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/output/LL0_690_visualizing_galaxy', 'temp_storage_root': '/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/output/LL0_690_visualizing_galaxy/temp', 'timeout': 0, 'training_data_root': '/nfs1/dsbox-repo/data/datasets-v31/training_datasets/LL0/LL0_690_visualizing_galaxy/LL0_690_visualizing_galaxy_dataset', 'user_problems_root': '/nas/home/qasemi/dsbox/runs2/output-seed/LL0_690_visualizing_galaxy/user_problems'}
[INFO] No test data config found! Will split the data.
[INFO] - dsbox.controller.controller - Top level output directory: /nfs1/dsbox-repo/qasemi/dsbox-ta2/python/output/LL0_690_visualizing_galaxy
[INFO] Succesfully parsed test data
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 223}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 223)])>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'structural_type': <class 'd3m.container.pandas.DataFrame'>}
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 100}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 100)])>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'structural_type': <class 'd3m.container.pandas.DataFrame'>}
[INFO] Template choices:
Template ' Default_regression_template ' has been added to template base.
[INFO] Template 0:Default_regression_template Selected. UCT:[100.0]
[INFO] Worker started, id: <_MainProcess(MainProcess, started)> , True
/nfs1/dsbox-repo/qasemi/miniconda/envs/d3m-devel/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from
`float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.dsbox.Denormalize', 6809185080492433979)
/nfs1/dsbox-repo/qasemi/miniconda/envs/d3m-devel/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.datasets.DatasetToDataFrame', 6809185080492433979)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -3421672811617791271)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -5556157206724971364)
[INFO] Push@cache: ('d3m.primitives.dsbox.Profiler', -6948590468534388086)
/nfs1/dsbox-repo/qasemi/dsbox-cleaning/dsbox/datapreprocessing/cleaner/dependencies/date_extractor.py:408: UserWarning: DateExtractor: Failed to set timezone as America/Los_Angeles. Catch offset must be a timedelta representing a whole number of minutes, not datetime.timedelta(-1, 58022).
  warn('DateExtractor: Failed to set timezone as ' + str(self.default_tz) + '. Catch ' + str(e))
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/search.py", line 488, in evaluate_pipeline
    evaluation_result = self._evaluate(configuration, cache, dump2disk)
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/search.py", line 508, in _evaluate
    fitted_pipeline.fit(cache=cache, inputs=[self.train_dataset])
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/pipeline/fitted_pipeline.py", line 94, in fit
    self.runtime.fit(arguments)
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/runtime.py", line 194, in fit
    primitive_arguments
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/runtime.py", line 282, in _primitive_step_fit
    produce_result = model.produce(produce_params)
  File "/nfs1/dsbox-repo/qasemi/dsbox-cleaning/dsbox/datapreprocessing/cleaner/data_profile.py", line 169, in produce
    cols = self._DateFeaturizer.detect_date_columns(self._sample_df)
  File "/nfs1/dsbox-repo/qasemi/dsbox-cleaning/dsbox/datapreprocessing/cleaner/dependencies/date_featurizer_org.py", line 100, in detect_date_columns
    if self._parse_column(sampled_df, idx) is not None:
  File "/nfs1/dsbox-repo/qasemi/dsbox-cleaning/dsbox/datapreprocessing/cleaner/dependencies/date_featurizer_org.py", line 303, in _parse_column
    warn("Warning: multiple dates detected in column: " + idx)
TypeError: must be str, not int
[INFO] push@Candidate: (6628973257765379899,None)
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/search.py", line 377, in setup_initial_candidate
    candidate.data.update(result)
TypeError: 'NoneType' object is not iterable
[ERROR] Initial Pipeline failed, Trying a random pipeline ...
{'clean_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.CleaningFeaturizer'}, 'corex_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.CorexText'}, 'denormalize_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.Denormalize'}, 'encoder_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.Encoder'}, 'extract_attribute_step': {'hyperparameters': {'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Attribute',)}, 'primitive': 'd3m.primitives.data.ExtractColumnsBySemanticTypes'}, 'extract_target_step': {'hyperparameters': {'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Target', 'https://metadata.datadrivendiscovery.org/types/SuggestedTarget')}, 'primitive': 'd3m.primitives.data.ExtractColumnsBySemanticTypes'}, 'impute_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.sklearn_wrap.SKImputer'}, 'model_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.sklearn_wrap.SKARDRegression'}, 'profiler_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.Profiler'}, 'to_dataframe_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.datasets.DatasetToDataFrame'}}
[INFO] hit@Candidate: (6628973257765379899,None)
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/search.py", line 371, in setup_initial_candidate
    raise ValueError("Candidate is not compatible with the dataset")
ValueError: Candidate is not compatible with the dataset
[ERROR] Initial Pipeline failed, Trying a random pipeline ...
{'clean_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.CleaningFeaturizer'}, 'corex_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.CorexText'}, 'denormalize_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.Denormalize'}, 'encoder_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.Encoder'}, 'extract_attribute_step': {'hyperparameters': {'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Attribute',)}, 'primitive': 'd3m.primitives.data.ExtractColumnsBySemanticTypes'}, 'extract_target_step': {'hyperparameters': {'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Target', 'https://metadata.datadrivendiscovery.org/types/SuggestedTarget')}, 'primitive': 'd3m.primitives.data.ExtractColumnsBySemanticTypes'}, 'impute_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.sklearn_wrap.SKImputer'}, 'model_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.sklearn_wrap.SKARDRegression'}, 'profiler_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.dsbox.Profiler'}, 'to_dataframe_step': {'hyperparameters': {}, 'primitive': 'd3m.primitives.datasets.DatasetToDataFrame'}}
Traceback (most recent call last):
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/controller/controller.py", line 528, in train
    cache_bundle=(cache, candidate_cache),
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/controller/controller.py", line 375, in search_template
    report = search.search_one_iter(candidate_in=candidate, cache_bundle=cache_bundle)
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/search.py", line 191, in search_one_iter
    self.setup_initial_candidate(candidate_in, cache, candidate_cache)
  File "/nfs1/dsbox-repo/qasemi/dsbox-ta2/python/dsbox/template/search.py", line 386, in setup_initial_candidate
    raise ValueError("Invalid initial candidate")
ValueError: Invalid initial candidate
Just update dsbox-cleaning. It should be fixed now.
fixed in d95d73494628e842ebbd2e368faa5f0cddf648e9
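For anyone who hits the same traceback: the first TypeError comes from the warn call in _parse_column, which concatenates the column index (an int for this dataset) onto a string. Below is a minimal sketch of the failure and one way to avoid it. The function names are hypothetical and only mirror the line shown in the traceback; this is not necessarily what the commit above changed.

```python
import warnings


def warn_multiple_dates_buggy(idx):
    # Mirrors the failing line in the traceback: str + int raises TypeError
    # when idx is an integer column index.
    warnings.warn("Warning: multiple dates detected in column: " + idx)


def warn_multiple_dates(idx):
    # Assumed fix: convert the index to str before concatenating, so both
    # integer and string column labels work.
    warnings.warn("Warning: multiple dates detected in column: " + str(idx))


def demo():
    # The buggy version fails before any warning is emitted.
    try:
        warn_multiple_dates_buggy(3)
        raised = False
    except TypeError:
        raised = True

    # The fixed version emits the warning with the index rendered as text.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warn_multiple_dates(3)
    return raised, str(caught[0].message)


if __name__ == "__main__":
    print(demo())
```

The later "'NoneType' object is not iterable" and "Invalid initial candidate" errors in the full output appear to follow from this first failure, since the pipeline evaluation never returns a result.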