Open kyao opened 5 years ago
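The d3m.primitives.feature_construction.corex_text.CorexText primitive does not add column metadata for the corex_* columns it appends to its output. This causes wrapped SKLearn primitives to crash. In the session below, column 27 of the step 5 output still has metadata, while columns 28 and 29 return an empty result.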
(Pdb) primitives_outputs['steps.4.produce'].shape
(771, 29)
(Pdb) primitives_outputs['steps.4.produce'].head()
   d3mIndex       Player  Number_seasons  Games_played  At_bats  ...  Position_nan  Hall_of_Fame_1  Hall_of_Fame_0  Hall_of_Fame_2  Hall_of_Fame_nan
0         0   HANK_AARON              23          3298    12364  ...             0               1               0               0                 0
1         1  JERRY_ADAIR              13          1165     4019  ...             0               0               1               0                 0
2         4   JOE_ADCOCK              17          1959     6606  ...             0               0               1               0                 0
3         5  TOMMIE_AGEE              12          1129     3912  ...             0               0               1               0                 0
4         6  LUIS_AGUAYO              10           568     1104  ...             0               0               1               0                 0

[5 rows x 29 columns]
(Pdb) primitives_outputs['steps.5.produce'].shape
(771, 33)
(Pdb) primitives_outputs['steps.5.produce'].head()
   d3mIndex  Number_seasons  Games_played  At_bats  Runs  Hits  Doubles  ...  Hall_of_Fame_2  Hall_of_Fame_nan   corex_0   corex_1   corex_2   corex_3   corex_4
0         0              23          3298    12364  2174  3771      624  ...               0                 0  0.348001  0.496024  0.494908  0.399449  0.515824
1         1              13          1165     4019   378  1022      163  ...               0                 0  0.448118  0.911324  0.259670  0.250383  0.492069
2         4              17          1959     6606   823  1832      295  ...               0                 0  0.216636  0.308273  0.448800  0.499551  0.386128
3         5              12          1129     3912   558   999      170  ...               0                 0  0.514805  0.175734  0.741300  0.566758  0.526204
4         6              10           568     1104   142   260       43  ...               0                 0  0.445903  0.638172  0.489961  0.506785  0.222282

[5 rows x 33 columns]
(Pdb) primitives_outputs['steps.5.produce'].metadata.query((metadata_base.ALL_ELEMENTS, 27))
<FrozenOrderedDict OrderedDict([('structural_type', <class 'int'>), ('semantic_types', ('http://schema.org/Integer', 'https://metadata.datadrivendiscovery.org/types/Attribute'))])>
(Pdb) primitives_outputs['steps.5.produce'].metadata.query((metadata_base.ALL_ELEMENTS, 28))
<FrozenOrderedDict OrderedDict()>
(Pdb) primitives_outputs['steps.5.produce'].metadata.query((metadata_base.ALL_ELEMENTS, 29))
<FrozenOrderedDict OrderedDict()>
(Pdb) primitives_outputs['steps.5.produce'].metadata.query((metadata_base.ALL_ELEMENTS, ))
<FrozenOrderedDict OrderedDict([('dimension', <FrozenOrderedDict OrderedDict([('name', 'columns'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularColumn',)), ('length', 33)])>)])>
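To make the check in the pdb session repeatable, here is a rough sketch of a helper that lists every column whose metadata query comes back empty. It is not part of d3m: the name find_columns_without_metadata and the toy_metadata dict are made up for illustration, and the helper only assumes that the query callable returns a mapping per column index. With a real d3m DataFrame, the callable would presumably wrap the same metadata.query((metadata_base.ALL_ELEMENTS, i)) call used above.

```python
from typing import Any, Callable, Dict, List, Mapping


def find_columns_without_metadata(
    query_column: Callable[[int], Mapping[str, Any]],
    n_columns: int,
) -> List[int]:
    """Return the indices of columns whose metadata mapping is empty.

    `query_column` is a hypothetical stand-in for something like
    `lambda i: df.metadata.query((metadata_base.ALL_ELEMENTS, i))`.
    """
    return [i for i in range(n_columns) if len(query_column(i)) == 0]


# Toy stand-in for a metadata store (illustrative data, not from the issue):
# columns 0 and 1 carry metadata, columns 2 and 3 carry none.
toy_metadata: Dict[int, Dict[str, Any]] = {
    0: {"structural_type": int, "semantic_types": ("http://schema.org/Integer",)},
    1: {"structural_type": int, "semantic_types": ("http://schema.org/Integer",)},
}

missing = find_columns_without_metadata(lambda i: toy_metadata.get(i, {}), 4)
print(missing)  # [2, 3]
```

Running something like this on the step 5 output would show exactly which columns the downstream SKLearn wrapper cannot interpret.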