Expected behavior
Text mining on text entered in Create Table should be possible: use Edit Domain to force the column to be interpreted as text (rather than as categorical data), connect Corpus to Edit Domain, and select the text variable under "Used text features".
Actual behavior
Connecting Corpus to Edit Domain results in an error:
Error encountered in widget Corpus:
Traceback (most recent call last):
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/orangecontrib/text/widgets/owcorpus.py", line 336, in update_feature_selection
    corpus = self.corpus.copy()
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/orangecontrib/text/corpus.py", line 481, in copy
    c = super().copy()
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/Orange/data/table.py", line 1491, in copy
    t = self.__class__(self)
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/orangecontrib/text/corpus.py", line 71, in __new__
    return super().__new__(cls, *args, **kwargs)
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/Orange/data/table.py", line 718, in __new__
    return cls.from_table(args[0].domain, args[0], **kwargs)
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/orangecontrib/text/corpus.py", line 558, in from_table
    Corpus.retain_preprocessing(source, c, row_indices)
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/orangecontrib/text/corpus.py", line 649, in retain_preprocessing
    new.text_features = list(filter(None, [
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/orangecontrib/text/corpus.py", line 650, in
    new._find_identical_feature(tf)
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/orangecontrib/text/corpus.py", line 129, in _find_identical_feature
    var == feature
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/Orange/data/variable.py", line 418, in __eq__
    and var1._compute_value == var2._compute_value
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/Orange/preprocess/transformation.py", line 240, in __eq__
    and np.allclose(self.lookup_table, other.lookup_table,
  File "", line 180, in allclose
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/numpy/core/numeric.py", line 2265, in allclose
    res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
  File "", line 180, in isclose
  File "/Applications/Orange.app/Contents/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/numpy/core/numeric.py", line 2372, in isclose
    xfin = isfinite(x)
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
If the error is dismissed and the text variable is then selected under "Used text features", Corpus ignores the selection and outputs the default corpus, book-excerpts.tab, instead of the entered text.
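The last frames of the traceback suggest that `np.allclose` was handed a lookup table holding non-numeric values, which `isfinite` cannot process. The snippet below is a minimal sketch of that failure using plain NumPy, with no Orange code involved. The `strings` array is a made-up stand-in for the lookup table, and `lookup_tables_equal` is a hypothetical helper showing one possible way to compare such tables safely; it is not the actual Orange implementation or fix.

```python
import numpy as np


def lookup_tables_equal(a, b):
    """Hypothetical helper: compare two lookup tables without assuming numeric data."""
    a, b = np.asarray(a), np.asarray(b)
    if a.shape != b.shape:
        return False
    # Tolerance-based comparison only makes sense for numeric dtypes.
    if np.issubdtype(a.dtype, np.number) and np.issubdtype(b.dtype, np.number):
        return bool(np.allclose(a, b, equal_nan=True))
    # For strings or generic objects, fall back to exact element-wise equality.
    return bool(np.array_equal(a, b))


# Assumed stand-in for a lookup table that maps category indices to text values.
strings = np.array(["first text", "second text"], dtype=object)

# isfinite has no implementation for string objects, so this raises TypeError,
# the same exception type that ends the traceback above.
try:
    np.isfinite(strings)
    isfinite_failed = False
except TypeError:
    isfinite_failed = True

print(isfinite_failed)
print(lookup_tables_equal(strings, strings.copy()))
```

Checking the dtype before choosing the comparison keeps the tolerance-based path for numeric tables while avoiding the `isfinite` call for text.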
Educational version
0.8.0
Orange version
3.37.0
Steps to reproduce the behavior
Open the workflow "Create table with text.ows.zip", then connect Corpus to Edit Domain to reproduce the error described above.