Open GXY2017 opened 3 years ago
I have the same issue. It seems to come from the use of -1 as the default fill value when the dtype is object and the input is a NumPy array:
import pyitlib.discrete_random_variable as drv
import numpy as np
drv.entropy_conditional(np.array([['A', 'B', 'R'], ['A', 'B', 'A']]))
yields
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\pyitlib\discrete_random_variable.py", line 3495, in entropy_conditional
fill_value_Alphabet_Y))
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\pyitlib\discrete_random_variable.py", line 4689, in _map_observations_to_integers
Fill_values = [L.transform(np.atleast_1d(f)) for f in Fill_values]
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\pyitlib\discrete_random_variable.py", line 4689, in <listcomp>
Fill_values = [L.transform(np.atleast_1d(f)) for f in Fill_values]
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\sklearn\preprocessing\_label.py", line 277, in transform
_, y = _encode(y, uniques=self.classes_, encode=True)
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\sklearn\preprocessing\_label.py", line 122, in _encode
check_unknown=check_unknown)
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\sklearn\preprocessing\_label.py", line 51, in _encode_numpy
% str(diff))
ValueError: y contains previously unseen labels: [-1]
However, when I specify a different fill value, I get a different error:
drv.entropy_conditional(np.array([['A', 'B', 'R'], ['A', 'B', 'A']]), fill_value='na')
yields
File "<ipython-input-7-f3179c2d30c8>", line 1, in <module>
drv.entropy_conditional(np.array([['A', 'B', 'R'], ['A', 'B', 'A']]), fill_value='na')
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\pyitlib\discrete_random_variable.py", line 3495, in entropy_conditional
fill_value_Alphabet_Y))
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\pyitlib\discrete_random_variable.py", line 4695, in _map_observations_to_integers
assert(np.all([A.dtype == 'int' for A in Symbol_matrices]))
AssertionError
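The traceback does not show which dtype failed the assertion. One possible explanation, which is an assumption and not confirmed here, is that the encoded arrays are integers of a different width than the one the string 'int' resolves to on this platform. The sketch below only illustrates that difference with plain NumPy; is_integer_array is a hypothetical helper, not part of pyitlib.

```python
import numpy as np

def is_integer_array(a):
    """Hypothetical helper: True for any integer dtype, regardless of width."""
    return np.issubdtype(np.asarray(a).dtype, np.integer)

codes32 = np.array([0, 1, 2], dtype=np.int32)
codes64 = np.array([0, 1, 2], dtype=np.int64)

# Comparing against the string 'int' matches only one specific width,
# and which one depends on the platform and the NumPy version.
print(codes32.dtype == 'int', codes64.dtype == 'int')

# The width-independent check accepts both arrays.
print(is_integer_array(codes32), is_integer_array(codes64))
```

If the width mismatch is indeed the cause, a check based on np.issubdtype would be less platform-sensitive than comparing to 'int'.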
Finally, using an explicit masked NumPy array leads to yet another error:
drv.entropy_conditional(np.ma.array([['A', 'B', 'R'], ['A', 'B', 'A']]))
File "<ipython-input-9-3036d1e07c6a>", line 1, in <module>
drv.entropy_conditional(np.ma.array([['A', 'B', 'R'], ['A', 'B', 'A']]))
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\pyitlib\discrete_random_variable.py", line 3441, in entropy_conditional
Y, fill_value_Y = _sanitise_array_input(Y, fill_value)
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\pyitlib\discrete_random_variable.py", line 4709, in _sanitise_array_input
if np.any(np.equal(X, None)) or fill_value is None:
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\numpy\ma\core.py", line 3019, in __array_finalize__
self._fill_value = _check_fill_value(self._fill_value, self.dtype)
File "C:\Miniconda3\envs\tools_py37\lib\site-packages\numpy\ma\core.py", line 480, in _check_fill_value
raise TypeError(err_msg % (fill_value, ndtype))
TypeError: Cannot convert fill_value N/A to dtype bool
Python 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 21:48:41) [MSC v.1916 64 bit (AMD64)]
Before I go further, I’d like to report this error. I know this library is not for Python 3.
I tried this simple example and got the error in the title:
drv.entropy(['e', 'f', 'g', 't'], base=np.exp(1))
I suspect the error comes from scikit-learn's LabelEncoder.transform(), which requires fit() to be called first.
see this link: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
Update: the problem is in _map_observations_to_integers(). Everything needs a mask, and passing a fill_value triggers errors.
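A possible workaround, sketched here as an assumption and not tested against pyitlib, is to map the string observations to non-negative integer codes before passing them to the library, so that an integer fill value such as -1 cannot clash with the data's dtype. The helper name encode_symbols is made up for illustration.

```python
import numpy as np

def encode_symbols(X):
    """Hypothetical helper: map each distinct symbol in X to an integer code.

    Returns (codes, alphabet), where codes has the same shape as X and
    alphabet[codes] reproduces X.
    """
    X = np.asarray(X)
    alphabet, inverse = np.unique(X, return_inverse=True)
    # The shape of `inverse` differs between NumPy versions, so reshape explicitly.
    return inverse.reshape(X.shape), alphabet

X = np.array([['A', 'B', 'R'], ['A', 'B', 'A']])
codes, alphabet = encode_symbols(X)
print(codes)
print(alphabet)
```

The integer array codes could then be passed to drv.entropy_conditional in place of X; whether that avoids all three errors above has not been verified.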