Closed BadriPrudhvi closed 3 years ago
Hi @BadriPrudhvi
Thank you for raising this. I am not sure this is a bug, though. In your inference notebook or script, can you try the following:
import vaex
test = vaex.open('./data/titanic_test.csv')
model_path = './output/titanic_encoder.json'
# Use this instead of reading the json yourself.
test.state_load(model_path)
test
The reason is that vaex applies its own encoding and decoding when it writes and reads a state, so if you read the JSON yourself, as you did, some parts may not get decoded. You can look at the source if you are curious about the details.
In any case, using df.state_load(...) should work when the state has been written to disk.
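To illustrate the general idea, here is a minimal standard-library sketch. It is not vaex's actual implementation: `OrdinalMap`, `encode`, `decode`, and the `__type__` tag are hypothetical names made up for this example. It only shows how an object written with a custom encoder comes back as a plain dict when the file is read with a bare `json.load`, which is the kind of mismatch behind an error like `'dict' object has no attribute 'map_ordinal'`.

```python
import json


class OrdinalMap:
    """Hypothetical stand-in for a library-specific lookup object."""

    def __init__(self, keys):
        self.index = {key: i for i, key in enumerate(keys)}

    def map_ordinal(self, values):
        # Map each value to its position; -1 marks an unseen value.
        return [self.index.get(v, -1) for v in values]


def encode(obj):
    # Called by json.dumps for objects it cannot serialize natively.
    if isinstance(obj, OrdinalMap):
        return {"__type__": "OrdinalMap", "keys": list(obj.index)}
    raise TypeError(f"cannot encode {type(obj).__name__}")


def decode(d):
    # object_hook: called for every JSON object while loading.
    if d.get("__type__") == "OrdinalMap":
        return OrdinalMap(d["keys"])
    return d


state = {"variables": {"map_key_set": OrdinalMap(["female", "male"])}}
text = json.dumps(state, default=encode)

plain = json.loads(text)                        # no decoding step
restored = json.loads(text, object_hook=decode)  # matching decoding step

print(type(plain["variables"]["map_key_set"]).__name__)     # dict
print(type(restored["variables"]["map_key_set"]).__name__)  # OrdinalMap
print(restored["variables"]["map_key_set"].map_ordinal(["male", "female"]))  # [1, 0]
```

The plain load keeps the data but loses the object, so any later call that expects the object's methods fails. A loader that pairs with the writer avoids that.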
Please let us know if this helps.
Another way to convince yourself of this is to compare the output of df.state_get() from your training notebook with the contents of the JSON that you read from disk yourself.
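If the two structures are large, a small helper can make that comparison easier. The sketch below is a generic, hypothetical `type_diff` function (not part of vaex) that walks two nested structures and lists the paths where the value types differ. The two sample dicts are made-up toy data, not real vaex state.

```python
def type_diff(a, b, path="state"):
    """Return the paths where two nested structures hold different types."""
    if type(a) is not type(b):
        return [f"{path}: {type(a).__name__} != {type(b).__name__}"]
    diffs = []
    if isinstance(a, dict):
        # Recurse into keys present on both sides.
        for key in sorted(set(a) & set(b), key=str):
            diffs += type_diff(a[key], b[key], f"{path}[{key!r}]")
        # Report keys that exist on only one side.
        for key in sorted(set(a) ^ set(b), key=str):
            diffs.append(f"{path}[{key!r}]: present on one side only")
    elif isinstance(a, (list, tuple)):
        for i, (x, y) in enumerate(zip(a, b)):
            diffs += type_diff(x, y, f"{path}[{i}]")
    return diffs


# Toy data (assumption): an in-memory object versus its JSON representation.
in_memory = {"variables": {"map_key_set": frozenset({"female", "male"})}}
from_disk = {"variables": {"map_key_set": {"type": "set", "values": ["female", "male"]}}}

for line in type_diff(in_memory, from_disk):
    print(line)
```

In the notebooks this would be called as something like `type_diff(df_train.state_get(), model)`, where `model` is the dict you got from `json.load`. Whether and where the two differ may depend on the vaex version.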
Closing this as stale. Please re-open if needed.
Description
I am having issues using the state_set method to encode the data.
`
Code in Training Notebook
import vaex
from vaex.ml import LabelEncoder  # import assumed; it is not shown in the original report

df_train = vaex.open('./data/titanic_train.csv')
label_encoder = LabelEncoder(features=['sex'])
df_train = label_encoder.fit_transform(df_train)

model_path = './output/titanic_encoder.json'
df_train.state_write(model_path)
Code in Inference Notebook
import vaex
import json

test = vaex.open('./data/titanic_test.csv')
model_path = './output/titanic_encoder.json'
with open(model_path) as f:
    model = json.load(f)
test.state_set(model)
test
`
Error
`ERROR:MainThread:vaex:error evaluating: label_encoded_sex at rows 0-5
Traceback (most recent call last):
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 106, in evaluate
    result = self[expression]
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 166, in __getitem__
    raise KeyError("Unknown variables or column: %r" % (variable,))
KeyError: "Unknown variables or column: '_map(sex, map_key_set, map_choices, axis=None)'"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2010, in data_type
    data = self.evaluate(expression, 0, 1, filtered=False, array_type=array_type, parallel=False)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2851, in evaluate
    return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, array_type=array_type, parallel=parallel, chunk_size=chunk_size)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 6099, in _evaluate_implementation
    value = scope.evaluate(expression)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 106, in evaluate
    result = self[expression]
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 156, in __getitem__
    values = self.evaluate(expression)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 112, in evaluate
    result = eval(expression, expression_namespace, self)
  File "<string>", line 1, in <module>
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/arrow/numpy_dispatch.py", line 136, in wrapper
    result = f(*args, **kwargs)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/functions.py", line 2484, in _map
    indices = value_to_index.map_ordinal(ar) + 1
AttributeError: 'dict' object has no attribute 'map_ordinal'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 106, in evaluate
    result = self[expression]
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 166, in __getitem__
    raise KeyError("Unknown variables or column: %r" % (variable,))
KeyError: "Unknown variables or column: '_map(sex, map_key_set, map_choices, axis=None)'"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 3783, in table_part
    values = dict(zip(column_names, df.evaluate(column_names)))
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2851, in evaluate
    return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, array_type=array_type, parallel=parallel, chunk_size=chunk_size)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 6011, in _evaluate_implementation
    dtypes[expression] = dtype = df.data_type(expression).internal
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2012, in data_type
    data = self.evaluate(expression, 0, 1, filtered=True, array_type=array_type, parallel=False)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2851, in evaluate
    return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, array_type=array_type, parallel=parallel, chunk_size=chunk_size)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 6099, in _evaluate_implementation
    value = scope.evaluate(expression)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 106, in evaluate
    result = self[expression]
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 156, in __getitem__
    values = self.evaluate(expression)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 112, in evaluate
    result = eval(expression, expression_namespace, self)
  File "<string>", line 1, in <module>
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/arrow/numpy_dispatch.py", line 136, in wrapper
    result = f(*args, **kwargs)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/functions.py", line 2484, in _map
    indices = value_to_index.map_ordinal(ar) + 1
AttributeError: 'dict' object has no attribute 'map_ordinal'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 106, in evaluate
    result = self[expression]
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 166, in __getitem__
    raise KeyError("Unknown variables or column: %r" % (variable,))
KeyError: "Unknown variables or column: '_map(sex, map_key_set, map_choices, axis=None)'"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2010, in data_type
    data = self.evaluate(expression, 0, 1, filtered=False, array_type=array_type, parallel=False)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2851, in evaluate
    return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, array_type=array_type, parallel=parallel, chunk_size=chunk_size)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 6099, in _evaluate_implementation
    value = scope.evaluate(expression)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 106, in evaluate
    result = self[expression]
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 156, in __getitem__
    values = self.evaluate(expression)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 112, in evaluate
    result = eval(expression, expression_namespace, self)
  File "<string>", line 1, in <module>
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/arrow/numpy_dispatch.py", line 136, in wrapper
    result = f(*args, **kwargs)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/functions.py", line 2484, in _map
    indices = value_to_index.map_ordinal(ar) + 1
AttributeError: 'dict' object has no attribute 'map_ordinal'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 106, in evaluate
    result = self[expression]
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 166, in __getitem__
    raise KeyError("Unknown variables or column: %r" % (variable,))
KeyError: "Unknown variables or column: '_map(sex, map_key_set, map_choices, axis=None)'"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 3788, in table_part
    values[name] = df.evaluate(name)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2851, in evaluate
    return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, array_type=array_type, parallel=parallel, chunk_size=chunk_size)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 6011, in _evaluate_implementation
    dtypes[expression] = dtype = df.data_type(expression).internal
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2012, in data_type
    data = self.evaluate(expression, 0, 1, filtered=True, array_type=array_type, parallel=False)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 2851, in evaluate
    return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, array_type=array_type, parallel=parallel, chunk_size=chunk_size)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/dataframe.py", line 6099, in _evaluate_implementation
    value = scope.evaluate(expression)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 106, in evaluate
    result = self[expression]
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 156, in __getitem__
    values = self.evaluate(expression)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/scopes.py", line 112, in evaluate
    result = eval(expression, expression_namespace, self)
  File "<string>", line 1, in <module>
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/arrow/numpy_dispatch.py", line 136, in wrapper
    result = f(*args, **kwargs)
  File "/home/prudhvi/.local/lib/python3.7/site-packages/vaex/functions.py", line 2484, in _map
    indices = value_to_index.map_ordinal(ar) + 1
AttributeError: 'dict' object has no attribute 'map_ordinal'`
Software information