Toloka / toloka-kit

Toloka-Kit is a Python library for working with Toloka API.
https://toloka.ai

[BUG] Can't get pool which has specified 'experimental group' parameter in audience filter settings #157

Closed pakkron closed 1 year ago

pakkron commented 1 year ago

Observed behavior

Hello! I'm trying to get pool info using the get_pool() method (https://toloka.ai/en/docs/toloka-kit/reference/toloka.client.TolokaClient.get_pool). The script crashes with an error: IterableValidationError: While structuring typing.List[toloka.client.filter.FilterCondition] (1 sub-exception).

This happens when the pool has the "experimental group" parameter specified in its audience filter settings. [image: audience_filter_settings]

If I don't specify the "experimental group" parameter in the pool settings, get_pool() works properly without crashing.

Expected behavior

get_pool() works without crashing and returns a Pool object.

Python Version

3.8

Toloka-Kit Version

1.1.1

Other Packages Versions

absl-py aiohttp aiosignal anyio appdirs==1.4.4 argcomplete argon2-cffi argon2-cffi-bindings astor==0.8.1 astroid asttokens astunparse==1.6.3 async-timeout attrs audioread autopep8 Babel backcall bayesian-optimization beautifulsoup4 bleach blinker==1.4 blis bokeh boto==2.49.0 brotlipy==0.7.0 cachetools catalogue catboost category-encoders cattrs==22.1.0 certifi cffi chardet charset-normalizer click cloudpickle colorama conda==22.9.0 conda-content-trust conda-package-handling conda-token crcmod cryptography cssselect==1.1.0 cssutils cudf==0.19.2 cupy cx-Oracle cycler cymem Cython cytoolz==0.11.0 dask dask-cuda==21.8.0 dask-cudf==0.19.2 dask-glm==0.2.0 dask-labextension==5.3.0 dask-ml debugpy decorator defusedxml dill distributed docstring-parser==0.15 emails en-core-web-sm entrypoints exceptiongroup==1.1.0 executing fastavro fasteners fastjsonschema fastrlock==0.6 fasttext filelock findspark flake8 flatbuffers frozenlist fsspec funcy future==0.18.2 gast gcs-oauth2-boto-plugin gensim gitdb==4.0.9 GitPython==3.1.29 google-api-core google-apitools google-auth google-auth-oauthlib google-cloud-bigquery==1.22.0 google-cloud-core google-cloud-storage google-pasta google-reauth google-resumable-media googleapis-common-protos grpcio gsutil h11==0.14.0 h5py HeapDict httpcore==0.16.3 httplib2 httpx==0.23.3 huggingface-hub hyperopt idna imagecodecs imageio implicit importlib-metadata importlib-resources ipykernel ipyparallel ipython ipython-genutils isort jedi==0.17.0 Jinja2 joblib json5 jsonschema jupyter-client jupyter-contrib-core jupyter-contrib-nbextensions jupyter-core jupyter-highlight-selected-word jupyter-kernel-gateway==2.5.1 jupyter-latex-envs jupyter-lsp jupyter-nbextensions-configurator jupyter-resource-usage jupyter-server jupyter-server-mathjax==0.2.6 jupyter-server-proxy jupyterlab jupyterlab-execute-time jupyterlab-git==0.39.3 jupyterlab-lsp jupyterlab-pygments jupyterlab-server jupyterlab-system-monitor jupyterlab-templates==0.3.1 jupyterlab-topbar Keras 
Keras-Preprocessing kiwisolver langcodes lazy-object-proxy lckr-jupyterlab-variableinspector==3.0.9 librosa lightfm line-profiler==3.5.1 llvmlite==0.38.0 locket lxml==4.5.1 Markdown MarkupSafe matplotlib matplotlib-inline mccabe==0.6.1 metakernel mistune==0.8.4 mkl-fft==1.3.1 mkl-random mkl-service==2.4.0 mock monotonic==1.5 msgpack multidict multipledispatch==0.6.0 murmurhash nbclassic nbclient nbconvert nbdime==3.1.1 nbformat nest-asyncio networkx nltk notebook numba numexpr numpy nvtx oauth2client==4.1.3 oauthlib opt-einsum packaging pandas==1.2.4 pandocfilters parso partd pathy patsy==0.5.2 pbr pexpect pickleshare Pillow==9.2.0 pkgutil-resolve-name platformdirs plotly pluggy pooch portalocker premailer preshed prometheus-client prompt-toolkit protobuf==3.14.0 psutil ptyprocess pure-eval py4j pyarrow==1.0.1 pyasn1 pyasn1-modules==0.2.8 pybind11==2.9.2 pycodestyle pycosat==0.6.3 pycparser pydantic==1.8.2 pydocstyle pyflakes Pygments PyJWT pykerberos==1.2.1 pyLDAvis pylint pymongo==3.12.0 PyMySQL pyngrok pynndescent pynvml pyOpenSSL pyparsing pyrsistent PySocks pyspark python-dateutil python-jsonrpc-server python-language-server pytorch-lightning pytz pyu2f PyWavelets PyYAML==6.0 pyzmq redis regex requests requests-oauthlib==1.3.0 resampy retry-decorator rfc3986==1.5.0 rmm==0.19.0 rope rsa ruamel-yaml-conda sacremoses sasl==0.2.1 scikit-image scikit-learn scipy seaborn selenium Send2Trash sentence-transformers sentencepiece==0.1.95 shap shellingham simpervisor simplejson==3.18.1 six slicer smart-open smmap==5.0.0 sniffio snowballstemmer sortedcontainers soundfile soupsieve spacy spacy-legacy spacy-loggers spylon==0.3.0 spylon-kernel==0.4.1 srsly stack-data statsmodels tblib tenacity tensorboard tensorboard-data-server tensorboard-plugin-wit tensorflow tensorflow-estimator tensorflow-hub==0.8.0 termcolor==1.1.0 terminado testpath thinc threadpoolctl thrift==0.15.0 thrift-sasl==0.4.2 tifffile tokenizers toloka-kit==1.1.1 toml tomli tomlkit toolz torch==1.8.1 
torchvision==0.2.2 tornado tqdm traitlets transformers typer typing-extensions ujson umap-learn urllib3 wasabi wcwidth webencodings==0.5.1 websocket-client Werkzeug wordcloud wrapt xgboost xlrd yadisk==1.2.17 yapf yarl zict==2.1.0 zipp

Example code

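import toloka.client as toloka

# Replace 'token' with a real Toloka sandbox token.
toloka_token_sandbox = 'token'
toloka_env_sandbox = 'SANDBOX'

toloka_client_sandbox = toloka.TolokaClient(toloka_token_sandbox, toloka_env_sandbox)

# Fails when the pool's audience filter includes the "experimental group" parameter.
toloka_client_sandbox.get_pool(pool_id='1451087')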

Relevant log output

---------------------------------------------------------------------------
IterableValidationError                   Traceback (most recent call last)
Input In [9], in <cell line: 2>()
      1 # project_id = 133289
----> 2 toloka_client_sandbox.get_pool(pool_id='1451087')

File ~/.local/lib/python3.8/site-packages/toloka/util/_managing_headers.py:72, in add_headers.<locals>.wrapper.<locals>.wrapped(*args, **kwargs)
     69 if top_level_method_var not in ctx:
     70     stack.enter_context(set_variable(top_level_method_var, func.__name__))
---> 72 return run_in_current_context(func, *args, **kwargs)

File ~/.local/lib/python3.8/site-packages/toloka/util/_managing_headers.py:99, in run_in_current_context(func, *args, **kwargs)
     93 def run_in_current_context(func, *args, **kwargs):
     94     """Runs the function using context state from the moment of calling run_in_current_context function.
     95 
     96     Unlike Context.run supports generators, async generators and functions that return an awaitable (e.g. coroutines).
     97     """
---> 99     result = func(*args, **kwargs)
    100     if inspect.isawaitable(result):
    101         # capture context by running inside task
    102         loop = asyncio.get_event_loop()

File ~/.local/lib/python3.8/site-packages/toloka/client/__init__.py:1558, in TolokaClient.get_pool(self, pool_id)
   1545 """Gets pool data from Toloka.
   1546 
   1547 Args:
   (...)
   1555     ...
   1556 """
   1557 response = self._request('get', f'/v1/pools/{pool_id}')
-> 1558 return structure(response, Pool)

File ~/.local/lib/python3.8/site-packages/cattrs/converters.py:281, in Converter.structure(self, obj, cl)
    278 def structure(self, obj: Any, cl: Type[T]) -> T:
    279     """Convert unstructured Python data structures to structured data."""
--> 281     return self._structure_func.dispatch(cl)(obj, cl)

File ~/.local/lib/python3.8/site-packages/toloka/client/_converter.py:17, in <lambda>(data, type_)
     10 from ..util._extendable_enum import ExtendableStrEnum
     13 converter = cattr.Converter()
     15 converter.register_structure_hook_func(
     16     lambda type_: hasattr(type_, 'structure'),
---> 17     lambda data, type_: type_.structure(data)  # type: ignore
     18 )
     19 converter.register_unstructure_hook_func(  # type: ignore
     20     lambda obj: hasattr(obj, 'unstructure'),
     21     lambda obj: obj.unstructure()  # type: ignore
     22 )
     25 converter.register_structure_hook(uuid.UUID, lambda data, type_: type_(data))  # type: ignore

File ~/.local/lib/python3.8/site-packages/toloka/client/primitives/base.py:284, in BaseTolokaObject.structure(cls, data)
    282     value = data.pop(key)
    283     if field.type is not None:
--> 284         value = converter.structure(value, field.type)
    286     kwargs[field.name] = value
    288 obj = cls(**kwargs)

File ~/.local/lib/python3.8/site-packages/cattrs/converters.py:281, in Converter.structure(self, obj, cl)
    278 def structure(self, obj: Any, cl: Type[T]) -> T:
    279     """Convert unstructured Python data structures to structured data."""
--> 281     return self._structure_func.dispatch(cl)(obj, cl)

File ~/.local/lib/python3.8/site-packages/cattrs/converters.py:531, in Converter._structure_optional(self, obj, union)
    529 other = union_params[0] if union_params[1] is NoneType else union_params[1]
    530 # We can't actually have a Union of a Union, so this is safe.
--> 531 return self._structure_func.dispatch(other)(obj, other)

File ~/.local/lib/python3.8/site-packages/toloka/client/_converter.py:17, in <lambda>(data, type_)
     10 from ..util._extendable_enum import ExtendableStrEnum
     13 converter = cattr.Converter()
     15 converter.register_structure_hook_func(
     16     lambda type_: hasattr(type_, 'structure'),
---> 17     lambda data, type_: type_.structure(data)  # type: ignore
     18 )
     19 converter.register_unstructure_hook_func(  # type: ignore
     20     lambda obj: hasattr(obj, 'unstructure'),
     21     lambda obj: obj.unstructure()  # type: ignore
     22 )
     25 converter.register_structure_hook(uuid.UUID, lambda data, type_: type_(data))  # type: ignore

File ~/.local/lib/python3.8/site-packages/toloka/client/filter.py:88, in FilterCondition.structure(cls, data)
     86     return FilterOr.structure(data)
     87 if 'and' in data:
---> 88     return FilterAnd.structure(data)
     89 else:
     90     return Condition.structure(data)

File ~/.local/lib/python3.8/site-packages/toloka/client/filter.py:144, in FilterAnd.structure(cls, data)
    142 @classmethod
    143 def structure(cls, data):
--> 144     return super(FilterCondition, cls).structure(data)

File ~/.local/lib/python3.8/site-packages/toloka/client/primitives/base.py:284, in BaseTolokaObject.structure(cls, data)
    282     value = data.pop(key)
    283     if field.type is not None:
--> 284         value = converter.structure(value, field.type)
    286     kwargs[field.name] = value
    288 obj = cls(**kwargs)

File ~/.local/lib/python3.8/site-packages/cattrs/converters.py:281, in Converter.structure(self, obj, cl)
    278 def structure(self, obj: Any, cl: Type[T]) -> T:
    279     """Convert unstructured Python data structures to structured data."""
--> 281     return self._structure_func.dispatch(cl)(obj, cl)

File ~/.local/lib/python3.8/site-packages/cattrs/converters.py:470, in Converter._structure_list(self, obj, cl)
    468             ix += 1
    469     if errors:
--> 470         raise IterableValidationError(
    471             f"While structuring {cl!r}", errors, cl
    472         )
    473 else:
    474     res = [handler(e, elem_type) for e in obj]

IterableValidationError: While structuring typing.List[toloka.client.filter.FilterCondition] (1 sub-exception)
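
The last line of the log only says "1 sub-exception" without showing it. A minimal sketch for digging out the underlying error, assuming the raised exception exposes its nested errors through an `exceptions` attribute the way exception groups do. The helper name `iter_leaf_exceptions` and the `FakeGroup` stand-in class are hypothetical and exist only for this illustration:

```python
from typing import Iterator


def iter_leaf_exceptions(exc: BaseException) -> Iterator[BaseException]:
    """Yield the innermost exceptions of a (possibly nested) exception group."""
    children = getattr(exc, "exceptions", None)
    if not children:
        # Not a group (or an empty one): this is a leaf.
        yield exc
        return
    for child in children:
        yield from iter_leaf_exceptions(child)


class FakeGroup(Exception):
    """Stand-in for a grouped validation error, used only for this demo."""

    def __init__(self, message, exceptions):
        super().__init__(message)
        self.exceptions = tuple(exceptions)


# Demo: a group nested inside another group.
demo = FakeGroup("outer", [FakeGroup("inner", [KeyError("key")]), ValueError("bad")])
leaves = list(iter_leaf_exceptions(demo))
for leaf in leaves:
    print(type(leaf).__name__, leaf)
```

In practice, the get_pool() call would be wrapped in try/except and the caught exception passed to the helper, which should reveal which filter condition failed to structure.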

alexdrydew commented 1 year ago

Hi! This feature currently does not have official support in the API. Still, we should probably support getting such pools while keeping unknown filters unstructured. Thank you for noticing this case! We'll look into this.
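
The idea of "keeping unknown filters unstructured" could look roughly like the sketch below. This is not the toloka-kit implementation: `KNOWN_KEYS`, `KnownCondition`, `structure_condition`, and `structure_and` are hypothetical names, and the "experiment" condition in the sample payload is an invented placeholder, not the real API format.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union

# Hypothetical registry of (category, key) pairs the client knows how to structure.
KNOWN_KEYS = {("profile", "country"), ("computed", "region_by_ip")}


@dataclass
class KnownCondition:
    category: str
    key: str
    operator: str
    value: Any


Structured = Union[KnownCondition, Dict[str, Any]]


def structure_condition(data: Dict[str, Any]) -> Structured:
    """Structure a recognised condition; return an unknown one unchanged."""
    if (data.get("category"), data.get("key")) in KNOWN_KEYS:
        return KnownCondition(
            data["category"], data["key"], data["operator"], data["value"]
        )
    # Unknown filter: keep the raw dict instead of raising.
    return data


def structure_and(data: Dict[str, Any]) -> List[Structured]:
    """Structure every item of an 'and' filter, tolerating unknown items."""
    return [structure_condition(item) for item in data["and"]]


raw_filter = {
    "and": [
        {"category": "profile", "key": "country", "operator": "EQ", "value": "DE"},
        # Invented placeholder for an unsupported condition.
        {"category": "experiment", "key": "group", "operator": "EQ", "value": "A"},
    ]
}
result = structure_and(raw_filter)
```

With this approach the known condition becomes a typed object, while the unsupported one survives as a plain dict and can be sent back to the API untouched.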

Pocoder commented 1 year ago

The issue was fixed and released in version 1.1.3.
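
A small sketch for checking whether an installed version already contains the fix. It assumes plain dotted numeric version strings such as "1.1.1"; `parse_version` and `has_fix` are hypothetical helper names:

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version like '1.1.3' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def has_fix(installed: str, fixed_in: str = "1.1.3") -> bool:
    """Return True if the installed version is at least the fixed version."""
    return parse_version(installed) >= parse_version(fixed_in)


print(has_fix("1.1.1"))
print(has_fix("1.1.3"))
```

The installed version string can be read with importlib.metadata.version("toloka-kit") on Python 3.8 and later.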

pakkron commented 1 year ago

Thanks a lot!