explosion / srsly

🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)
MIT License

TypeError: 'escape_forward_slashes' is an invalid keyword argument for this function, when loading SpaCy 3.5 model (Python 3.8) #89

Open zafercavdar opened 1 year ago

zafercavdar commented 1 year ago

Since https://github.com/explosion/srsly/issues/83 is closed, I'm opening it again.

The code to reproduce:

bash:
python3 -m spacy download it_core_news_sm
python3:
import spacy

spacy.load("it_core_news_sm")

The full trace

nlp = spacy.load(model_name, exclude=excluded_pipes)
  File "/opt/conda/default/lib/python3.8/site-packages/spacy/__init__.py", line 54, in load
    return util.load_model(
  File "/opt/conda/default/lib/python3.8/site-packages/spacy/util.py", line 442, in load_model
    return load_model_from_package(name, **kwargs)  # type: ignore[arg-type]
  File "/opt/conda/default/lib/python3.8/site-packages/spacy/util.py", line 478, in load_model_from_package
    return cls.load(vocab=vocab, disable=disable, enable=enable, exclude=exclude, config=config)  # type: ignore[attr-defined]
  File "/opt/conda/default/lib/python3.8/site-packages/it_core_news_sm/__init__.py", line 10, in load
    return load_model_from_init_py(__file__, **overrides)
  File "/opt/conda/default/lib/python3.8/site-packages/spacy/util.py", line 659, in load_model_from_init_py
    return load_model_from_path(
  File "/opt/conda/default/lib/python3.8/site-packages/spacy/util.py", line 516, in load_model_from_path
    nlp = load_model_from_config(
  File "/opt/conda/default/lib/python3.8/site-packages/spacy/util.py", line 564, in load_model_from_config
    nlp = lang_cls.from_config(
  File "/opt/conda/default/lib/python3.8/site-packages/spacy/language.py", line 1781, in from_config
    interpolated = filled.interpolate() if not filled.is_interpolated else filled
  File "/opt/conda/default/lib/python3.8/site-packages/confection/__init__.py", line 196, in interpolate
    return Config().from_str(self.to_str())
  File "/opt/conda/default/lib/python3.8/site-packages/confection/__init__.py", line 419, in to_str
    flattened.set(section_name, key, try_dump_json(value, node))
  File "/opt/conda/default/lib/python3.8/site-packages/confection/__init__.py", line 503, in try_dump_json
    raise ConfigValidationError(config=data, desc=err_msg) from e
confection.ConfigValidationError: 

Config validation error
Couldn't serialize config value of type <class 'NoneType'>: 'escape_forward_slashes' is an invalid keyword argument for this function. Make sure all values in your config are JSON-serializable. If you want to include Python objects, use a registered function that returns the object instead.

adrianeboyd commented 1 year ago

From the two reports, this sounds like something can go wrong with the srsly installation in conda. I'm not sure whether it's the installation itself or the srsly package.

What does conda list show for srsly?

If you create a brand new conda env and run only:

conda install -c conda-forge spacy
spacy download it_core_news_sm

can you load the model in the new env?

zafercavdar commented 1 year ago

Actually, I didn't use conda to install spacy or srsly. I have a base environment and I'm using poetry to manage all packages. Poetry doesn't create a new environment; instead it uses conda's base env. I'm trying with conda now. (By the way, spacy 3.3.1 on Python 3.7 was working fine. This problem started with Python 3.8. Now it doesn't matter whether I use spacy 3.3.1 or 3.5.1: both fail with the same error message.)

adrianeboyd commented 1 year ago

What does conda list show?

And if you create a new conda env and reinstall your dependencies with poetry, what does conda list show?

I don't know for sure, but my best guess is that different install steps have installed srsly and something didn't get uninstalled or upgraded cleanly or there's an incompatible mix of pip and conda packages.

If you want to keep the current base environment, you might try being really sure that you've uninstalled srsly before working with your project:

python -m pip uninstall -y srsly
python -m pip uninstall -y srsly
conda uninstall -y srsly
conda uninstall -y srsly

and then start your project install (however you install spacy) again?
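
To see whether more than one install is in play, it can help to compare what the package metadata says with what Python would actually import. The following is only a minimal sketch using the standard library (Python 3.8+); the helper names `describe_install` and `import_origin` are made up for illustration and are not part of srsly or spaCy.

```python
from importlib import metadata
from importlib.util import find_spec


def describe_install(dist_name):
    """Return version and install location for a distribution, or None if absent."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return {"version": dist.version, "location": str(dist.locate_file(""))}


def import_origin(module_name):
    """Return the file a top-level import would load, without importing it."""
    spec = find_spec(module_name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Hypothetical list of packages worth checking for this error.
    for name in ("srsly", "ujson", "spacy", "confection"):
        print(name, describe_install(name), import_origin(name))
```

If the location reported for srsly and the import origin point at different site-packages directories, that would be consistent with a leftover or mixed pip/conda install.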

zafercavdar commented 1 year ago

conda list doesn't show srsly before I install spacy, but it had ujson 5.7.0 installed before spacy was installed. I uninstalled it, but I get the same error again. The same code works fine on a macOS env with Python 3.8 in Docker, but doesn't work in a Debian 10 conda environment. Does this repository depend on the Cython or GCC version? How do you run C code in a Python environment?

adrianeboyd commented 1 year ago

Please copy and paste the exact output of conda list in the environment where this problem happens.

vishhvak commented 1 year ago

A hack that worked for me was to just edit the _json_api.py file under srsly to not have the escape_forward_slashes keyword (i.e., remove it from the line with the ujson.dumps() function call). If anyone needs a quick fix and doesn't rely on that keyword for their functionality, feel free to edit the file as a workaround lol.

adrianeboyd commented 1 year ago

It really sounds like this might be a problem with a particular srsly package or a particular combination of pip/conda packages. You shouldn't need to hack source files in srsly for things to work, and srsly should be using its own vendored ujson, so this is all very unexpected and confusing.

Can anyone provide more details about their environment, in particular the output of conda list that shows more details about the conda packages. For example, the output for srsly and ujson from conda list could look like this:

srsly                     2.4.6           py311ha397e9f_0    conda-forge
ujson                     5.7.0           py311ha397e9f_0    conda-forge
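
Besides conda list, a direct probe can show whether the ujson that srsly imports accepts the keyword from the error message. This is a rough sketch, assuming srsly exposes its vendored copy as `srsly.ujson` (as described above); the helper names `accepts_kwargs` and `probe_srsly_ujson` are hypothetical.

```python
import json


def accepts_kwargs(dumps, **kwargs):
    """Return True if dumps(...) accepts the given keyword arguments."""
    try:
        dumps({"path": "a/b"}, **kwargs)
    except TypeError:
        return False
    return True


def probe_srsly_ujson():
    """Return (module file, accepts escape_forward_slashes), or None if srsly is missing."""
    try:
        from srsly import ujson  # assumption: the copy srsly is expected to vendor
    except ImportError:
        return None
    return getattr(ujson, "__file__", None), accepts_kwargs(
        ujson.dumps, escape_forward_slashes=False
    )


if __name__ == "__main__":
    print("srsly.ujson:", probe_srsly_ujson())
    # The standard-library encoder has no such option, so this prints False.
    print("json:", accepts_kwargs(json.dumps, escape_forward_slashes=False))
```

If the probe reports False for `srsly.ujson`, the reported module file shows which copy was actually loaded in the failing environment.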

nhm-7 commented 1 year ago

I got the same error:

Config validation error Couldn't serialize config value of type <class 'NoneType'>: 'escape_forward_slashes' is an invalid keyword argument for this function. Make sure all values in your config are JSON-serializable. If you want to include Python objects, use a registered function that returns the object instead. {'train': None, 'dev': None, 'vectors': None, 'init_tok2vec': None}

The output of the conda list command:

srsly                     2.4.5                    pypi_0    pypi
ujson                     5.7.0                    pypi_0    pypi

I tried to install the 2.4.6 version, but it didn't work either.

Also, I have the following libraries installed:

stanza                    1.2.3                    pypi_0    pypi
spacy                     3.2.2                    pypi_0    pypi

adrianeboyd commented 1 year ago

Thanks for the info! Several of us have tried to reproduce this in a conda environment with no luck, so I'm afraid we're still not quite sure what's going on.

As far as I know the version of ujson vendored in srsly has always included escape_forward_slashes.

Can you try running the srsly test suite?

python -m pip install -r https://raw.githubusercontent.com/explosion/srsly/master/requirements.txt
python -m pytest --pyargs srsly

nhm-7 commented 1 year ago

Done! This is the output after running the tests: [image]

adrianeboyd commented 1 year ago

Thanks! So the srsly install looks fine and the problem must be somewhere else.

What about the confection tests:

python -m pip install -r https://raw.githubusercontent.com/explosion/confection/main/requirements.txt
python -m pytest --pyargs confection

And if those pass, then the spacy tests:

python -m pip install -r https://raw.githubusercontent.com/explosion/spacy/master/requirements.txt
python -m pytest --pyargs spacy

And the conda list output for confection and spacy?

BennoKrojer commented 1 year ago

I am having the exact same issue right now! Will do the test you mentioned for confection

adrianeboyd commented 1 year ago

I hope we can get to the bottom of this. Any details about your environment and the code you're running would be appreciated!

adrianeboyd commented 1 year ago

If there's nothing confidential about your environment, maybe the output of conda env export could help us reproduce this?

nhm-7 commented 1 year ago

[image]

I've got this error.

conda env export:

name: rec-env
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - ca-certificates=2022.10.11=h06a4308_0
  - certifi=2022.9.24=py39h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.3=he6710b0_2
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - ncurses=6.3=h5eee18b_3
  - openssl=1.1.1q=h7f8727e_0
  - pip=21.2.4=py39h06a4308_0
  - python=3.9.13=haa1d7c7_2
  - readline=8.2=h5eee18b_0
  - sqlite=3.39.3=h5082296_0
  - tk=8.6.12=h1ccaba5_0
  - tzdata=2022e=h04d1e81_0
  - wheel=0.37.1=pyhd3eb1b0_0
  - xz=5.2.6=h5eee18b_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
    - absl-py==1.3.0
    - aiohttp==3.8.3
    - aiosignal==1.2.0
    - asttokens==2.1.0
    - async-timeout==4.0.2
    - attrs==22.1.0
    - backcall==0.2.0
    - blis==0.7.9
    - cachetools==5.2.0
    - catalogue==2.0.8
    - charset-normalizer==2.1.1
    - click==8.1.3
    - colorama==0.4.4
    - cycler==0.11.0
    - cymem==2.0.7
    - cython==0.29.32
    - cytoolz==0.12.0
    - decorator==5.1.1
    - exceptiongroup==1.1.1
    - executing==1.2.0
    - fairscale==0.4.3
    - filelock==3.8.0
    - fonttools==4.38.0
    - frozenlist==1.3.1
    - fsspec==2022.10.0
    - future==0.18.2
    - google-auth==2.14.0
    - google-auth-oauthlib==0.4.6
    - grpcio==1.50.0
    - huggingface-hub==0.10.1
    - idna==3.4
    - ijson==3.1.4
    - imageio==2.22.3
    - importlib-metadata==5.0.0
    - iniconfig==2.0.0
    - ipdb==0.13.9
    - ipython==8.6.0
    - jedi==0.18.1
    - jellyfish==0.9.0
    - jinja2==3.1.2
    - joblib==1.2.0
    - kiwisolver==1.4.4
    - langcodes==3.3.0
    - lightning-utilities==0.7.1
    - markdown==3.4.1
    - markupsafe==2.1.1
    - matplotlib==3.5.1
    - matplotlib-inline==0.1.6
    - mock==2.0.0
    - multidict==6.0.2
    - murmurhash==1.0.9
    - mypy==0.982
    - mypy-extensions==1.0.0
    - networkx==2.8.8
    - numpy==1.22.3
    - nvidia-cublas-cu11==11.10.3.66
    - nvidia-cuda-nvrtc-cu11==11.7.99
    - nvidia-cuda-runtime-cu11==11.7.99
    - nvidia-cudnn-cu11==8.5.0.96
    - oauthlib==3.2.2
    - packaging==21.3
    - pandas==1.5.1
    - parso==0.8.3
    - pathy==0.6.2
    - pbr==5.11.1
    - pexpect==4.8.0
    - pickleshare==0.7.5
    - pillow==9.1.0
    - pluggy==1.0.0
    - preshed==3.0.8
    - prompt-toolkit==3.0.32
    - protobuf==3.19.6
    - psutil==5.9.5
    - ptyprocess==0.7.0
    - pure-eval==0.2.2
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - pydantic==1.8.2
    - pydeprecate==0.3.1
    - pygments==2.13.0
    - pyparsing==3.0.9
    - pyphen==0.13.0
    - pytest==7.3.1
    - pytest-timeout==1.4.2
    - python-dateutil==2.8.2
    - pytorch-lightning==1.5.2
    - pytz==2022.6
    - pywavelets==1.4.1
    - pyyaml==6.0
    - regex==2022.10.31
    - requests==2.28.1
    - requests-oauthlib==1.3.1
    - rsa==4.9
    - sacremoses==0.0.53
    - scikit-image==0.19.3
    - scikit-learn==1.1.3
    - scipy==1.8.0
    - setuptools==59.5.0
    - six==1.16.0
    - smart-open==5.2.1
    - spacy==3.2.2
    - spacy-legacy==3.0.10
    - spacy-loggers==1.0.3
    - spacy-stanza==1.0.0
    - srsly==2.4.6
    - stack-data==0.6.0
    - stanza==1.2.3
    - tensorboard==2.10.1
    - tensorboard-data-server==0.6.1
    - tensorboard-plugin-wit==1.8.1
    - textacy==0.11.0
    - thinc==8.0.17
    - threadpoolctl==3.1.0
    - tifffile==2022.10.10
    - timm==0.6.7
    - tokenizers==0.10.3
    - toml==0.10.2
    - tomli==2.0.1
    - toolz==0.12.0
    - torch==1.9.1+cu111
    - torchaudio==0.9.1
    - torchmetrics==0.10.2
    - torchvision==0.10.1+cu111
    - tqdm==4.64.1
    - traitlets==5.5.0
    - transformers==4.12.3
    - typer==0.4.2
    - typing-extensions==4.4.0
    - ujson==5.7.0
    - urllib3==1.26.12
    - wasabi==0.10.1
    - wcwidth==0.2.5
    - werkzeug==2.2.2
    - wget==3.2
    - yarl==1.8.1
    - zipp==3.10.0

Guys, I don't know why, but now it's working. Seriously, I don't know what is happening. I only ran these commands:

python -m pip install -r https://raw.githubusercontent.com/explosion/confection/main/requirements.txt
python -m pytest --pyargs confection

Then I ran the conda export. Now when I run my experiments, everything works as usual.

nhm-7 commented 1 year ago

Hi again guys, it's failing again. I've just turned on my computer and rerun my experiments; here's the traceback again:

[image]

adrianeboyd commented 1 year ago

Thanks, I could recreate the same conda env but I'm afraid I still couldn't reproduce this error.

The next time it happens, could you check if changing your current working directory makes a difference? It's a bit of a stab in the dark, but maybe there's a particular local Python file/module that's affecting this?
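
One way to test the working-directory idea is to look for local files that share a name with the modules involved, then rerun from a different directory. A minimal sketch, assuming that a shadowing `.py` file or package directory in the current directory is the thing to look for; the function name `find_shadowing` and the list of suspect names are made up for illustration.

```python
import os
from pathlib import Path


def find_shadowing(directory, module_names):
    """List files or packages in `directory` that would shadow the given module names."""
    hits = []
    for name in module_names:
        candidates = (
            Path(directory) / f"{name}.py",          # a single-file module
            Path(directory) / name / "__init__.py",  # a regular package
        )
        for candidate in candidates:
            if candidate.exists():
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    suspects = ["srsly", "ujson", "json", "confection", "spacy"]
    print("cwd:", os.getcwd())
    print("shadowing candidates:", find_shadowing(os.getcwd(), suspects))
    # os.chdir("/tmp")  # uncomment to rerun the failing code from another directory
```

From a shell, running `cd` to another directory before starting Python has the same effect as the `os.chdir` call.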

nhm-7 commented 1 year ago

Thanks for your reply. How do I change my current working directory? I'm using WSL 2 and Ubuntu 18.04 inside VS Code. I'm thinking about re-creating my conda environment to check whether that's the problem. What do you think?

AdriianSWSY commented 1 year ago

Guys, I faced the same problem when I tried to run my app inside Docker. I did some research and I think the problem is in ffmpeg. Try running: apt -y update && apt -y install ffmpeg. This command fixed the issue for me. Good luck.

adrianeboyd commented 1 year ago

Thanks for the note! I can't figure out how ffmpeg would affect ujson, but maybe this indicates that the problem is somehow related to system libraries with the same symbols and not problems with the python module/packages directly. Would it be possible for you to share the basic configuration from your Dockerfile?

EarningsCall commented 1 year ago

vishhvak wrote: "A hack that worked for me was to just edit the _json_api.py file under srsly to not have the escape_forward_slashes keyword (i.e., remove it from the line with the ujson.dumps() function call). If anyone needs a quick fix that doesn't use that keyword for their functionality, feel free to edit the file as a workaround lol."

This is what worked for me as well. Here's how I did it with a few commands:

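# Locate the site-packages directory that contains srsly
SITE_PACKAGES=$( pip show srsly | grep '^Location:' | sed 's/^Location: //' )
JSON_API_FILE="$SITE_PACKAGES/srsly/_json_api.py"
# Remove the escape_forward_slashes keyword from the ujson.dumps() call, in place
sed -i 's/, escape_forward_slashes=False//g' "$JSON_API_FILE"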
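
To preview what that substitution does before touching the installed file, the same replacement can be sketched in Python on a string. The sample line below is only an illustration of what the `ujson.dumps()` call might look like, not a verified copy of srsly's source, and `strip_escape_kwarg` is a hypothetical helper name.

```python
def strip_escape_kwarg(source):
    """Remove the ', escape_forward_slashes=False' argument, as the sed command does."""
    return source.replace(", escape_forward_slashes=False", "")


if __name__ == "__main__":
    # Illustrative line only; check the real file before patching it.
    line = "return ujson.dumps(data, indent=indent, escape_forward_slashes=False)"
    print(strip_escape_kwarg(line))
```

Note that this only hides the symptom: with the keyword removed, whichever ujson is being loaded will fall back to its own default handling of forward slashes.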