pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

`_torchtext` was not found in nightly release for windows 3.7 #1680

Closed ejguan closed 2 years ago

ejguan commented 2 years ago

🐛 Bug

Describe the bug

See the CI on TorchData https://github.com/pytorch/data/runs/6011121045?check_suite_focus=true

We installed TorchText version torchtext-0.13.0.dev20220413 for Python 3.7 on Windows. The test fails with: ModuleNotFoundError: No module named 'torchtext._torchtext'

To Reproduce Steps to reproduce the behavior:

Add a test for nightly release?
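
A nightly smoke test along these lines could catch the missing extension before release. This is a minimal sketch, not torchtext's actual test: the helper name `extension_available` is hypothetical, and it only checks that the import system can locate the compiled module, not that the module loads correctly.

```python
import importlib.util


def extension_available(package: str, extension: str) -> bool:
    """Return True if `package.extension` can be located by the import system.

    Hypothetical helper for illustration; it is not part of torchtext.
    """
    try:
        # find_spec imports the parent package, then searches it for the submodule.
        spec = importlib.util.find_spec(f"{package}.{extension}")
    except ModuleNotFoundError:
        # The parent package (or a module it imports) could not be found.
        return False
    return spec is not None


# Intended use in a smoke test (assumes torchtext is installed):
#     assert extension_available("torchtext", "_torchtext")
```

Running such a check on every platform and Python version in the nightly matrix would turn this kind of packaging problem into a build-time failure instead of a downstream one.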

Expected behavior

No error, as on the other platforms.

Environment

CI certifi-2021.10.8 charset-normalizer-2.0.12 colorama-0.4.4 idna-3.3 numpy-1.21.6 requests-2.27.1 torch-1.12.0.dev20220413+cpu torchtext-0.13.0.dev20220413 tqdm-4.64.0 typing-extensions-4.1.1 urllib3-1.26.9


mthrok commented 2 years ago

Corresponding CI failure on TorchText repo

Smoke test failure: https://app.circleci.com/pipelines/github/pytorch/text/5246/workflows/40ecc516-0f48-47e4-8f5b-fb5417701699/jobs/176575

Build: https://app.circleci.com/pipelines/github/pytorch/text/5246/workflows/40ecc516-0f48-47e4-8f5b-fb5417701699/jobs/176557

The final link command looks strange: it looks as if the name _torchtext is missing from the output path.

```
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\circleci\project\third_party\build\lib /LIBPATH:C:\Users\circleci\project\third_party\build\lib64 /LIBPATH:C:\tools\miniconda3\envs\env3.7\lib\site-packages\torch\lib /LIBPATH:C:\tools\miniconda3\envs\env3.7\libs /LIBPATH:C:\tools\miniconda3\envs\env3.7 /LIBPATH:C:\tools\miniconda3\envs\env3.7\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.22000.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.22000.0\um\x64" sentencepiece_train.lib sentencepiece.lib re2.lib double-conversion.lib c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit__torchtext C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\clip_tokenizer.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\common.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\gpt2_bpe_tokenizer.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\regex.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\regex_tokenizer.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\register_pybindings.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\register_torchbindings.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\sentencepiece.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\vectors.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\vocab.obj /OUT:build\lib.win-amd64-3.7\torchtext\pyd /IMPLIB:C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\pyd.lib
```

And setuptools is not picking up _torchtext:

```
creating torchtext.egg-info
writing torchtext.egg-info\PKG-INFO
writing dependency_links to torchtext.egg-info\dependency_links.txt
writing requirements to torchtext.egg-info\requires.txt
writing top-level names to torchtext.egg-info\top_level.txt
writing manifest file 'torchtext.egg-info\SOURCES.txt'
reading manifest file 'torchtext.egg-info\SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'torchtext.egg-info\SOURCES.txt'
Copying torchtext.egg-info to build\bdist.win-amd64\wheel\.\torchtext-0.13.0.dev20220413-py3.7.egg-info
running install_scripts
adding license file "LICENSE" (matched pattern "LICEN[CS]E*")
creating build\bdist.win-amd64\wheel\torchtext-0.13.0.dev20220413.dist-info\WHEEL
creating 'dist\torchtext-0.13.0.dev20220413-cp37-cp37m-win_amd64.whl' and adding 'build\bdist.win-amd64\wheel' to it
adding 'torchtext/__init__.py'
adding 'torchtext/_download_hooks.py'
adding 'torchtext/_extension.py'
adding 'torchtext/functional.py'
adding 'torchtext/pyd'
adding 'torchtext/transforms.py'
adding 'torchtext/utils.py'
adding 'torchtext/version.py'
adding 'torchtext/_internal/__init__.py'
adding 'torchtext/_internal/module_utils.py'
adding 'torchtext/data/__init__.py'
adding 'torchtext/data/datasets_utils.py'
adding 'torchtext/data/functional.py'
adding 'torchtext/data/metrics.py'
adding 'torchtext/data/utils.py'
adding 'torchtext/datasets/__init__.py'
adding 'torchtext/datasets/ag_news.py'
adding 'torchtext/datasets/amazonreviewfull.py'
adding 'torchtext/datasets/amazonreviewpolarity.py'
adding 'torchtext/datasets/cc100.py'
adding 'torchtext/datasets/conll2000chunking.py'
adding 'torchtext/datasets/dbpedia.py'
adding 'torchtext/datasets/enwik9.py'
adding 'torchtext/datasets/imdb.py'
adding 'torchtext/datasets/iwslt2016.py'
adding 'torchtext/datasets/iwslt2017.py'
adding 'torchtext/datasets/multi30k.py'
adding 'torchtext/datasets/penntreebank.py'
adding 'torchtext/datasets/sogounews.py'
adding 'torchtext/datasets/squad1.py'
adding 'torchtext/datasets/squad2.py'
adding 'torchtext/datasets/sst2.py'
adding 'torchtext/datasets/udpos.py'
adding 'torchtext/datasets/wikitext103.py'
adding 'torchtext/datasets/wikitext2.py'
adding 'torchtext/datasets/yahooanswers.py'
adding 'torchtext/datasets/yelpreviewfull.py'
adding 'torchtext/datasets/yelpreviewpolarity.py'
adding 'torchtext/experimental/__init__.py'
adding 'torchtext/experimental/transforms.py'
adding 'torchtext/experimental/vectors.py'
adding 'torchtext/experimental/vocab_factory.py'
adding 'torchtext/models/__init__.py'
adding 'torchtext/models/roberta/__init__.py'
adding 'torchtext/models/roberta/bundler.py'
adding 'torchtext/models/roberta/model.py'
adding 'torchtext/models/roberta/modules.py'
adding 'torchtext/nn/__init__.py'
adding 'torchtext/nn/modules/__init__.py'
adding 'torchtext/nn/modules/multiheadattention.py'
adding 'torchtext/vocab/__init__.py'
adding 'torchtext/vocab/vectors.py'
adding 'torchtext/vocab/vocab.py'
adding 'torchtext/vocab/vocab_factory.py'
adding 'torchtext-0.13.0.dev20220413.dist-info/LICENSE'
adding 'torchtext-0.13.0.dev20220413.dist-info/METADATA'
adding 'torchtext-0.13.0.dev20220413.dist-info/WHEEL'
adding 'torchtext-0.13.0.dev20220413.dist-info/top_level.txt'
adding 'torchtext-0.13.0.dev20220413.dist-info/RECORD'
```

I would expect setuptools to fail at this point, but somehow it did not.
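
Since the build itself did not fail, one option is to inspect the finished wheel. The sketch below assumes the extension should appear in the archive as `torchtext/_torchtext*.pyd` (or `.so` on other platforms); the function name `wheel_has_extension` and the default `stem` are illustrative choices, not part of any torchtext tooling.

```python
import zipfile
from typing import IO, Union


def wheel_has_extension(
    wheel: Union[str, IO[bytes]], stem: str = "torchtext/_torchtext"
) -> bool:
    """Return True if the wheel contains a file named `<stem>*.pyd` or `<stem>*.so`.

    Hypothetical check for illustration; `stem` is an assumed archive prefix.
    """
    # A wheel is a zip archive, so its member names can be listed directly.
    with zipfile.ZipFile(wheel) as archive:
        return any(
            name.startswith(stem) and name.endswith((".pyd", ".so"))
            for name in archive.namelist()
        )
```

For a wheel like the one in the packaging log above, which contains `torchtext/pyd` but no `torchtext/_torchtext*.pyd`, this check would return `False`.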


What is stranger is that this only happens on 3.7. It seems to be happening with Conda as well.

https://app.circleci.com/pipelines/github/pytorch/text/5246/workflows/40ecc516-0f48-47e4-8f5b-fb5417701699/jobs/176601

mthrok commented 2 years ago

Maybe the last warning is the answer.

```
C:\tools\miniconda3\envs\env3.7\lib\site-packages\wheel\bdist_wheel.py:87: RuntimeWarning: Config variable 'WITH_PYMALLOC' is unset, Python ABI tag may be incorrect
  sys.version_info < (3, 8))) \
```

Could CI be building a 3.8 extension with Python 3.7?
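
To test that hypothesis, the interpreter's own ABI-related settings could be printed inside the CI environment. This is only a diagnostic sketch: `abi_report` is a made-up name, and any of the config variables may be `None` on a given platform.

```python
import sys
import sysconfig
from importlib import machinery


def abi_report() -> dict:
    """Collect the values that determine extension file names for this interpreter."""
    return {
        "python": "%d.%d" % sys.version_info[:2],
        # Suffix the build is expected to append to extension modules (may be None).
        "EXT_SUFFIX": sysconfig.get_config_var("EXT_SUFFIX"),
        "SOABI": sysconfig.get_config_var("SOABI"),
        # The variable named in the warning; None means it is unset.
        "WITH_PYMALLOC": sysconfig.get_config_var("WITH_PYMALLOC"),
        # Suffixes the import system will actually accept.
        "import_suffixes": list(machinery.EXTENSION_SUFFIXES),
    }


if __name__ == "__main__":
    for key, value in abi_report().items():
        print(f"{key}: {value!r}")
```

Comparing this output between the 3.7 job and a passing job would show whether the two interpreters disagree about the extension suffix.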

ejguan commented 2 years ago

I'm not an expert on this topic, but looking at the build log, it seems _torchtext.pyd somehow became pyd.

See:

```
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\circleci\project\third_party\build\lib /LIBPATH:C:\Users\circleci\project\third_party\build\lib64 /LIBPATH:C:\tools\miniconda3\envs\env3.7\lib\site-packages\torch\lib /LIBPATH:C:\tools\miniconda3\envs\env3.7\libs /LIBPATH:C:\tools\miniconda3\envs\env3.7 /LIBPATH:C:\tools\miniconda3\envs\env3.7\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.22000.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.22000.0\um\x64" sentencepiece_train.lib sentencepiece.lib re2.lib double-conversion.lib c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit__torchtext C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\clip_tokenizer.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\common.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\gpt2_bpe_tokenizer.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\regex.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\regex_tokenizer.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\register_pybindings.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\register_torchbindings.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\sentencepiece.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\vectors.obj C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\vocab.obj /OUT:build\lib.win-amd64-3.7\torchtext\pyd /IMPLIB:C:\Users\circleci\project\build\temp.win-amd64-3.7\Release\Users\circleci\project\torchtext\csrc\pyd.lib
```

The OUT is build\lib.win-amd64-3.7\torchtext\pyd

And copying build\lib.win-amd64-3.7\torchtext\pyd -> build\bdist.win-amd64\wheel\.\torchtext
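
One way a name like `_torchtext.pyd` could collapse to `pyd` is if some build step strips an ABI tag by dropping the second-to-last dot-separated component, and the incoming file name has no ABI tag to begin with. The sketch below is purely illustrative: both functions are hypothetical, they are not taken from setuptools or PyTorch, and the thread does not confirm this as the cause.

```python
def strip_abi_tag(filename: str) -> str:
    """Hypothetical: drop the second-to-last dot-separated component.

    Works for 'name.<abi-tag>.pyd' but misbehaves when no tag is present.
    """
    parts = filename.split(".")
    return ".".join(parts[:-2] + parts[-1:])


def strip_abi_tag_safely(filename: str) -> str:
    """Hypothetical guarded variant: only strip when a tag component exists."""
    parts = filename.split(".")
    if len(parts) < 3:
        # Nothing between the module name and the extension, so leave it alone.
        return filename
    return ".".join(parts[:-2] + parts[-1:])


print(strip_abi_tag("_torchtext.cp37-win_amd64.pyd"))  # _torchtext.pyd
print(strip_abi_tag("_torchtext.pyd"))                 # pyd
print(strip_abi_tag_safely("_torchtext.pyd"))          # _torchtext.pyd
```

If something like this were happening, it would also fit the `WITH_PYMALLOC` warning: an interpreter that reports an unexpected extension suffix would feed an untagged name into the stripping step.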

mthrok commented 2 years ago

The extension name is processed by PyTorch's CppExtension at the moment, so it could be something on PyTorch core side.

https://github.com/pytorch/text/blob/main/build_tools/setup_helpers/extension.py#L168

However, CppExtension is a wrapper around setuptools, so the peculiarity of failing only on Windows with Python 3.7 could originate there, in which case there isn't much torchtext can fix. Also, @Nayef211 is migrating the build system to CMake, so this might not stay relevant for long. I'd wait to see what happens tomorrow, unless you are making changes that are sensitive to the OS. (In that case, I do not have a suggestion for the next step.)

Nayef211 commented 2 years ago

Yeah, I don't believe we've made any recent changes to torchtext's build system that would explain this failure, so I also suspect it is related to a change coming from PyTorch. The revamped build system using CMake is almost ready in https://github.com/pytorch/text/pull/1673. We're just waiting to resolve all Windows build failures before we land it.

mthrok commented 2 years ago

Note: I am working on the branch from @Nayef211's CMake build. I am observing a similar phenomenon, but there the CI job fails properly.

https://app.circleci.com/pipelines/github/pytorch/text/5250/workflows/b27e96e4-f791-4d84-9e37-55ec181ed2af/jobs/176668

error: can't copy 'build\lib.win-amd64-3.7\pyd': doesn't exist or not a regular file

```
[106/107] cmd.exe /C "cd /D C:\Users\circleci\project\build\temp.win-amd64-3.7\Release && C:\Users\circleci\project\env\Lib\site-packages\cmake\data\bin\cmake.exe -P cmake_install.cmake"
-- Install configuration: "Release"
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/re2/filtered_re2.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/re2/re2.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/re2/set.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/re2/stringpiece.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/re2.lib
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/cmake/re2/re2Config.cmake
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/cmake/re2/re2Config-release.cmake
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/double-conversion.lib
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/bignum.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/cached-powers.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/diy-fp.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/double-conversion.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/double-to-string.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/fast-dtoa.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/fixed-dtoa.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/ieee.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/string-to-double.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/strtod.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/double-conversion/utils.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/cmake/double-conversion/double-conversionConfig.cmake
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/cmake/double-conversion/double-conversionConfigVersion.cmake
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/cmake/double-conversion/double-conversionTargets.cmake
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/cmake/double-conversion/double-conversionTargets-release.cmake
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/sentencepiece.lib
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/lib/sentencepiece_train.lib
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/bin/spm_encode.exe
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/bin/spm_decode.exe
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/bin/spm_normalize.exe
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/bin/spm_train.exe
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/bin/spm_export_vocab.exe
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/sentencepiece_trainer.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/include/sentencepiece_processor.h
-- Installing: C:/Users/circleci/project/build/lib.win-amd64-3.7/torchtext/./_torchtext.pyd
C:\Users\circleci\project\env\lib\site-packages\setuptools\dist.py:534: UserWarning: The version specified ('') is an invalid version, this may not work as expected with newer versions of setuptools, pip, and PyPI. Please see PEP 440 for more details.
  "details." % version
C:\Users\circleci\project\env\lib\site-packages\setuptools\command\easy_install.py:147: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  EasyInstallDeprecationWarning,
C:\Users\circleci\project\env\lib\site-packages\setuptools\command\install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  setuptools.SetuptoolsDeprecationWarning,
error: can't copy 'build\lib.win-amd64-3.7\pyd': doesn't exist or not a regular file
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "C:\Users\circleci\project\env\lib\site-packages\colorama\ansitowin32.py", line 59, in closed
    return stream.closed
ValueError: underlying buffer has been detached
```

mthrok commented 2 years ago

@Nayef211 I think this can be closed now.