wkpark opened 9 months ago
The test_nvidia_transform[dims=2-transpose=F-orderOut=col32-orderA=row-int8-dim3=3-dim2=224-dim1=152]
tests may well be known failures (that should really be skips); see how the test has early exits at the start (which were touched recently in #1000).
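(To illustrate the "should be skips" point, here is a minimal sketch, not the actual bitsandbytes test code, of turning such an early return into an explicit skip:)

~~~python
import pytest

@pytest.mark.parametrize("dims", [2, 3])
@pytest.mark.parametrize("orderOut", ["col", "row", "col32"])
def test_transform_sketch(dims, orderOut):
    # An early `return` makes the unsupported combination count as a pass;
    # pytest.skip reports it honestly in the test summary instead.
    if dims == 3 and orderOut != "col32":
        pytest.skip("3D input is only supported with col32 output")
    assert dims in (2, 3)  # placeholder for the real transform assertions
~~~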
As for "tests\test_optim.py: system crash after test done": this is probably the same thing that manifested as dmesg errors for me on WSL. You might want to try running with -k "not (benchmark or slow)" to skip some heavy tests and see whether more of the suite passes :)
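(-k matches pytest "keywords", which include test names and marks; a tiny sketch with made-up test names:)

~~~python
# With `pytest -k "not (benchmark or slow)"`, the first test below is
# selected and the second is deselected, since its name matches "benchmark".
def test_quantize_roundtrip():
    assert True

def test_benchmark_large_matmul():
    assert True
~~~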
Thanks a lot for the report @wkpark !
Out of curiosity would you mind also running the transformers integration tests? 🙏
First, git clone https://github.com/huggingface/transformers.git
Then run: RUN_SLOW=1 pytest tests/quantization/bnb/test_4bit.py
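(For context, transformers gates its slow tests behind the RUN_SLOW environment variable, roughly along the lines of this sketch; this is an illustration, not the library's exact code:)

~~~python
import os
import unittest

def slow(test_case):
    # Sketch: skip the decorated test unless RUN_SLOW is set in the environment.
    run_slow = os.environ.get("RUN_SLOW", "0").lower() in ("1", "true", "yes")
    return unittest.skipUnless(run_slow, "test is slow")(test_case)
~~~

Note that the RUN_SLOW=1 prefix syntax is bash; on Windows cmd the variable is set separately with set RUN_SLOW=1, as in the transcript below.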
set RUN_SLOW=1
(venv) D:\src\transformers>python -m pytest tests\quantization\bnb\test_4bit.py
====================================================================================================== test session starts ======================================================================================================
platform win32 -- Python 3.10.11, pytest-7.4.2, pluggy-1.3.0
rootdir: D:\src\transformers
configfile: pyproject.toml
plugins: anyio-3.7.1, hydra-core-1.3.2, hypothesis-6.93.0, xdist-3.5.0
collected 39 items
tests\quantization\bnb\test_4bit.py ......F.FF....s...FF.FF..FFFFFFFFFFFFFF [100%]
Thanks a lot for running the tests !
Hmmm, I think you might not have installed transformers from source; can you try building transformers from source (pip install -e ".[dev]") and re-running the tests? 🙏
(venv) D:\src\transformers>pip show transformers
WARNING: Ignoring invalid distribution -afetensors (f:\webui\webui\stable-diffusion-webui\venv\lib\site-packages)
WARNING: Ignoring invalid distribution -itsandbytes (f:\webui\webui\stable-diffusion-webui\venv\lib\site-packages)
WARNING: Ignoring invalid distribution -orch (f:\webui\webui\stable-diffusion-webui\venv\lib\site-packages)
WARNING: Ignoring invalid distribution -rotobuf (f:\webui\webui\stable-diffusion-webui\venv\lib\site-packages)
Name: transformers
Version: 4.38.0.dev0
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: f:\webui\webui\stable-diffusion-webui\venv\lib\site-packages
Editable project location: D:\src\transformers
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: groundingdino, image-reward, lycoris-lora
(venv) D:\src\transformers>python -m pytest tests\quantization\bnb\test_4bit.py
====================================================================================================== test session starts ======================================================================================================
platform win32 -- Python 3.10.11, pytest-7.4.2, pluggy-1.3.0
rootdir: D:\src\transformers
configfile: pyproject.toml
plugins: anyio-3.7.1, hydra-core-1.3.2, hypothesis-6.93.0, xdist-3.5.0
collected 39 items
tests\quantization\bnb\test_4bit.py ..............s...........FFFFFFFFFFF.. [100%]
Interesting! The great news is that only the serialization tests are failing. Can you try updating accelerate (pip install -U accelerate)? This might fix the failing tests.
Can you in addition to that run the 8bit tests? 🙏 RUN_SLOW=1 pytest tests/quantization/bnb/test_mixed_int8.py
After updating accelerate, the test_serialization tests passed!
(venv) >pip show accelerate
Name: accelerate
Version: 0.26.1
Summary: Accelerate
Home-page: https://github.com/huggingface/accelerate
Author: The HuggingFace team
Author-email: sylvain@huggingface.co
License: Apache
Location: f:\webui\webui\stable-diffusion-webui\venv\lib\site-packages
Requires: huggingface-hub, numpy, packaging, psutil, pyyaml, safetensors, torch
Required-by: image-reward
(venv) D:\src\transformers>python -m pytest tests\quantization\bnb\test_4bit.py -k "test_serialization"
====================================================================================================== test session starts ======================================================================================================
platform win32 -- Python 3.10.11, pytest-7.4.2, pluggy-1.3.0
rootdir: D:\src\transformers
configfile: pyproject.toml
plugins: anyio-3.7.1, hydra-core-1.3.2, hypothesis-6.93.0, xdist-3.5.0
collected 39 items / 35 deselected / 4 selected
tests\quantization\bnb\test_4bit.py .... [100%]
======================================================================================================= warnings summary ========================================================================================================
F:\webui\webui\stable-diffusion-webui\venv\lib\site-packages\_pytest\config\__init__.py:1373
F:\webui\webui\stable-diffusion-webui\venv\lib\site-packages\_pytest\config\__init__.py:1373: PytestConfigWarning: Unknown config option: doctest_glob
self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")
tests/quantization/bnb/test_4bit.py::BaseSerializationTest::test_serialization
F:\webui\webui\stable-diffusion-webui\venv\lib\site-packages\torch\_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
tests/quantization/bnb/test_4bit.py::BaseSerializationTest::test_serialization
tests/quantization/bnb/test_4bit.py::ExtendedSerializationTest::test_serialization
tests/quantization/bnb/test_4bit.py::BloomSerializationTest::test_serialization
tests/quantization/bnb/test_4bit.py::GPTSerializationTest::test_serialization
D:\src\transformers\src\transformers\quantizers\auto.py:147: UserWarning: You passed `quantization_config` or equivalent parameters to `from_pretrained` but the model you're loading already has a `quantization_config` attribute. The `quantization_config` from the model will be prevail.
warnings.warn(warning_msg)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================================= 4 passed, 35 deselected, 6 warnings in 46.51s =========================================================================================
Testing test_4bit.py again:
(venv) D:\src\transformers>python -m pytest tests\quantization\bnb\test_4bit.py
====================================================================================================== test session starts ======================================================================================================
platform win32 -- Python 3.10.11, pytest-7.4.2, pluggy-1.3.0
rootdir: D:\src\transformers
configfile: pyproject.toml
plugins: anyio-3.7.1, hydra-core-1.3.2, hypothesis-6.93.0, xdist-3.5.0
collected 39 items
tests\quantization\bnb\test_4bit.py ..............s........................ [100%]
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================================================== 38 passed, 1 skipped, 33 warnings in 441.37s (0:07:21) =====================================================================================
all tests passed for 4bit!!😎
Short test summary info for the mixed_int8 tests:
(venv) D:\src\transformers>python -m pytest tests\quantization\bnb\test_mixed_int8.py
====================================================================================================== test session starts ======================================================================================================
platform win32 -- Python 3.10.11, pytest-7.4.2, pluggy-1.3.0
rootdir: D:\src\transformers
configfile: pyproject.toml
plugins: anyio-3.7.1, hydra-core-1.3.2, hypothesis-6.93.0, xdist-3.5.0
collected 43 items
tests\quantization\bnb\test_mixed_int8.py .....................sssss...FF..FFFF......
(snip)...
=========================================================================================================== FAILURES ============================================================================================================
____________________________________________________________________________________________ MixedInt8GPT2Test.test_generate_quality ____________________________________________________________________________________________
self = <bnb.test_mixed_int8.MixedInt8GPT2Test testMethod=test_generate_quality>

    def test_generate_quality(self):
        r"""
        Test the generation quality of the quantized model and see that we are matching the expected output.
        Given that we are operating on small numbers + the testing model is relatively small, we might not get
        the same output across GPUs. So we'll generate few tokens (5-10) and check their output.
        """
        encoded_input = self.tokenizer(self.input_text, return_tensors="pt")
        output_sequences = self.model_8bit.generate(input_ids=encoded_input["input_ids"].to(0), max_new_tokens=10)
>       self.assertIn(self.tokenizer.decode(output_sequences[0], skip_special_tokens=True), self.EXPECTED_OUTPUTS)
E       AssertionError: 'Hello my name is John Doe, and I am a member of the' not found in {"Hello my name is John Doe, and I'm a big fan of", "Hello my name is John Doe, and I'm a fan of the"}

tests\quantization\bnb\test_mixed_int8.py:264: AssertionError
----------------------------------------------------------------------------------------------------- Captured stderr call ------------------------------------------------------------------------------------------------------
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
(snip)...
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================================================================== short test summary info ====================================================================================================
FAILED tests/quantization/bnb/test_mixed_int8.py::MixedInt8GPT2Test::test_generate_quality - AssertionError: 'Hello my name is John Doe, and I am a member of the' not found in {"Hello my name is John Doe, and I'm a big fan of", "Hello my name is John Doe, and I'm a fan of the"}
FAILED tests/quantization/bnb/test_mixed_int8.py::MixedInt8GPT2Test::test_generate_quality_config - AssertionError: 'Hello my name is John Doe, and I am a member of the' not found in {"Hello my name is John Doe, and I'm a big fan of", "Hello my name is John Doe, and I'm a fan of the"}
FAILED tests/quantization/bnb/test_mixed_int8.py::MixedInt8GPT2Test::test_int8_from_pretrained - AssertionError: 'Hello my name is John Doe, and I am a member of the' not found in {"Hello my name is John Doe, and I'm a big fan of", "Hello my name is John Doe, and I'm a fan of the"}
FAILED tests/quantization/bnb/test_mixed_int8.py::MixedInt8GPT2Test::test_int8_serialization - AssertionError: 'Hello my name is John Doe, and I am a member of the' not found in {"Hello my name is John Doe, and I'm a big fan of", "Hello my name is John Doe, and I'm a fan of the"}
FAILED tests/quantization/bnb/test_mixed_int8.py::MixedInt8GPT2Test::test_int8_serialization_regression - AssertionError: 'Hello my name is John Doe, and I am a member of the' not found in {"Hello my name is John Doe, and I'm a big fan of", "Hello my name is John Doe, and I'm a fan of the"}
FAILED tests/quantization/bnb/test_mixed_int8.py::MixedInt8GPT2Test::test_int8_serialization_sharded - AssertionError: 'Hello my name is John Doe, and I am a member of the' not found in {"Hello my name is John Doe, and I'm a big fan of", "Hello my name is John Doe, and I'm a fan of the"}
=============================================================================== 6 failed, 32 passed, 5 skipped, 19 warnings in 720.00s (0:11:59) ================================================================================
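(Side note: the captured stderr above flags load_in_8bit/load_in_4bit as deprecated in favor of an explicit BitsAndBytesConfig. A minimal sketch of the recommended pattern using the public transformers API, with gpt2 as an arbitrary example model:)

~~~python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Pass the quantization settings explicitly instead of the deprecated flags.
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "gpt2",
    quantization_config=quantization_config,
    device_map="auto",  # requires accelerate
)
~~~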
AMAZING @wkpark ! 🎉 For the 8bit tests, the quality tests are expected not to pass; don't worry about them.
> Can you in addition to that run the 8bit tests? 🙏 RUN_SLOW=1 pytest tests/quantization/bnb/test_mixed_int8.py

For the record, RUN_SLOW is not a thing for this repository – I added pytest marks for that. Slow tests are run by default, but you can opt out via -k "not slow".
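(Roughly what that looks like, as an illustrative sketch rather than the actual bnb test code; marks count as pytest keywords, so -k "not slow" deselects tests carrying them:)

~~~python
import pytest

@pytest.mark.slow
@pytest.mark.benchmark
def test_expensive_kernel_benchmark():
    # Runs by default; deselected by `pytest -k "not (benchmark or slow)"`.
    assert True
~~~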
@akx thanks ! I meant RUN_SLOW for the transformers repository, not for the bnb repository (I think you meant the slow tests for bnb, no?)
I was able to build with CUDA 12.0 and run the tests on Windows.
Hardware:
- CPU: i7-12700H
- GPU: RTX 3060 Mobile

Software:
- OS: Windows 11
- MSVC: 19.38.33134 (VC++ Toolset 14.38.33130)
- CMake: 3.27.2-msvc1
- CUDA Toolkit: 12.0.140
- NVIDIA Driver: 546.65
- Python: 3.11.6
- PyTorch: 2.2.0+cu121
- Transformers: 4.37.2
Build configuration:
CMAKE_BUILD_TYPE=Release
BUILD_CUDA=ON
NO_CUBLASLT=OFF
CUDA_VERSION=120
COMPUTE_CAPABILITY=50;52;53;60;61;62;70;72;75;80;86;87;89;90
PTXAS_VERBOSE=OFF
I've observed the same crash on the tests in test_optim. When I skip those tests and the benchmark/slow ones, here is my result:
2901 passed
24 failed
- tests/test_functional.py:533 test_vector_quant[dim3=56-dim2=80-dim1=12]
- tests/test_functional.py:2155 test_gemv_4bit[uint8-bf16-fc2-nf4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[uint8-bf16-fc2-nf4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[uint8-bf16-fc2-fp4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[uint8-bf16-fc2-fp4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[fp16-fp16-fc2-fp4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[fp16-bf16-fc2-nf4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[fp16-bf16-fc2-nf4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[fp16-bf16-fc2-fp4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[fp16-bf16-fc2-fp4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[bf16-fp16-fc2-nf4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[bf16-fp16-fc2-fp4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[bf16-fp16-attn-nf4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[bf16-bf16-fc2-nf4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[bf16-bf16-fc2-nf4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[bf16-bf16-fc2-fp4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[bf16-bf16-fc2-fp4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[bf16-bf16-attn-nf4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[fp32-fp16-fc2-nf4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[fp32-fp16-fc2-fp4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[fp32-bf16-fc2-nf4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[fp32-bf16-fc2-nf4-DQ_False]
- tests/test_functional.py:2155 test_gemv_4bit[fp32-bf16-fc2-fp4-DQ_True]
- tests/test_functional.py:2155 test_gemv_4bit[fp32-bf16-fc2-fp4-DQ_False]
9 skipped
25 deselected
As for the optimizer tests, these complete with 3 failures and 2 skips prior to a crash:
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp32-opt=adam] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp32-opt=momentum] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp32-opt=rmsprop] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp32-opt=paged_adamw] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp32-opt=paged_adam] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp32-opt=lion] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp32-opt=paged_lion] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp16-opt=adam] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp16-opt=momentum] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp16-opt=rmsprop] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp16-opt=paged_adamw] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp16-opt=paged_adam] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp16-opt=lion] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-fp16-opt=paged_lion] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-bf16-opt=adam] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-bf16-opt=momentum] s
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-bf16-opt=rmsprop] s
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-bf16-opt=paged_adamw] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-bf16-opt=paged_adam] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-bf16-opt=lion] ✓
tests\test_optim.py::test_optimizer32bit[dim2=32-dim1=1024-bf16-opt=paged_lion] ✓
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp32-opt=adam] ✓
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp32-opt=momentum] ✓
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp32-opt=rmsprop] ✓
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp32-opt=paged_adamw] ⨯
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp32-opt=paged_adam] ⨯
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp32-opt=lion] ✓
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp32-opt=paged_lion] ⨯
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp16-opt=adam] ✓
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp16-opt=momentum] ✓
tests\test_optim.py::test_optimizer32bit[dim2=1024-dim1=1024-fp16-opt=rmsprop] ✓
Are those tests only failing due to slight deviations from the tolerances? If so, that's expected: those tests are unfortunately quite flaky (something we'll work on fixing soon).
In that case, we could close this issue and be super happy that this whole Windows journey went so well! Thanks again to anyone involved, especially @wkpark and @matthewdouglas ❤️
@Titus-von-Koeller Yes, the failures were related to some tolerances and the stochastic nature of some of the tests. I get similar results on my Linux machine.
I do think the crash on the 32bit optimizer tests was related to the 6GB vRAM that I have on my Windows machine. It seems those tests need closer to ~12GB to run. Stabilizing these tests is a good separate issue across platforms, but I think we're good closing this one.
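(For what it's worth, a minimal sketch of the kind of tolerance comparison these tests make; the rtol/atol values are illustrative, not the ones in the bnb suite:)

~~~python
import torch

# A quantized result is compared against a reference within rtol/atol bounds;
# borderline quantization error on random inputs makes such checks flaky.
ref = torch.randn(1024)
out = ref + 1e-3 * torch.randn(1024)  # simulated quantization error
torch.testing.assert_close(out, ref, rtol=1e-2, atol=1e-2)
~~~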
System Info
- OS: Windows 10
- Python: 3.10
- Torch: 2.1.2
- GPU: 4060 Ti 16GB
- CUDA: 11.8
- bitsandbytes: latest snapshot
Reproduction
This is just a report on current Windows support.
Expected behavior
This is a test result of:
- tests\test_functional.py: 31 failed, 592 passed, 9 skipped in 767.86s (0:12:47)
- tests\test_autograd.py: 2240 passed, 704 warnings in 119.18s (0:01:59)
- tests\test_linear4bit.py: 32 passed in 2.90s
- tests\test_linear8bitlt.py: 18 passed in 14.60s
- tests\test_optim.py: system crash after tests are done. (about 19 errors, collected 177 items)

Details

~~~bash
================================ FAILURES ================================
____ test_nvidia_transform[dims=2-transpose=F-orderOut=col32-orderA=row-int8-dim3=3-dim2=224-dim1=152] ____

dim1 = 152, dim2 = 224, dim3 = 3, dims = 2, dtype = torch.int8, orderA = 'row', orderOut = 'col32', transpose = False

    @pytest.mark.parametrize("dim1", get_test_dims(2, 256, n=2), ids=id_formatter("dim1"))
    @pytest.mark.parametrize("dim2", get_test_dims(2, 256, n=2), ids=id_formatter("dim2"))
    @pytest.mark.parametrize("dim3", get_test_dims(2, 256, n=2), ids=id_formatter("dim3"))
    @pytest.mark.parametrize("dtype", [torch.int8, torch.int32], ids=describe_dtype)
    @pytest.mark.parametrize("orderA", ["row"], ids=id_formatter("orderA"))
    @pytest.mark.parametrize("orderOut", ["col", "row", "col32"], ids=id_formatter("orderOut"))
    @pytest.mark.parametrize("transpose", [False], ids=id_formatter("transpose"))
    @pytest.mark.parametrize("dims", [2, 3], ids=id_formatter("dims"))
    def test_nvidia_transform(dim1, dim2, dim3, dims, dtype, orderA, orderOut, transpose):
        if dims == 3 and orderOut != "col32":
            return
        if dtype == torch.int32 and orderOut != "col32":
            return

        try:
            func = F.get_transform_func(dtype, orderA, orderOut, transpose)
        except ValueError as ve:
            pytest.skip(str(ve))  # skip if not supported

        if dims == 2:
            A = torch.randint(-128, 127, size=(dim1, dim2), device="cuda").to(dtype)
        elif dims == 3:
            A = torch.randint(-128, 127, size=(dim1, dim2, dim3), device="cuda").to(dtype)

        out, S = F.nvidia_transform(A, to_order=orderOut)

        if orderOut == "row":
            torch.testing.assert_close(A.flatten(), out.flatten())
        elif orderOut == "col":
            torch.testing.assert_close(A.t().flatten(), out.flatten())
        elif orderOut == "col32":
            if dims == 2:
                n = A.shape[0] * (A.shape[1] + (32 - (A.shape[1] % 32)))
            elif dims == 3:
                n = A.shape[0] * A.shape[1] * (A.shape[2] + (32 - (A.shape[2] % 32)))
>           assert out.numel() == n
E           AssertionError: assert 34048 == 38912
E            +  where 34048 =
~~~

- test_nvidia_transform: 8 failed, 88 passed, 536 deselected in 11.29s
- test_gemv_4bit: 23 failed, 169 passed, 440 deselected in 615.68s (0:10:15)
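(A quick check of the numbers in the test_nvidia_transform failure above suggests the test's expected-size formula over-pads when dim2 is already a multiple of 32; a small illustrative calculation, not a claim about the intended kernel behavior:)

~~~python
# Values taken from the assertion in the log above.
dim1, dim2 = 152, 224

n_expected_by_test = dim1 * (dim2 + (32 - dim2 % 32))  # 224 % 32 == 0, so 152 * 256 = 38912
n_actual = dim1 * dim2                                 # 152 * 224 = 34048, matching out.numel()

print(n_expected_by_test, n_actual)  # 38912 34048
~~~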