Closed fredrik-jansson-se closed 6 months ago
Thanks for reporting this! Can reproduce this on my end and will have a look into it.
Hi Matthias,
thank you! Sorry I can't be of much help myself, I'm a total ML newbie.
Hi @fredrik-jansson-se, it seems that for some reason the operator is no longer implemented for sparse tensors (it was in the past). I will try to dig deeper into this later. To get you unblocked, you can flip sparse to False here: https://github.com/pytorch/serve/blob/cd52683c2e9e334a6e9eaf6985f7c8cf545f5cbe/examples/text_classification/model.py#L22
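A minimal sketch of what the suggested workaround looks like (the model structure here is an assumption, mirroring the shape of examples/text_classification/model.py rather than copying it): with sparse=False the EmbeddingBag produces dense gradients, so clip_grad_norm_ never hits the missing SparseCPU kernel for aten::_foreach_norm.

```python
import torch
from torch import nn

class TextClassificationModel(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_class):
        super().__init__()
        # sparse=True is what triggers the NotImplementedError in torch 2.2.0;
        # flipping it to False keeps all gradients dense.
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim, sparse=False)
        self.fc = nn.Linear(embed_dim, num_class)

    def forward(self, text, offsets):
        return self.fc(self.embedding(text, offsets))

model = TextClassificationModel(vocab_size=100, embed_dim=8, num_class=4)
# Two "bags" of tokens: indices [1, 2] and [3, 4], delimited by offsets.
out = model(torch.tensor([1, 2, 3, 4]), torch.tensor([0, 2]))
out.sum().backward()
# Gradient clipping now succeeds because no gradient is sparse.
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), 0.1)
```

Note that sparse=False trades away the memory savings of sparse embedding gradients, which is usually fine for a vocabulary of this size.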
Awesome, thank you!
🐛 Describe the bug
Trying to run the example as described in the README:
python3 run_script.py
Error logs
Traceback (most recent call last):
  File "/Users/frja/dev/machine-learning/pytorch-org-tutorial/serve/examples/text_classification/train.py", line 143, in <module>
    train(train_dataloader, model, optimizer, criterion, epoch)
  File "/Users/frja/dev/machine-learning/pytorch-org-tutorial/serve/examples/text_classification/train.py", line 49, in train
    torch.nn.utils.clip_grad_norm_(model.parameters(), 0.1)
  File "/Users/frja/.local/share/virtualenvs/pytorch-org-tutorial-fA8BV59V/lib/python3.12/site-packages/torch/nn/utils/clip_grad.py", line 55, in clip_grad_norm_
    norms.extend(torch._foreach_norm(grads, norm_type))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Could not run 'aten::_foreach_norm.Scalar' with arguments from the 'SparseCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_foreach_norm.Scalar' is only available for these backends: [CPU, MPS, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
CPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterCPU.cpp:31357 [kernel] MPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:75 [backend fallback] Meta: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/MetaFallbackKernel.cpp:23 [backend fallback] BackendSelect: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback] Python: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:154 [backend fallback] FuncTorchDynamicLayerBackMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:498 [backend fallback] Functionalize: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:324 [backend fallback] Named: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback] Conjugate: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ConjugateFallback.cpp:17 [backend fallback] Negative: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/NegateFallback.cpp:19 [backend fallback] ZeroTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback] ADInplaceOrView: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:86 [backend fallback] AutogradOther: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradCUDA: registered at 
/Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradHIP: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradXLA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradMPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradIPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradXPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradHPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradVE: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradLazy: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradMTIA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradPrivateUse1: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradPrivateUse2: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradPrivateUse3: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] AutogradMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd 
kernel] AutogradNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel] Tracer: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:17346 [kernel] AutocastCPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:378 [backend fallback] AutocastCUDA: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:244 [backend fallback] FuncTorchBatched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:720 [backend fallback] BatchedNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:746 [backend fallback] FuncTorchVmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback] Batched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback] VmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback] FuncTorchGradWrapper: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:203 [backend fallback] PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:162 [backend fallback] FuncTorchDynamicLayerFrontMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:494 [backend fallback] PreDispatch: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:166 [backend fallback] PythonDispatcher: registered at 
/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:158 [backend fallback]
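For reference, a minimal sketch of what appears to be happening (assumed from the traceback, not taken from the example code): an EmbeddingBag with sparse=True yields sparse gradients, and in torch 2.2.0 clip_grad_norm_ internally calls torch._foreach_norm, which has no kernel registered for the SparseCPU backend.

```python
import torch

# A sparse embedding layer produces a sparse gradient after backward().
emb = torch.nn.EmbeddingBag(10, 4, sparse=True)
out = emb(torch.tensor([1, 2, 3]), torch.tensor([0]))
out.sum().backward()
print(emb.weight.grad.is_sparse)  # the sparse layout is what trips the kernel lookup

# On torch 2.2.0 this raises the NotImplementedError from the log above;
# other releases may behave differently, hence the guard.
try:
    torch.nn.utils.clip_grad_norm_(emb.parameters(), 0.1)
except NotImplementedError as e:
    print("reproduced:", type(e).__name__)
```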
Traceback (most recent call last):
  File "/Users/frja/dev/machine-learning/pytorch-org-tutorial/serve/examples/text_classification/run_script.py", line 7, in <module>
    subprocess.run(cmd, shell=True, check=True)
  File "/opt/homebrew/Cellar/python@3.12/3.12.2/Frameworks/Python.framework/Versions/3.12/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python train.py AG_NEWS --device cpu --save-model-path model.pt --dictionary source_vocab.pt' returned non-zero exit status 1.
Installation instructions
pip3 show torchserve
Name: torchserve
Version: 0.9.0
Summary: TorchServe is a tool for serving neural net models for inference
Home-page: https://github.com/pytorch/serve.git
Author: PyTorch Serving team
Author-email: noreply@noreply.com
License: Apache License Version 2.0
Location: /Users/frja/.local/share/virtualenvs/pytorch-org-tutorial-fA8BV59V/lib/python3.12/site-packages
Requires: packaging, Pillow, psutil, wheel
Required-by:

pip3 show torch
Name: torch
Version: 2.2.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /Users/frja/.local/share/virtualenvs/pytorch-org-tutorial-fA8BV59V/lib/python3.12/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: torchdata, torchtext, torchvision

pip3 show torchtext
Name: torchtext
Version: 0.16.2
Summary: Text utilities, models, transforms, and datasets for PyTorch.
Home-page: https://github.com/pytorch/text
Author: PyTorch Text Team
Author-email: packages@pytorch.org
License: BSD
Location: /Users/frja/.local/share/virtualenvs/pytorch-org-tutorial-fA8BV59V/lib/python3.12/site-packages
Requires: numpy, requests, torch, torchdata, tqdm
Required-by:
Model Packaging
n/a
config.properties
No response
Versions
torchserve --version
Removing orphan pid file.
TorchServe Version is 0.9.0
Repro instructions
requirements.txt:
certifi==2024.2.2; python_version >= '3.6'
charset-normalizer==3.3.2; python_full_version >= '3.7.0'
enum-compat==0.0.3
filelock==3.13.1; python_version >= '3.8'
fsspec==2024.2.0; python_version >= '3.8'
idna==3.6; python_version >= '3.5'
jinja2==3.1.3; python_version >= '3.7'
markupsafe==2.1.5; python_version >= '3.7'
mpmath==1.3.0
networkx==3.2.1; python_version >= '3.9'
numpy==1.26.4; python_version >= '3.9'
packaging==23.2; python_version >= '3.7'
pillow==10.2.0; python_version >= '3.8'
portalocker==2.8.2; python_version >= '3.8'
psutil==5.9.8; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3, 3.4, 3.5'
pyaml==23.12.0; python_version >= '3.8'
pyyaml==6.0.1; python_version >= '3.6'
requests==2.31.0; python_version >= '3.7'
sympy==1.12; python_version >= '3.8'
torch==2.2.0; python_full_version >= '3.8.0'
torch-model-archiver==0.9.0
torch-workflow-archiver==0.2.11
torchdata==0.7.1; python_version >= '3.8'
torchserve==0.9.0
torchtext==0.16.2; python_version >= '3.8'
torchvision==0.17.0; python_version >= '3.8'
tqdm==4.66.2; python_version >= '3.7'
typing-extensions==4.9.0; python_version >= '3.8'
urllib3==2.2.0; python_version >= '3.8'
wheel==0.42.0; python_version >= '3.7'
cd serve/examples/text_classification
python3 run_script.py
Possible Solution
No response