Closed bltpanda closed 1 year ago
Building the base image fails with an error. What is the minimum amount of memory required?
ImportError: /usr/local/lib/python3.8/dist-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block
Full stack trace:
=> [internal] load build definition from Dockerfile.base 0.0s
=> => transferring dockerfile: 696B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for nvcr.io/nvidia/pytorch:22.12-py3 1.9s
=> [1/5] FROM nvcr.io/nvidia/pytorch:22.12-py3@sha256:09a80f272dd173c9d8 0.0s
=> CACHED [2/5] RUN pip config set global.index-url https://pypi.tuna.ts 0.0s
=> CACHED [3/5] WORKDIR /app 0.0s
=> CACHED [4/5] RUN echo -e 'from transformers import AutoTokenizer, Aut 0.0s
=> ERROR [5/5] RUN python /get-models.py && rm -rf /get-models.py 2.2s
[5/5] RUN python /get-models.py && rm -rf /get-models.py:
#8 1.924 Traceback (most recent call last):
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/sklearn/__check_build/__init__.py", line 44, in <module>
#8 1.924     from ._check_build import check_build  # noqa
#8 1.924 ImportError: /usr/local/lib/python3.8/dist-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block
#8 1.924
#8 1.924 During handling of the above exception, another exception occurred:
#8 1.924
#8 1.924 Traceback (most recent call last):
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1126, in _get_module
#8 1.924     return importlib.import_module("." + module_name, self.__name__)
#8 1.924   File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
#8 1.924     return _bootstrap._gcd_import(name[level:], package, level)
#8 1.924   File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
#8 1.924   File "<frozen importlib._bootstrap>", line 991, in _find_and_load
#8 1.924   File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
#8 1.924   File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
#8 1.924   File "<frozen importlib._bootstrap_external>", line 848, in exec_module
#8 1.924   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/pipelines/__init__.py", line 61, in <module>
#8 1.924     from .document_question_answering import DocumentQuestionAnsweringPipeline
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/pipelines/document_question_answering.py", line 29, in <module>
#8 1.924     from .question_answering import select_starts_ends
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/pipelines/question_answering.py", line 8, in <module>
#8 1.924     from ..data import SquadExample, SquadFeatures, squad_convert_examples_to_features
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/data/__init__.py", line 26, in <module>
#8 1.924     from .metrics import glue_compute_metrics, xnli_compute_metrics
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/data/metrics/__init__.py", line 18, in <module>
#8 1.924     if is_sklearn_available():
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 565, in is_sklearn_available
#8 1.924     return is_scipy_available() and importlib.util.find_spec("sklearn.metrics")
#8 1.924   File "/usr/lib/python3.8/importlib/util.py", line 94, in find_spec
#8 1.924     parent = __import__(parent_name, fromlist=['__path__'])
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/sklearn/__init__.py", line 81, in <module>
#8 1.924     from . import __check_build  # noqa: F401
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/sklearn/__check_build/__init__.py", line 46, in <module>
#8 1.924     raise_build_error(e)
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/sklearn/__check_build/__init__.py", line 31, in raise_build_error
#8 1.924     raise ImportError("""%s
#8 1.924 ImportError: /usr/local/lib/python3.8/dist-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block
#8 1.924 ___
#8 1.924 Contents of /usr/local/lib/python3.8/dist-packages/sklearn/__check_build:
#8 1.924 _check_build.cpython-38-aarch64-linux-gnu.so__init__.py __pycache__
#8 1.924 setup.py
#8 1.924 ___
#8 1.924 It seems that scikit-learn has not been built correctly.
#8 1.924
#8 1.924 If you have installed scikit-learn from source, please do not forget
#8 1.924 to build the package before using it: run `python setup.py install` or
#8 1.924 `make` in the source directory.
#8 1.924
#8 1.924 If you have used an installer, please check that it is suited for your
#8 1.924 Python version, your operating system and your platform.
#8 1.924
#8 1.924 The above exception was the direct cause of the following exception:
#8 1.924
#8 1.924 Traceback (most recent call last):
#8 1.924   File "/get-models.py", line 1, in <module>
#8 1.924     from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
#8 1.924   File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1116, in __getattr__
#8 1.924     module = self._get_module(self._class_to_module[name])
#8 1.924   File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1128, in _get_module
#8 1.924     raise RuntimeError(
#8 1.924 RuntimeError: Failed to import transformers.pipelines because of the following error (look up to see its traceback):
#8 1.924 /usr/local/lib/python3.8/dist-packages/sklearn/__check_build/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block
#8 1.924 ___
#8 1.924 Contents of /usr/local/lib/python3.8/dist-packages/sklearn/__check_build:
#8 1.924 _check_build.cpython-38-aarch64-linux-gnu.so__init__.py __pycache__
#8 1.924 setup.py
#8 1.924 ___
#8 1.924 It seems that scikit-learn has not been built correctly.
#8 1.924
#8 1.924 If you have installed scikit-learn from source, please do not forget
#8 1.924 to build the package before using it: run `python setup.py install` or
#8 1.924 `make` in the source directory.
#8 1.924
#8 1.924 If you have used an installer, please check that it is suited for your
#8 1.924 Python version, your operating system and your platform.
executor failed running [/bin/sh -c python /get-models.py && rm -rf /get-models.py]: exit code: 1
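The trace shows the error being raised while libgomp is loaded for sklearn, in the middle of the transformers import. To narrow it down, it may help to import each package on its own in a fresh interpreter inside the image. The following is only a rough sketch: `try_import` and `is_static_tls_error` are hypothetical helper names made up for illustration, not part of transformers, sklearn, or this repository.

```python
import subprocess
import sys


def try_import(module_name):
    """Import module_name in a fresh interpreter.

    Returns (ok, last_stderr_line). A separate process is used so that
    libraries already loaded in the current process cannot affect the result.
    """
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    lines = result.stderr.strip().splitlines()
    return result.returncode == 0, (lines[-1] if lines else "")


def is_static_tls_error(message):
    """True if the message contains the loader error seen in the trace."""
    return "cannot allocate memory in static TLS block" in message
```

For example, running `try_import("sklearn")` and then `try_import("transformers")` inside the container, and passing each returned message to `is_static_tls_error`, would show whether sklearn fails by itself or only when it is pulled in through transformers.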
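In my recent tests and builds, the machines I used mostly had 32/64+ (GB of memory), but in theory it should work fine if you enable swap. (For reference, see how some models are processed in a streaming fashion.)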
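To compare a build host against the figures above, total RAM and swap can be read from `/proc/meminfo` on Linux. This is a minimal sketch, assuming the usual `Name:   value kB` layout of that file; `parse_meminfo` and `ram_and_swap_gib` are hypothetical helper names, and no minimum requirement is implied by the code.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Sizes in /proc/meminfo are reported in kB (kibibytes); lines that do
    not look like "Name: <number> ..." are skipped.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def ram_and_swap_gib(text):
    """Return (total RAM, total swap) in GiB; missing fields count as 0."""
    info = parse_meminfo(text)
    kib_per_gib = 1024 * 1024
    return (
        info.get("MemTotal", 0) / kib_per_gib,
        info.get("SwapTotal", 0) / kib_per_gib,
    )
```

On the build machine, something like `ram_and_swap_gib(open("/proc/meminfo").read())` would give both numbers; a swap value of 0 means no swap is enabled.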
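P.S. Some users have gotten it running on M1/M2 without modifying the code, so memory is probably not the most important factor here; this should be solvable.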