Open athenawisdoms opened 1 year ago
Hello all,
I am trying to train using the alpaca weights and the provided JSON dataset on an RTX 3090 with the command

python medalpaca/train.py \
    --model /path/to/alpaca-7b \
    --data_path medical_meadow_small.json \
    --output_dir 'output' \
    --train_in_8bit False \
    --use_lora False \
    --bf16 False \
    --tf32 False \
    --fp16 True \
    --global_batch_size 128 \
    --per_device_batch_size 4

but I encounter the error
NotImplementedError: Cannot copy out of meta tensor; no data!
Training works fine if use_lora is True.
Does anyone know how to solve this problem? Thanks :hugs:
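For context on why the full fine-tune configuration above (`--use_lora False`) might behave differently from the LoRA run, here is a rough back-of-the-envelope memory estimate. It is only a sketch under stated assumptions: a nominal 7 billion parameters for alpaca-7b, 2 bytes per weight and per gradient, and 8 bytes per parameter for Adam-style optimizer state, with activations ignored. The helper name `full_finetune_memory_gib` is hypothetical and not part of medAlpaca. Whether memory pressure is the actual cause of the meta tensor error is not confirmed by the logs; one possibility is that weights which do not fit get offloaded at load time and are left without data on the GPU.

```python
def full_finetune_memory_gib(n_params, weight_bytes=2, grad_bytes=2, optim_bytes=8):
    """Rough lower bound (in GiB) for full fine-tuning; ignores activations.

    The per-parameter byte counts are assumptions, not measured values.
    """
    total_bytes = n_params * (weight_bytes + grad_bytes + optim_bytes)
    return total_bytes / 2**30


N_PARAMS = 7_000_000_000  # assumed nominal parameter count for alpaca-7b
GPU_GIB = 24              # memory of a single RTX 3090

estimate = full_finetune_memory_gib(N_PARAMS)
print(f"estimated {estimate:.1f} GiB needed vs {GPU_GIB} GiB available")
```

Under these assumptions the estimate is several times larger than the card's memory, whereas LoRA only trains a small set of adapter weights, which would be consistent with the LoRA run working.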
2023-05-10 19:44:01.524862: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-10 19:44:01.992637: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /home/x/.local/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so
/home/x/.local/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:145: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/opt/anaconda3/envs/medalpaca/lib/libcudart.so'), PosixPath('/opt/anaconda3/envs/medalpaca/lib/libcudart.so.11.0')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get `CUDA error: invalid device function` errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
  warn(msg)
CUDA SETUP: CUDA runtime path found: /opt/anaconda3/envs/medalpaca/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /home/x/.local/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
WARNING:datasets.builder:Found cached dataset json (/home/x/.cache/huggingface/datasets/json/default-f720834aba59ff0a/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1145.98it/s]
WARNING:datasets.arrow_dataset:Loading cached split indices for dataset at /home/x/.cache/huggingface/datasets/json/default-f720834aba59ff0a/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4/cache-0d7eb55a8148f5f0.arrow and /home/x/.cache/huggingface/datasets/json/default-f720834aba59ff0a/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4/cache-378aa479480a6538.arrow
WARNING:datasets.arrow_dataset:Loading cached processed dataset at /home/x/.cache/huggingface/datasets/json/default-f720834aba59ff0a/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4/cache-4c32985cd8b5899a.arrow

Traceback (most recent call last):

/medAlpaca/medalpaca/train.py:286 in <module>

   283
   284
   285 if __name__ == "__main__":
 ❱ 286     fire.Fire(main)
   287

/home/x/.local/lib/python3.10/site-packages/fire/core.py:141 in Fire

   138     context.update(caller_globals)
   139     context.update(caller_locals)
   140
 ❱ 141 component_trace = _Fire(component, args, parsed_flag_args, context, name)
   142
   143 if component_trace.HasError():
   144     _DisplayError(component_trace)

/home/x/.local/lib/python3.10/site-packages/fire/core.py:475 in _Fire

   472     is_class = inspect.isclass(component)
   473
   474     try:
 ❱ 475         component, remaining_args = _CallAndUpdateTrace(
   476             component,
   477             remaining_args,
   478             component_trace,

/home/x/.local/lib/python3.10/site-packages/fire/core.py:691 in _CallAndUpdateTrace

   688     loop = asyncio.get_event_loop()
   689     component = loop.run_until_complete(fn(*varargs, **kwargs))
   690 else:
 ❱ 691     component = fn(*varargs, **kwargs)
   692
   693 if treatment == 'class':
   694     action = trace.INSTANTIATED_CLASS

/medAlpaca/medalpaca/train.py:254 in main

   251         **kwargs
   252     )
   253
 ❱ 254     trainer = Trainer(
   255         model=model,
   256         train_dataset=data["train"],
   257         eval_dataset=data["test"] if val_set_size > 0 else None,

/home/x/.local/lib/python3.10/site-packages/transformers/trainer.py:499 in __init__

   496         self.tokenizer = tokenizer
   497
   498         if self.place_model_on_device and not getattr(model, "is_loaded_in_8bit", False)
 ❱ 499             self._move_model_to_device(model, args.device)
   500
   501         # Force n_gpu to 1 to avoid DataParallel as MP will manage the GPUs
   502         if self.is_model_parallel:

/home/x/.local/lib/python3.10/site-packages/transformers/trainer.py:741 in _move_model_to_device

   738         self.callback_handler.remove_callback(callback)
   739
   740     def _move_model_to_device(self, model, device):
 ❱ 741         model = model.to(device)
   742         # Moving a model to an XLA device disconnects the tied weights, so we have to re
   743         if self.args.parallel_mode == ParallelMode.TPU and hasattr(model, "tie_weights")
   744             model.tie_weights()

/home/x/.local/lib/python3.10/site-packages/transformers/modeling_utils.py:1886 in to

  1883                 " model has already been set to the correct devices and casted to the co
  1884             )
  1885         else:
❱ 1886             return super().to(*args, **kwargs)
  1887
  1888     def half(self, *args):
  1889         # Checks if the model has been loaded in 8-bit

/home/x/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1145 in to

  1142                             non_blocking, memory_format=convert_to_format)
  1143             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else No
  1144
❱ 1145         return self._apply(convert)
  1146
  1147     def register_full_backward_pre_hook(
  1148         self,

/home/x/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:797 in _apply

   794
   795     def _apply(self, fn):
   796         for module in self.children():
 ❱ 797             module._apply(fn)
   798
   799         def compute_should_use_set_data(tensor, tensor_applied):
   800             if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):

[Previous frame repeated 4 more times]

/home/x/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:820 in _apply

   817             # track autograd history of `param_applied`, so we have to use
   818             # `with torch.no_grad():`
   819             with torch.no_grad():
 ❱ 820                 param_applied = fn(param)
   821             should_use_set_data = compute_should_use_set_data(param, param_applied)
   822             if should_use_set_data:
   823                 param.data = param_applied

/home/x/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1143 in convert

  1140             if convert_to_format is not None and t.dim() in (4, 5):
  1141                 return t.to(device, dtype if t.is_floating_point() or t.is_complex() els
  1142                             non_blocking, memory_format=convert_to_format)
❱ 1143             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else No
  1144
  1145         return self._apply(convert)
  1146

NotImplementedError: Cannot copy out of meta tensor; no data!
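The traceback fails inside `model.to(device)`, which suggests that at least one parameter is on the meta device (shape only, no data) by the time the `Trainer` is constructed. One way to narrow this down might be to list such parameters right after the model is loaded. The following is a minimal sketch, not part of medAlpaca or transformers: `find_meta_parameters` is a hypothetical helper, and the stand-in objects and parameter names are made up so the example runs without loading a 7B model. With a real PyTorch model, the same function could be called on the loaded model, since `named_parameters()` and `param.device.type` exist there.

```python
from types import SimpleNamespace


def find_meta_parameters(model):
    """Return the names of parameters whose device type is "meta"."""
    return [
        name
        for name, param in model.named_parameters()
        if param.device.type == "meta"
    ]


# Stand-in objects (assumptions for illustration only): they mimic just
# `named_parameters()` and `param.device.type` of a PyTorch model.
def _fake_param(device_type):
    return SimpleNamespace(device=SimpleNamespace(type=device_type))


fake_model = SimpleNamespace(
    named_parameters=lambda: [
        ("model.embed_tokens.weight", _fake_param("cuda")),
        ("model.layers.31.mlp.down_proj.weight", _fake_param("meta")),
    ]
)

meta_names = find_meta_parameters(fake_model)
print(meta_names)
```

If the list is non-empty for the real model, the weights were probably not fully materialized at load time, and the loading arguments in `train.py` would be the next place to look.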
absl-py==1.4.0 accelerate==0.18.0 addict==2.4.0 aenum==3.1.12 aiodns==3.0.0 aiofiles==23.1.0 aiohttp==3.8.4 aiohttp-retry==2.8.3 aiopg==1.4.0 aiosignal==1.3.1 aleph-alpha-client==3.0.0 altair==4.2.2 anthropic==0.2.6 antlr4-python3-runtime==4.9.3 anyio==3.6.2 appdirs==1.4.4 asttokens==2.2.1 astunparse==1.6.3 async-generator==1.10 async-timeout==4.0.2 attrs==23.1.0 Authlib==1.2.0 azure-cognitiveservices-speech==1.27.0 backcall==0.2.0 backoff==2.2.1 basicsr==1.4.2 bitsandbytes==0.38.1 black==23.3.0 bleach==6.0.0 blendmodes==2022 blis==0.7.9 boltons==23.0.0 boto3==1.26.109 botocore==1.29.109 cachetools==5.3.0 captcha-solver==0.1.5 catalogue==2.0.8 certifi==2022.12.7 cffi==1.15.1 cfgv==3.3.1 charset-normalizer==3.1.0 chromadb==0.3.21 clean-fid==0.1.29 click==8.1.3 click-log==0.4.0 clickhouse-connect==0.5.23 clip @ git+https://github.com/openai/CLIP.git@d50d76daa670286dd6cacf3bcd80b5e4823fc8e1 cmake==3.26.1 cohere==4.1.4 confection==0.0.4 contourpy==1.0.7 croniter==1.3.14 curl-cffi==0.5.5 cycler==0.11.0 cymem==2.0.7 dataclasses-json==0.5.7 datasets==2.12.0 decorator==5.1.1 deeplake==3.2.15 Deprecated==1.2.13 deprecation==2.1.0 dill==0.3.6 discord==2.2.3 discord.py==2.2.3 diskcache==5.6.1 distlib==0.3.6 dotty-dict==1.3.1 duckdb==0.7.1 duckduckgo-search==2.9.3 EbookLib==0.18 einops==0.4.1 elastic-transport==8.4.0 elasticsearch==8.7.0 email-validator==2.0.0.post2 en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.5.0/en_core_web_sm-3.5.0-py3-none-any.whl entrypoints==0.4 exceptiongroup==1.1.1 executing==1.2.0 exponent-server-sdk==2.0.0 facexlib==0.2.5 fairscale==0.4.13 faiss-cpu==1.7.3 fake-useragent==1.1.3 fastapi==0.94.0 ffmpy==0.3.0 filelock==3.11.0 filterpy==1.4.5 fire==0.5.0 Flask-Cors==3.0.10 flatbuffers==23.3.3 font-roboto==0.0.1 fonts==0.0.3 fonttools==4.39.3 frozenlist==1.3.3 fsspec==2023.4.0 ftfy==6.1.1 gast==0.5.3 gdown==4.7.1 gfpgan==1.3.8 gitdb==4.0.10 GitPython==3.1.30 google-api-core==2.11.0 
google-api-python-client==2.84.0 google-auth==2.17.2 google-auth-httplib2==0.1.0 google-auth-oauthlib==1.0.0 google-pasta==0.2.0 google-search-results==2.4.2 googleapis-common-protos==1.59.0 GoogleBard==1.0.0 gotrue==1.0.1 gpt4free==1.0.2 gradio==3.28.3 gradio_client==0.2.1 greenlet==2.0.1 grpcio==1.53.0 grpcio-tools==1.53.0 gTTS==2.3.2 gunicorn==20.1.0 h11==0.12.0 h5py==3.8.0 hnswlib==0.7.0 html2text==2020.1.16 httpcore==0.15.0 httptools==0.5.0 httpx==0.23.3 hub==3.0.1 huggingface-hub==0.13.4 humbug==0.3.1 identify==2.5.24 idna==2.10 imageio==2.27.0 inflection==0.5.1 InstructorEmbedding==1.0.0 invoke==1.7.3 ipython==8.13.2 jaraco.context==4.3.0 jax==0.4.8 jedi==0.18.2 Jinja2==3.1.2 jmespath==1.0.1 joblib==1.2.0 jsonlines==3.1.0 jsonmerge==1.8.0 jsonschema==4.17.3 keras==2.12.0 kiwisolver==1.4.4 kornia==0.6.7 langchain==0.0.136 langcodes==3.3.0 langid==1.1.6 lark==1.1.2 lazy_loader==0.2 libclang==16.0.0 lightning-utilities==0.8.0 linkify-it-py==2.0.0 lit==16.0.0 llama-index==0.5.11 llvmlite==0.39.1 lmdb==1.4.1 loguru==0.6.0 lpips==0.1.4 manifest-ml==0.1.2 Markdown==3.4.3 markdown-it-py==2.2.0 MarkupSafe==2.1.2 marshmallow==3.19.0 marshmallow-enum==1.5.1 matplotlib==3.7.1 matplotlib-inline==0.1.6 mdit-py-plugins==0.3.3 mdurl==0.1.2 ml-dtypes==0.0.4 mpmath==1.3.0 multidict==6.0.4 multiprocess==0.70.14 murmurhash==1.0.9 mypy-extensions==1.0.0 names==0.3.0 netmiko==4.2.0 networkx==3.1 nlpcloud==1.0.40 nltk==3.8.1 nodeenv==1.7.0 nomic==1.1.6 ntc-templates==3.3.0 numba==0.56.4 numcodecs==0.11.0 numpy==1.24.3 nvidia-cublas-cu11==11.10.3.66 nvidia-cuda-cupti-cu11==11.7.101 nvidia-cuda-nvrtc-cu11==11.7.99 nvidia-cuda-runtime-cu11==11.7.99 nvidia-cudnn-cu11==8.5.0.96 nvidia-cufft-cu11==10.9.0.58 nvidia-curand-cu11==10.2.10.91 nvidia-cusolver-cu11==11.4.0.1 nvidia-cusparse-cu11==11.7.4.91 nvidia-nccl-cu11==2.14.3 nvidia-nvtx-cu11==11.7.91 omegaconf==2.2.3 open-clip-torch @ git+https://github.com/mlfoundations/open_clip.git@bb6e834e9c70d9c27d0dc3ecedeebeaeb1ffad6b 
openai==0.27.4 openapi-schema-pydantic==1.2.4 opencv-python==4.7.0.72 opensearch-py==2.2.0 opt-einsum==3.3.0 orjson==3.8.10 outcome==1.2.0 packaging==23.0 pandas==2.0.1 paramiko==3.1.0 parso==0.8.3 pathos==0.3.0 pathspec==0.11.1 pathy==0.10.1 peft @ git+https://github.com/huggingface/peft.git@b1059b73aab9043b118ff19b0cf96263ea86248a pexpect==4.8.0 pgvector==0.1.6 pickleshare==0.7.5 piexif==1.1.3 Pillow==9.5.0 pinecone-client==2.2.1 pip-review==1.3.0 pkginfo==1.9.6 platformdirs==3.2.0 playsound==1.2.2 playwright==1.33.0 postgrest==0.10.6 posthog==3.0.1 pox==0.3.2 ppft==1.7.6.6 pre-commit==3.3.1 preshed==3.0.8 procrastinate==0.27.0 prompt-toolkit==3.0.38 protobuf==3.20.3 psutil==5.9.5 psycopg2-binary==2.9.6 ptyprocess==0.7.0 pure-eval==0.2.2 pyarrow==11.0.0 pyasn1-modules==0.2.8 pycares==4.3.0 pycparser==2.21 pycryptodome==3.17 pydantic==1.10.7 pydeck==0.8.1b0 pyDeprecate==0.3.2 pydub==0.25.1 pyee==9.0.4 PyGithub==1.58.1 Pygments==2.14.0 PyJWT==2.6.0 pyllamacpp==1.0.7 pymailtm==1.1.1 Pympler==1.0.1 pyparsing==3.0.9 PyPasser==0.0.5 pypdf==3.7.1 PyPDF2==3.0.1 pyre-extensions==0.0.30 pyrsistent==0.19.3 pyserial==3.5 PySocks==1.7.1 python-dateutil==2.8.2 python-dotenv==1.0.0 python-git==2018.2.1 python-gitlab==3.13.0 python-http-client==3.3.7 python-multipart==0.0.6 python-semantic-release==7.33.2 pytorch-lightning==1.7.6 pytz==2023.3 pytz-deprecation-shim==0.1.0.post0 PyWavelets==1.4.1 PyYAML==6.0 qdrant-client==1.1.3 random-username==1.0.2 readme-renderer==37.3 realesrgan==0.3.0 realtime==1.0.0 redis==4.5.4 regex==2023.3.23 requests==2.30.0 requests-oauthlib==1.3.1 requests-toolbelt==0.10.1 resize-right==0.0.2 responses==0.18.0 rfc3986==1.5.0 rich==13.3.3 rsa==4.9 runpod==0.9.3 s3transfer==0.6.0 scikit-image==0.19.2 scikit-learn==1.2.2 scp==0.14.5 selenium==4.9.0 semantic-version==2.10.0 semver==2.13.0 Send2Trash==1.8.2 sendgrid==6.10.0 sentence-transformers==2.2.2 sentencepiece==0.1.99 six==1.16.0 smart-open==6.3.0 smmap==5.0.0 sniffio==1.3.0 spacy==3.5.1 
spacy-legacy==3.0.12 spacy-loggers==1.0.4 SpeechRecognition==3.8.1 SQLAlchemy==1.4.47 sqlitedict==2.1.0 srsly==2.4.6 stack-data==0.6.2 starkbank-ecdsa==2.2.0 starlette==0.26.1 storage3==0.5.2 streamlit==1.21.0 StrEnum==0.4.10 supabase==1.0.3 supafunc==0.2.2 sympy==1.11.1 tb-nightly==2.13.0a20230413 tenacity==8.2.2 tensorboard==2.12.1 tensorboard-data-server==0.7.0 tensorboard-plugin-wit==1.8.1 tensorflow==2.12.0 tensorflow-estimator==2.12.0 tensorflow-hub==0.13.0 tensorflow-io-gcs-filesystem==0.32.0 tensorflow-text==2.12.0 termcolor==2.2.0 textfsm==1.1.3 thinc==8.1.9 threadpoolctl==3.1.0 tifffile==2023.3.21 tiktoken==0.3.3 timm==0.6.7 tls-client==0.2.1 tokenize-rt==5.0.0 tokenizers==0.13.3 toml==0.10.2 tomli==2.0.1 tomlkit==0.11.7 toolz==0.12.0 torch==2.0.0 torchdiffeq==0.2.3 torchmetrics==0.11.4 torchsde==0.2.5 torchvision==0.15.1 tqdm==4.65.0 tqdm-loggable==0.1.3 traitlets==5.9.0 trampoline==0.1.2 transformers @ git+https://github.com/huggingface/transformers.git@006da469dd5a465f4551f4245f780e3b1e92b76c trio==0.22.0 trio-websocket==0.10.2 triton==2.0.0 tweepy==4.14.0 twine==3.8.0 TwoCaptcha==0.0.1 typer==0.7.0 typing-inspect==0.8.0 typing_extensions==4.5.0 tzdata==2023.3 tzlocal==4.3 uc-micro-py==1.0.1 ujson==5.7.0 undetected-chromedriver==3.1.7 uritemplate==4.1.1 urllib3==1.26.15 uvicorn==0.21.1 uvloop==0.17.0 validators==0.20.0 virtualenv==20.21.0 wasabi==1.1.1 watchdog==3.0.0 watchfiles==0.19.0 wcwidth==0.2.6 weaviate-client==3.15.5 webdriver-manager==3.8.6 websockets==10.4 wikipedia==1.4.0 wolframalpha==5.0.0 wonderwords==2.2.0 wrapt==1.15.0 xmltodict==0.13.0 xxhash==3.2.0 yapf==0.32.0 yarl==1.8.2 zstandard==0.21.0