Closed susery closed 1 year ago
This is my Python package list:
Package Version
aiofiles 23.1.0
aiohttp 3.8.4
aiosignal 1.3.1
alembic 1.10.3
altair 4.2.2
anyio 3.6.2
appdirs 1.4.4
async-timeout 4.0.2
attrs 22.2.0
banal 1.0.6
bitsandbytes 0.38.1
bottle 0.12.25
bz2file 0.98
certifi 2022.12.7
charset-normalizer 2.1.1
click 8.1.3
cmake 3.25.0
contourpy 1.0.7
cpm-kernels 1.0.11
cupy-cuda12x 12.0.0
cycler 0.11.0
datasets 2.11.0
dill 0.3.6
docker-pycreds 0.4.0
entrypoints 0.4
faiss-gpu 1.7.2
fastapi 0.95.0
fastrlock 0.8.1
ffmpy 0.3.0
filelock 3.9.0
fonttools 4.39.3
frozenlist 1.3.3
fsspec 2023.4.0
gitdb 4.0.10
GitPython 3.1.31
gradio 3.25.0
gradio_client 0.1.0
greenlet 2.0.2
h11 0.14.0
httpcore 0.17.0
httpx 0.24.0
huggingface-hub 0.13.4
idna 3.4
Jinja2 3.1.2
jsonschema 4.17.3
kiwisolver 1.4.4
latex2mathml 3.75.2
linkify-it-py 2.0.0
lit 15.0.7
loguru 0.7.0
Mako 1.2.4
Markdown 3.4.3
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib 3.7.1
mdit-py-plugins 0.3.3
mdtex2html 1.2.0
mdurl 0.1.2
mpmath 1.2.1
multidict 6.0.4
multiprocess 0.70.14
netifaces 0.11.0
networkx 3.0
numpy 1.24.1
orjson 3.8.10
packaging 23.1
pandas 2.0.0
pathtools 0.1.2
Pillow 9.3.0
pip 23.0.1
prompt-toolkit 3.0.38
protobuf 4.22.3
psutil 5.9.4
pyarrow 11.0.0
pydantic 1.10.7
pydub 0.25.1
pyparsing 3.0.9
pyrsistent 0.19.3
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2023.3
PyYAML 6.0
regex 2023.3.23
requests 2.28.1
responses 0.18.0
rwkv 0.7.3
semantic-version 2.10.0
sentencepiece 0.1.98
sentry-sdk 1.19.1
setproctitle 1.3.2
setuptools 65.5.0
six 1.16.0
smmap 5.0.0
sniffio 1.3.0
SQLAlchemy 1.4.47
SQLAlchemy-Utils 0.41.0
starlette 0.26.1
sympy 1.11.1
tokenizers 0.13.3
toolz 0.12.0
torch 2.0.0+cu118
torchaudio 2.0.1+cu118
torchsummary 1.5.1
torchvision 0.15.1+cu118
tqdm 4.65.0
transformers 4.27.1
triton 2.0.0
typing_extensions 4.4.0
tzdata 2023.3
uc-micro-py 1.0.1
urllib3 1.26.13
uvicorn 0.21.1
wandb 0.14.2
wcwidth 0.2.6
websockets 11.0.1
Whoosh 2.7.4
xxhash 3.2.0
yarl 1.8.2
zstandard 0.20.0
GPT-NeoXT-Chat-Base-20B is not executing on a single-GPU machine
bash training/finetune_GPT-NeoXT-Chat-Base-20B.sh
Traceback (most recent call last):
  File "/home/xtr/git_data/OpenChatKit/training/dist_clm_train.py", line 358, in <module>
    main()
  File "/home/xtr/git_data/OpenChatKit/training/dist_clm_train.py", line 275, in main
    init_communicators(args)
  File "/home/xtr/git_data/OpenChatKit/training/comm/comm_utils.py", line 85, in init_communicators
    default_init(args)
  File "/home/xtr/git_data/OpenChatKit/training/comm/comm_utils.py", line 81, in default_init
    dist.init_process_group(backend='gloo', timeout=datetime.timedelta(seconds=2*60), init_method=args.dist_url, world_size=args.world_size, rank=args.rank)
  File "/usr/local/python3/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 895, in init_process_group
    default_pg = _new_process_group_helper(
  File "/usr/local/python3/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 994, in _new_process_group_helper
    backend_class = ProcessGroupGloo(backend_prefix_store, group_rank, group_size, timeout=timeout)
RuntimeError: Socket Timeout
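The traceback shows `init_process_group` being called with the `gloo` backend, a 2-minute timeout, and `world_size=args.world_size`. That call blocks until every rank has joined, so a timeout like this one would be expected if fewer processes start than `world_size` asks for, or if the rendezvous address cannot be reached. As a rough diagnostic sketch (standard library only, not part of OpenChatKit), the helper below checks whether the address passed as `dist_url` accepts TCP connections while rank 0 is still waiting. The function names and the example address are hypothetical and assume a `tcp://host:port` style URL.

```python
import socket
from urllib.parse import urlparse


def parse_dist_url(dist_url):
    """Split a 'tcp://host:port' rendezvous URL into (host, port)."""
    parsed = urlparse(dist_url)
    if parsed.scheme != "tcp" or parsed.hostname is None or parsed.port is None:
        raise ValueError(f"expected tcp://host:port, got {dist_url!r}")
    return parsed.hostname, parsed.port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical address: substitute the value the training script
    # actually passes as its dist_url.
    host, port = parse_dist_url("tcp://127.0.0.1:7033")
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

If the address is reachable and the run still times out, the next thing to compare is the number of processes the shell script launches against `world_size` and the number of GPUs actually present.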