Closed. wlw-wlw closed this issue 6 months ago.
I have already tried re-downloading the model, but the error still occurs.
W&B offline. Running your script from this directory will only write metadata locally. Use wandb disabled to completely turn off W&B.
You are using a model of type llama to instantiate a model of type GraphLlama. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
/root/miniconda3/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:03<00:00, 1.72s/it]
/root/miniconda3/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:492: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
  warnings.warn(
/root/miniconda3/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:497: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
  warnings.warn(
/root/miniconda3/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:492: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
  warnings.warn(
/root/miniconda3/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:497: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
  warnings.warn(
/root/autodl-tmp/GraphGPT/graphgpt/clip_gt_arxiv_pub
/root/autodl-tmp/GraphGPT/graphgpt/clip_gt_arxiv_pub
11111
loading graph pre train model
CLIP(
  (gnn): graph_transformer(
    (gtLayers): Sequential(
      (0): GTLayer(
        (norm): LayerNorm((128,), eps=1e-06, elementwise_affine=True)
      )
      (1): GTLayer(
        (norm): LayerNorm((128,), eps=1e-06, elementwise_affine=True)
      )
      (2): GTLayer(
        (norm): LayerNorm((128,), eps=1e-06, elementwise_affine=True)
      )
    )
    (W_P): Linear(in_features=128, out_features=128, bias=True)
    (dropout): Dropout(p=0.1, inplace=False)
    (inverW_P): Linear(in_features=128, out_features=128, bias=True)
  )
  (transformer): Transformer(
    (resblocks): Sequential(
      (0-11): 12 x ResidualAttentionBlock(
        (attn): MultiheadAttention(
          (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
        )
        (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (mlp): Sequential(
          (c_fc): Linear(in_features=512, out_features=2048, bias=True)
          (gelu): QuickGELU()
          (c_proj): Linear(in_features=2048, out_features=512, bias=True)
        )
        (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
      )
    )
  )
  (token_embedding): Embedding(49408, 512)
  (ln_final): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
WARNING:root:Loading data...
WARNING:root:Formatting inputs...Skip in lazy mode
Traceback (most recent call last):
  File "/root/autodl-tmp/GraphGPT/graphgpt/train/train_mem.py", line 13, in <module>
    train()
  File "/root/autodl-tmp/GraphGPT/graphgpt/train/train_graph.py", line 929, in train
    data_module = make_supervised_data_module(tokenizer=tokenizer,
  File "/root/autodl-tmp/GraphGPT/graphgpt/train/train_graph.py", line 745, in make_supervised_data_module
    train_dataset = dataset_cls(tokenizer=tokenizer,
  File "/root/autodl-tmp/GraphGPT/graphgpt/train/train_graph.py", line 556, in __init__
    self.graph_data_all = torch.load(graph_data_path)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/serialization.py", line 993, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/root/miniconda3/lib/python3.10/site-packages/torch/serialization.py", line 447, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
[2024-05-12 18:18:16,300] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 7555) of binary: /root/miniconda3/bin/python3
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/distributed/run.py", line 810, in <module>
    main()
  File "/root/miniconda3/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/root/miniconda3/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
graphgpt/train/train_mem.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-05-12_18:18:16
  host      : autodl-container-8cee4dbe08-5999a9bc
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 7555)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
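For what it's worth, the actual failure is the torch.load call on graph_data_path: "failed finding central directory" usually means the file is not a complete zip-based PyTorch archive, most often a truncated download or a Git LFS pointer stub instead of the real tensor data. Below is a minimal sketch for checking the file before re-running training; the path is only a placeholder for whatever graph_data_path your training script is given.

    import os
    import zipfile

    # Placeholder path: point this at the graph data file your launch script passes in.
    graph_data_path = "graph_data/graph_data_all.pt"

    # A Git LFS pointer stub is only a few hundred bytes of text, not gigabytes of tensors.
    print("size on disk (bytes):", os.path.getsize(graph_data_path))
    with open(graph_data_path, "rb") as f:
        print("first bytes:", f.read(64))

    # Files written by modern torch.save are zip archives; a truncated file fails this
    # check and produces the same "central directory" error seen in the traceback above.
    print("valid zip archive:", zipfile.is_zipfile(graph_data_path))

If the size is tiny or is_zipfile returns False, re-download the graph data file itself (or run git lfs pull) rather than the model weights.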
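Separately, the `do_sample`/`temperature`/`top_p` warnings are harmless here: they only say that the checkpoint's generation_config carries sampling parameters while `do_sample` is False. If you want to silence them, something along these lines should work (the checkpoint directory name is just a placeholder, not a path from this repo):

    from transformers import GenerationConfig

    # Placeholder: the local directory holding the base checkpoint's generation_config.json.
    ckpt_dir = "./base_model_checkpoint"

    gen_cfg = GenerationConfig.from_pretrained(ckpt_dir)
    # Either enable sampling so temperature/top_p actually apply...
    gen_cfg.do_sample = True
    # ...or drop the sampling parameters instead:
    # gen_cfg.temperature = None
    # gen_cfg.top_p = None
    gen_cfg.save_pretrained(ckpt_dir)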
Has this been resolved? I am running into the same problem.
I have already tried re-downloading the model, but the error still occurs.