The size of tensor a (0) must match the size of tensor b (1536) at non-singleton dimension 1
The error occurs only when I set gradient_checkpointing to True; keeping it False avoids the error, but that wastes a lot of memory.
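For context, the gradient_checkpointing switch I am toggling ends up enabling Hugging Face gradient checkpointing on the item/user LLMs (that is what triggers the checkpointing-format deprecation warnings in the log below). A minimal sketch of the standard transformers call, not the exact HLLM code path:

```python
# Sketch only: the standard transformers API for (non-reentrant) gradient
# checkpointing; not the exact HLLM code path. "../item_pretrain" is the
# checkpoint directory from the log below.
from transformers import AutoModelForCausalLM

item_llm = AutoModelForCausalLM.from_pretrained("../item_pretrain")
item_llm.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}
)
# Note: per the warning in the log, gradient_checkpointing_kwargs is silently
# ignored while the old _set_gradient_checkpointing method is still defined
# in the modeling file, so the use_reentrant option may not take effect.
```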
Environment as follows:
cuda==11.7 (limited by machine)
python==3.9
pytorch==2.0.1 (limited by cuda)
deepspeed==0.14.2
transformers==4.41.1
lightning==2.3.0 (lightning==2.4.0 needs torch>=2.1.0,<4.0)
wheel==0.44.0
flash-attn==2.6.3 (set to False, limited by machine)
fbgemm-gpu==0.5.0
sentencepiece==0.2.0
pandas==2.2.3
colorlog==6.9.0
tensorboardX==2.6.2.2
tensorflow_cpu==2.8.0
colorama==0.4.6
torch_geometric==2.5.3
scikit-learn==1.5.2
protobuf==3.20
torchrun --master_port=12345 --node_rank=0 --nproc_per_node=1 --nnodes=1 run.py --config_file overall/LLM_deepspeed.yaml HLLM/HLLM.yaml --MAX_ITEM_LIST_LENGTH 3 --epochs 5 --optim_args.learning_rate 1e-4 --MAX_TEXT_LENGTH 3 --train_batch_size 2
[2024-10-31 18:29:55,303] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-devel package with yum
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] NVIDIA Inference is only supported on Ampere and newer architectures
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
[WARNING] using untested triton version (2.0.0), only 1.0.0 is known to be compatible
31 Oct 18:30 INFO Update text_path to /data/home/xconnorwang/HLLM/information/Pixel200K.csv
31 Oct 18:30 INFO Loading <class 'REC.data.dataload.Data'> from scratch with self.data_split = None.
31 Oct 18:30 INFO Interaction feature loaded successfully from [../dataset/Pixel200K.csv].
31 Oct 18:30 INFO self.user_num = 200001 self.item_num = 96283
31 Oct 18:30 INFO self.inter_feat['item_id'].isna().any() = False self.inter_feat['user_id'].isna().any() = False
31 Oct 18:30 INFO build Pixel200K dataload
31 Oct 18:30 INFO Use random sample True for mask id
31 Oct 18:30 INFO Text path: /data/home/xconnorwang/HLLM/information/Pixel200K.csv
31 Oct 18:30 INFO Text keys: ['title', 'tag', 'description']
31 Oct 18:30 INFO Item prompt: Compress the following sentence into embedding:
31 Oct 18:30 INFO Text Item num: 96281
31 Oct 18:30 INFO [Training]: train_batch_size = [2]
31 Oct 18:30 INFO [Evaluation]: eval_batch_size = [3]
/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 11 worker processes in total. Our suggested max number of worker in current system is 10, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
len(train_loader) = 408043
31 Oct 18:30 INFO create item llm
31 Oct 18:30 INFO create LLM ../item_pretrain
31 Oct 18:30 INFO hf_config: LlamaConfig {
"_name_or_path": "../item_pretrain",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 0,
"eos_token_id": 0,
"hidden_act": "silu",
"hidden_size": 576,
"initializer_range": 0.02,
"intermediate_size": 1536,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 9,
"num_hidden_layers": 30,
"num_key_value_heads": 3,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.1",
"use_cache": true,
"vocab_size": 49152
}
31 Oct 18:30 INFO xxxxx starting loading checkpoint
31 Oct 18:30 INFO Using flash attention False for llama
31 Oct 18:30 INFO Init True for llama
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
31 Oct 18:30 INFO create user llm
31 Oct 18:30 INFO create LLM ../user_pretrain
31 Oct 18:30 INFO hf_config: LlamaConfig {
"_name_or_path": "../user_pretrain",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 0,
"eos_token_id": 0,
"hidden_act": "silu",
"hidden_size": 576,
"initializer_range": 0.02,
"intermediate_size": 1536,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 9,
"num_hidden_layers": 30,
"num_key_value_heads": 3,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.1",
"use_cache": true,
"vocab_size": 49152
}
31 Oct 18:30 INFO xxxxx starting loading checkpoint
31 Oct 18:30 INFO Using flash attention False for llama
31 Oct 18:30 INFO Init True for llama
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
31 Oct 18:30 INFO nce thres setting to 0.99
31 Oct 18:30 INFO item_emb_tokens torch.Size([1, 1, 576]) True
31 Oct 18:30 INFO logit_scale torch.Size([]) True
31 Oct 18:30 INFO item_llm.model.embed_tokens.weight torch.Size([49152, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO item_llm.model.layers.0.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.v_proj.weight torch.Size([192, 576]) True
...
31 Oct 18:30 INFO user_llm.model.layers.26.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO user_llm.model.layers.26.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO user_llm.model.layers.27.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO user_llm.model.layers.28.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO user_llm.model.layers.29.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.norm.weight torch.Size([576]) True
31 Oct 18:30 INFO
World_Size = 1
31 Oct 18:30 INFO
General Hyper Parameters:
seed = 2020
state = INFO
use_text = True
reproducibility = True
checkpoint_dir = saved
show_progress = True
log_wandb = False
data_path = ../dataset/
strategy = deepspeed
precision = bf16-mixed
model = HLLM
31 Oct 18:30 INFO Pixel200K
The number of users: 200001
Average actions of users: 19.82828
The number of items: 96283
Average actions of items: 41.187927130720176
The number of inters: 3965656
The sparsity of the dataset: 99.9794063532928%
31 Oct 18:30 INFO HLLM(
(item_llm): LlamaForCausalLM(
(model): LlamaModel(
(embed_tokens): Embedding(49152, 576)
(layers): ModuleList(
(0-29): 30 x LlamaDecoderLayer(
(self_attn): LlamaAttention(
(q_proj): Linear(in_features=576, out_features=576, bias=False)
(k_proj): Linear(in_features=576, out_features=192, bias=False)
(v_proj): Linear(in_features=576, out_features=192, bias=False)
(o_proj): Linear(in_features=576, out_features=576, bias=False)
(rotary_emb): LlamaRotaryEmbedding()
)
(mlp): LlamaMLP(
(gate_proj): Linear(in_features=576, out_features=1536, bias=False)
(up_proj): Linear(in_features=576, out_features=1536, bias=False)
(down_proj): Linear(in_features=1536, out_features=576, bias=False)
(act_fn): SiLUActivation()
)
(input_layernorm): LlamaRMSNorm()
(post_attention_layernorm): LlamaRMSNorm()
)
)
(norm): LlamaRMSNorm()
)
(lm_head): Linear(in_features=576, out_features=49152, bias=False)
)
(user_llm): LlamaForCausalLM(
(model): LlamaModel(
(embed_tokens): Embedding(49152, 576)
(layers): ModuleList(
(0-29): 30 x LlamaDecoderLayer(
(self_attn): LlamaAttention(
(q_proj): Linear(in_features=576, out_features=576, bias=False)
(k_proj): Linear(in_features=576, out_features=192, bias=False)
(v_proj): Linear(in_features=576, out_features=192, bias=False)
(o_proj): Linear(in_features=576, out_features=576, bias=False)
(rotary_emb): LlamaRotaryEmbedding()
)
(mlp): LlamaMLP(
(gate_proj): Linear(in_features=576, out_features=1536, bias=False)
(up_proj): Linear(in_features=576, out_features=1536, bias=False)
(down_proj): Linear(in_features=1536, out_features=576, bias=False)
(act_fn): SiLUActivation()
)
(input_layernorm): LlamaRMSNorm()
(post_attention_layernorm): LlamaRMSNorm()
)
)
(norm): LlamaRMSNorm()
)
(lm_head): Linear(in_features=576, out_features=49152, bias=False)
)
)
Trainable parameters: 269030593.0
31 Oct 18:30 INFO Use consine scheduler with 204021.5 warmup 2040215 total steps
31 Oct 18:30 INFO Use deepspeed strategy
initializing deepspeed distributed: GLOBAL_RANK: 0, MEMBER: 1/1
Enabling DeepSpeed BF16. Model parameters and inputs will be cast to bfloat16.
31 Oct 18:30 INFO Added key: store_based_barrier_key:2 to store for rank: 0
31 Oct 18:30 INFO Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 1 nodes.
Parameter Offload: Total persistent parameters: 70849 in 124 params
Train [ 0/ 5]: 0%| | 0/408043 [00:00<?, ?it/s]/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 11 worker processes in total. Our suggested max number of worker in current system is 10, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
elem_shape: torch.Size([4]) elem: tensor([35434, 40551, 37030, 59065])
batch_len: 2 batch: [tensor([35434, 40551, 37030, 59065]), tensor([12579, 4861, 4391, 11309])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
... (the same collate_fn debug prints, i.e. elem_shape torch.Size([4]) / torch.Size([3]) / torch.Size([4, 6]) with batch_len 2 showing the item-id, mask, and timestamp tensors of each batch, plus the same TypedStorage and resize warnings, repeat here interleaved across the DataLoader worker processes; the last complete batch before the failure is shown below)
elem_shape: torch.Size([4]) elem: tensor([45360, 536, 48045, 26275])
batch_len: 2 batch: [tensor([45360, 536, 48045, 26275]), tensor([20411, 11698, 21909, 15291])]
elem_shape: torch.Size([4]) elem: tensor([79237, 53705, 87849, 26910])
batch_len: 2 batch: [tensor([79237, 53705, 87849, 26910]), tensor([13761, 19517, 61800, 59291])]
elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
elem_shape: torch.Size([4, 6]) elem: tensor([[2022, 2, 11, 1, 20, 13],
[2022, 2, 11, 1, 23, 9],
[2022, 2, 11, 2, 3, 3],
[2022, 2, 11, 6, 13, 31]])
batch_len: 2 batch: [tensor([[2022, 2, 11, 1, 20, 13],
[2022, 2, 11, 1, 23, 9],
[2022, 2, 11, 2, 3, 3],
[2022, 2, 11, 6, 13, 31]]), tensor([[2020, 7, 18, 15, 24, 36],
[2020, 8, 2, 16, 48, 4],
[2020, 10, 10, 17, 13, 9],
[2020, 11, 24, 6, 57, 41]])]
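The resize warnings above come from torch.stack being handed a flat, pre-allocated out tensor inside collate_fn (the older default_collate pattern). A minimal reproduction of just that warning, independent of the HLLM code:

```python
import torch

# Two worker-side rows of a batch, shape [4] each, as in the log above.
batch = [torch.tensor([1, 2, 3, 4]), torch.tensor([5, 6, 7, 8])]

# Older default_collate style: a flat out tensor of shape [8].
# torch.stack has to resize it to [2, 4], which emits the
# "An output with one or more elements was resized ..." UserWarning.
out_flat = torch.empty(8, dtype=torch.long)
torch.stack(batch, 0, out=out_flat)

# Pre-shaping the out tensor (or passing out=None) avoids the warning.
out_shaped = torch.empty(2, 4, dtype=torch.long)
torch.stack(batch, 0, out=out_shaped)
```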
Traceback (most recent call last):
File "/data/home/xconnorwang/HLLM/code/run.py", line 139, in
run_loop(local_rank=local_rank, config_file=config_file, extra_args=extra_args)
File "/data/home/xconnorwang/HLLM/code/run.py", line 110, in run_loop
best_valid_score, best_valid_result = trainer.fit(
File "/data/home/xconnorwang/HLLM/code/REC/trainer/trainer.py", line 342, in fit
train_loss = self._train_epoch(train_data, epoch_idx, show_progress=show_progress)
File "/data/home/xconnorwang/HLLM/code/REC/trainer/trainer.py", line 198, in _train_epoch
self.lite.backward(losses)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/fabric.py", line 446, in backward
self._strategy.backward(tensor, module, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/strategies/strategy.py", line 188, in backward
self.precision.backward(tensor, module, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/plugins/precision/deepspeed.py", line 91, in backward
model.backward(tensor, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1976, in backward
self.optimizer.backward(loss, retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/zero/stage3.py", line 2213, in backward
self.loss_scaler.backward(loss.float(), retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/fp16/loss_scaler.py", line 63, in backward
scaled_loss.backward(retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/autograd/init.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: The size of tensor a (0) must match the size of tensor b (1536) at non-singleton dimension 1
Train [ 0/ 5]: 0%| | 0/408043 [00:05<?, ?it/s]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1073205) of binary: /usr/local/python3/bin/python3.9
Traceback (most recent call last):
File "/data/home/xconnorwang/.local/bin/torchrun", line 8, in
sys.exit(main())
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 346, in wrapper
return f(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
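In case it is useful for diagnosis: under ZeRO Stage 3, partitioned parameters report an empty shape (torch.Size([0])) outside a gather context, which would match the "size of tensor a (0)" above, and 1536 matches the intermediate_size of the MLP weights. A rough, hypothetical check (here `model` stands for the module after the DeepSpeed/Fabric setup; how the trainer exposes it is an assumption on my side):

```python
# Sketch only: DeepSpeed ZeRO-3 attaches ds_shape to converted parameters;
# outside a gather context their data tensor is empty (shape torch.Size([0])).
# `model` is assumed to be the DeepSpeed-initialized module.
for name, p in model.named_parameters():
    if hasattr(p, "ds_shape") and p.numel() == 0:
        print(name, "partitioned:", tuple(p.shape), "full:", tuple(p.ds_shape))
```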
The size of tensor a (0) must match the size of tensor b (1536) at non-singleton dimension 1 Error only if i change gradient_checkpointing into True, and that waste lots of memory
environment as follows: cuda==11.7(limited by machine) python==3.9 pytorch==2.0.1(limited by cuda) deepspeed==0.14.2 transformers==4.41.1 lightning==2.3.0 (lightning==2.4.0 need torch<4.0,>=2.1.0) wheel==0.44.0 flash-attn==2.6.3 (False,limited by machine) fbgemm-gpu==0.5.0 sentencepiece==0.2.0 pandas==2.2.3 colorlog==6.9.0 tensorboardX==2.6.2.2 tensorflow_cpu==2.8.0 colorama==0.4.6 torch_geometric==2.5.3 scikit-learn==1.5.2 protobuf==3.20
ERROR LOG:
++ date +%FT%T
31 Oct 18:30 INFO xxxxx starting loading checkpoint
31 Oct 18:30 INFO Using flash attention False for llama
31 Oct 18:30 INFO Init True for llama
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it). Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing` in your model.
31 Oct 18:30 INFO create user llm
31 Oct 18:30 INFO create LLM ../user_pretrain
31 Oct 18:30 INFO hf_config: LlamaConfig {
"_name_or_path": "../user_pretrain",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 0,
"eos_token_id": 0,
"hidden_act": "silu",
"hidden_size": 576,
"initializer_range": 0.02,
"intermediate_size": 1536,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 9,
"num_hidden_layers": 30,
"num_key_value_heads": 3,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.1",
"use_cache": true,
"vocab_size": 49152
}
31 Oct 18:30 INFO xxxxx starting loading checkpoint
31 Oct 18:30 INFO Using flash attention False for llama
31 Oct 18:30 INFO Init True for llama
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it). Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing`
in your model. 31 Oct 18:30 INFO nce thres setting to 0.99 31 Oct 18:30 INFO item_emb_tokens torch.Size([1, 1, 576]) True 31 Oct 18:30 INFO logit_scale torch.Size([]) True 31 Oct 18:30 INFO item_llm.model.embed_tokens.weight torch.Size([49152, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.q_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.k_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.v_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.o_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.0.mlp.gate_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.0.mlp.up_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.0.mlp.down_proj.weight torch.Size([576, 1536]) True 31 Oct 18:30 INFO item_llm.model.layers.0.input_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO item_llm.model.layers.0.post_attention_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.q_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.k_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.v_proj.weight torch.Size([192, 576]) True ... 31 Oct 18:30 INFO user_llm.model.layers.26.self_attn.v_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.26.self_attn.o_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.26.mlp.gate_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.26.mlp.up_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.26.mlp.down_proj.weight torch.Size([576, 1536]) True 31 Oct 18:30 INFO user_llm.model.layers.26.input_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO user_llm.model.layers.26.post_attention_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.q_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.k_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.v_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.o_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.27.mlp.gate_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.27.mlp.up_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.27.mlp.down_proj.weight torch.Size([576, 1536]) True 31 Oct 18:30 INFO user_llm.model.layers.27.input_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO user_llm.model.layers.27.post_attention_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.q_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.k_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.v_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.o_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.28.mlp.gate_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.28.mlp.up_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.28.mlp.down_proj.weight torch.Size([576, 1536]) True 31 Oct 18:30 INFO 
user_llm.model.layers.28.input_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO user_llm.model.layers.28.post_attention_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.q_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.k_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.v_proj.weight torch.Size([192, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.o_proj.weight torch.Size([576, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.29.mlp.gate_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.29.mlp.up_proj.weight torch.Size([1536, 576]) True 31 Oct 18:30 INFO user_llm.model.layers.29.mlp.down_proj.weight torch.Size([576, 1536]) True 31 Oct 18:30 INFO user_llm.model.layers.29.input_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO user_llm.model.layers.29.post_attention_layernorm.weight torch.Size([576]) True 31 Oct 18:30 INFO user_llm.model.norm.weight torch.Size([576]) True 31 Oct 18:30 INFOWorld_Size = 1
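For reference, the deprecation warning above refers to the new-style Transformers checkpointing API. A minimal sketch of enabling it on a loaded model (illustrative only, under the assumption that the item/user LLMs are plain LlamaForCausalLM checkpoints loaded from the *_pretrain directories; this is not the actual HLLM wrapper code):

```python
import torch
from transformers import AutoModelForCausalLM

# Sketch: with transformers 4.41 the new checkpointing format is enabled via a
# method call on the model, instead of defining _set_gradient_checkpointing in
# the modeling file as the old format did.
item_llm = AutoModelForCausalLM.from_pretrained("../item_pretrain", torch_dtype=torch.bfloat16)
item_llm.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}  # forwarded to torch.utils.checkpoint
)
```

Whether the non-reentrant checkpointing variant also avoids the size-0 tensor error under ZeRO-3 is not verified here.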
31 Oct 18:30 INFO
General Hyper Parameters:
seed = 2020
state = INFO
use_text = True
reproducibility = True
checkpoint_dir = saved
show_progress = True
log_wandb = False
data_path = ../dataset/
strategy = deepspeed
precision = bf16-mixed
model = HLLM

Training Hyper Parameters:
epochs = 5
train_batch_size = 2
optim_args = {'learning_rate': 0.0001, 'weight_decay': 0.01}
eval_step = 1
stopping_step = 5

Evaluation Hyper Parameters:
eval_batch_size = 3
topk = [5, 10, 50, 200]
metrics = ['Recall', 'NDCG']
valid_metric = NDCG@200
metric_decimal_place = 7
eval_type = EvaluatorType.RANKING
valid_metric_bigger = True

Dataset Hyper Parameters:
MAX_ITEM_LIST_LENGTH = 3
MAX_TEXT_LENGTH = 3
text_keys = ['title', 'tag', 'description']
item_prompt = Compress the following sentence into embedding:

Other Hyper Parameters:
wandb_project = REC
text_path = /data/home/xconnorwang/HLLM/information/Pixel200K.csv
item_emb_token_n = 1
loss = nce
scheduler_args = {'type': 'cosine', 'warmup': 0.1}
stage = 3
gradient_checkpointing = True
zero3_init_flag = False
item_pretrain_dir = ../item_pretrain
item_llm_init = True
user_pretrain_dir = ../user_pretrain
user_llm_init = True
use_ft_flash_attn = False
MODEL_INPUT_TYPE = InputType.SEQ
device = cuda:0
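The strategy/precision entries above map roughly onto the following Lightning Fabric setup (a sketch of the equivalent standalone configuration; the real wiring lives in the HLLM trainer and is not reproduced here):

```python
import lightning as L

# Approximate standalone equivalent of strategy = deepspeed, stage = 3,
# precision = bf16-mixed with a single process (--nproc_per_node=1).
fabric = L.Fabric(
    accelerator="cuda",
    devices=1,
    strategy="deepspeed_stage_3",
    precision="bf16-mixed",
)
fabric.launch()
```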
31 Oct 18:30 INFO Pixel200K The number of users: 200001 Average actions of users: 19.82828 The number of items: 96283 Average actions of items: 41.187927130720176 The number of inters: 3965656 The sparsity of the dataset: 99.9794063532928% 31 Oct 18:30 INFO HLLM( (item_llm): LlamaForCausalLM( (model): LlamaModel( (embed_tokens): Embedding(49152, 576) (layers): ModuleList( (0-29): 30 x LlamaDecoderLayer( (self_attn): LlamaAttention( (q_proj): Linear(in_features=576, out_features=576, bias=False) (k_proj): Linear(in_features=576, out_features=192, bias=False) (v_proj): Linear(in_features=576, out_features=192, bias=False) (o_proj): Linear(in_features=576, out_features=576, bias=False) (rotary_emb): LlamaRotaryEmbedding() ) (mlp): LlamaMLP( (gate_proj): Linear(in_features=576, out_features=1536, bias=False) (up_proj): Linear(in_features=576, out_features=1536, bias=False) (down_proj): Linear(in_features=1536, out_features=576, bias=False) (act_fn): SiLUActivation() ) (input_layernorm): LlamaRMSNorm() (post_attention_layernorm): LlamaRMSNorm() ) ) (norm): LlamaRMSNorm() ) (lm_head): Linear(in_features=576, out_features=49152, bias=False) ) (user_llm): LlamaForCausalLM( (model): LlamaModel( (embed_tokens): Embedding(49152, 576) (layers): ModuleList( (0-29): 30 x LlamaDecoderLayer( (self_attn): LlamaAttention( (q_proj): Linear(in_features=576, out_features=576, bias=False) (k_proj): Linear(in_features=576, out_features=192, bias=False) (v_proj): Linear(in_features=576, out_features=192, bias=False) (o_proj): Linear(in_features=576, out_features=576, bias=False) (rotary_emb): LlamaRotaryEmbedding() ) (mlp): LlamaMLP( (gate_proj): Linear(in_features=576, out_features=1536, bias=False) (up_proj): Linear(in_features=576, out_features=1536, bias=False) (down_proj): Linear(in_features=1536, out_features=576, bias=False) (act_fn): SiLUActivation() ) (input_layernorm): LlamaRMSNorm() (post_attention_layernorm): LlamaRMSNorm() ) ) (norm): LlamaRMSNorm() ) (lm_head): Linear(in_features=576, out_features=49152, bias=False) ) ) Trainable parameters: 269030593.0 31 Oct 18:30 INFO Use consine scheduler with 204021.5 warmup 2040215 total steps 31 Oct 18:30 INFO Use deepspeed strategy initializing deepspeed distributed: GLOBAL_RANK: 0, MEMBER: 1/1 Enabling DeepSpeed BF16. Model parameters and inputs will be cast to
bfloat16.
31 Oct 18:30 INFO Added key: store_based_barrier_key:2 to store for rank: 0
31 Oct 18:30 INFO Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 1 nodes.
Parameter Offload: Total persistent parameters: 70849 in 124 params
Train [ 0/ 5]: 0%| | 0/408043 [00:00<?, ?it/s]
/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 11 worker processes in total. Our suggested max number of worker in current system is 10, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
elem_shape: torch.Size([4]) elem: tensor([35434, 40551, 37030, 59065]) batch_len: 2 batch: [tensor([35434, 40551, 37030, 59065]), tensor([12579, 4861, 4391, 11309])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
... (the same collate_fn.py warnings and elem_shape / batch debug prints — item-id tensors of shape [4], mask tensors of shape [3], timestamp tensors of shape [4, 6], each stacked with batch_len 2 — repeat for every DataLoader worker, heavily interleaved) ...
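The resize warnings above come from torch.stack being handed an out tensor whose shape no longer matches the stacked result (e.g. [8] vs. [2, 4]). A minimal collate sketch that avoids them, assuming equally shaped per-sample tensors as in the printed shapes (this is not the HLLM collate_fn.py itself):

```python
import torch

def stack_collate(batch):
    """Stack equally shaped per-sample tensors into one batch tensor.

    Letting torch.stack allocate its own output avoids the deprecated resize
    of a mismatched `out` tensor; if an `out` tensor must be reused (as in the
    shared-memory path shown in the warnings), the warning itself suggests
    emptying it first with out.resize_(0).
    """
    return torch.stack(batch, 0)

# Example with the item-id tensors from the log above: the result has shape (2, 4).
batch = [torch.tensor([35434, 40551, 37030, 59065]),
         torch.tensor([12579, 4861, 4391, 11309])]
print(stack_collate(batch).shape)  # torch.Size([2, 4])
```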
Traceback (most recent call last):
File "/data/home/xconnorwang/HLLM/code/run.py", line 139, in <module>
run_loop(local_rank=local_rank, config_file=config_file, extra_args=extra_args)
File "/data/home/xconnorwang/HLLM/code/run.py", line 110, in run_loop
best_valid_score, best_valid_result = trainer.fit(
File "/data/home/xconnorwang/HLLM/code/REC/trainer/trainer.py", line 342, in fit
train_loss = self._train_epoch(train_data, epoch_idx, show_progress=show_progress)
File "/data/home/xconnorwang/HLLM/code/REC/trainer/trainer.py", line 198, in _train_epoch
self.lite.backward(losses)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/fabric.py", line 446, in backward
self._strategy.backward(tensor, module, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/strategies/strategy.py", line 188, in backward
self.precision.backward(tensor, module, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/plugins/precision/deepspeed.py", line 91, in backward
model.backward(tensor, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1976, in backward
self.optimizer.backward(loss, retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/zero/stage3.py", line 2213, in backward
self.loss_scaler.backward(loss.float(), retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/fp16/loss_scaler.py", line 63, in backward
scaled_loss.backward(retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/autograd/init.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: The size of tensor a (0) must match the size of tensor b (1536) at non-singleton dimension 1
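The call chain in this traceback is the standard Fabric-on-DeepSpeed backward path (fabric.backward → DeepSpeed precision plugin → DeepSpeedEngine.backward → loss_scaler.backward → torch.autograd.backward). A condensed, self-contained sketch of that path with a stand-in model (not the HLLM trainer) looks like:

```python
import torch
import lightning as L

fabric = L.Fabric(accelerator="cuda", devices=1, strategy="deepspeed_stage_3", precision="bf16-mixed")
fabric.launch()

model = torch.nn.Linear(576, 1536)                   # stand-in module, not HLLM
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model, optimizer = fabric.setup(model, optimizer)    # DeepSpeed needs model and optimizer set up together

x = torch.randn(2, 576, device=fabric.device)
loss = model(x).float().mean()
fabric.backward(loss)                                # same call path where the RuntimeError above is raised
optimizer.step()
optimizer.zero_grad()
```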
... (further interleaved elem_shape / batch debug prints and collate_fn.py warnings from the DataLoader workers) ...
Train [ 0/ 5]: 0%| | 0/408043 [00:05<?, ?it/s]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1073205) of binary: /usr/local/python3/bin/python3.9
Traceback (most recent call last):
File "/data/home/xconnorwang/.local/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 346, in wrapper
return f(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
run.py FAILED
Failures: