Closed robotzheng closed 3 weeks ago
Please help me! Thanks.
We use the GPT-2 tokenizer and FlashAttention 2. Could this lead to NaN?
Hi, currently, BitBLAS does not implement the backward computation, and cannot be used in training if there is no additional backward implementation. Therefore, gradients will not be processed and may result in NaN. You may try torch.nn.Linear or torch.matmul to see training results.
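When comparing a run with the BitBLAS-backed layers against one that uses torch.nn.Linear, as suggested above, it can help to record the first step at which the loss stops being finite. The following is a minimal, framework-independent sketch. `first_non_finite` is a hypothetical helper, not part of BitBLAS or PyTorch, and it assumes the losses have already been converted to plain Python floats (for example with `loss.item()`).

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Illustrative values only, not taken from the run in this issue.
example_losses = [10.8, 10.5, float("nan"), 10.1]
bad_step = first_non_finite(example_losses)
if bad_step is not None:
    print(f"loss became non-finite at step {bad_step}")
```

If the guard fires on the very first optimizer step with the BitBLAS layers but never with torch.nn.Linear, that would be consistent with the missing backward pass described above.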
Thanks a lot.
Closed as the question has been answered.
zzt-learning_rate2:0.006
Overriding config with config/scale_gpt.py:
# config for scaling GPT following Kaplan et al.
wandb_log = True
wandb_project = 'owt-scaling'
wandb_run_id = "" # give only when resuming a W&B run
always_save_checkpoint = False
# setting default values of scale_N, scale_D to False, you must change them from command line when scaling.
scaling = "Kaplan"
scale_N = False
scale_D = False
# replace n_layer, n_embd and fraction_of_data from command line. Default values:
n_layer = 12
n_embd = 768
fraction_of_data = 1.0
# Can also set n_head from command line, but set default value through this rule of thumb
# given in Appendix F of 2010.14701
# It is consistent with nanoGPT where n_embd = 768, and n_head = 768//64 = 12.
n_head = 12 #max(2, n_embd // 64)
# TRAINING CONFIGURATIONS FROM KAPLAN ET AL
# total batch size = 512 so set local batch size = 16, gradaccum = 32
batch_size = 16
block_size = 1024
gradient_accumulation_steps = 4 * 8
# total number of training iterations = 2.5e5
# learning rate warms up for 3000 iterations and decays to 0 at the end of training.
# dropout = 0.1 (see Section 4.2). minimum learning rate is 0
# maximum learning rate is given by equation D.1 of the paper. It depends on N, so we set it in configurator.py
max_iters = int(2.5e5)
warmup_iters = 3000
lr_decay_iters = int(2.5e5)
dropout = 0.0
min_lr = 0
learning_rate = 6e-3
# eval stuff same as nanoGPT
eval_interval = 1000
eval_iters = 200
log_interval = 10
# weight decay same as nanoGPT
weight_decay = 1e-1
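The config comments above describe a learning rate that warms up for 3000 iterations and decays to 0 by iteration 2.5e5. A minimal sketch of such a schedule follows. The log does not state the shape of the decay, so linear warmup followed by cosine decay (the shape nanoGPT uses) is an assumption here, and `get_lr` with its default arguments is an illustrative name rather than the code that produced this log.

```python
import math


def get_lr(it, learning_rate=6e-3, warmup_iters=3000,
           lr_decay_iters=250_000, min_lr=0.0):
    """Learning rate at iteration `it` (assumed: linear warmup, cosine decay)."""
    # Linear warmup from 0 to learning_rate.
    if it < warmup_iters:
        return learning_rate * it / warmup_iters
    # Past the decay horizon, stay at the minimum learning rate.
    if it > lr_decay_iters:
        return min_lr
    # Cosine decay from learning_rate down to min_lr.
    decay_ratio = (it - warmup_iters) / (lr_decay_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))
    return min_lr + coeff * (learning_rate - min_lr)


for it in (0, 1500, 3000, 126_500, 250_000):
    print(it, get_lr(it))
```

With `min_lr = 0` the rate reaches zero exactly at `lr_decay_iters`, which matches the "decays to 0 at the end of training" comment.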
Overriding: scale_N = True
Overriding: n_layer = 12
Overriding: n_embd = 768
zzt-before-scale_N:0.006
zzt-learning_rate3:0.006
tokens per iteration will be: 524,288
Initializing a new model from scratch
defaulting to vocab_size of GPT-2 to 50304 (50257 rounded up for efficiency)
zzt:BitnetConfig {
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 768,
  "initializer_range": 0.02,
  "input_bits": 8,
  "intermediate_size": 2048,
  "max_position_embeddings": 1024,
  "model_type": "llama",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "num_key_value_heads": 12,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.38.1",
  "use_cache": true,
  "vocab_size": 50304,
  "weight_bits": 1
}
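Three numbers in the log above can be reproduced from the config values: the 524,288 tokens per iteration, the padded vocabulary size of 50304, and the 12 attention heads. The sketch below shows the arithmetic. The helper names `pad_vocab` and `default_n_head` are hypothetical, and the padding multiple of 64 is an assumption (the log only says "rounded up for efficiency"), chosen because it reproduces 50304.

```python
batch_size = 16
block_size = 1024
gradient_accumulation_steps = 4 * 8

# 16 sequences x 32 accumulation steps = 512 sequences per optimizer step.
sequences_per_iter = batch_size * gradient_accumulation_steps
tokens_per_iter = sequences_per_iter * block_size


def pad_vocab(vocab_size, multiple=64):
    """Round vocab_size up to the next multiple (64 is an assumed value)."""
    return ((vocab_size + multiple - 1) // multiple) * multiple


def default_n_head(n_embd):
    """Rule of thumb quoted in the config comment: max(2, n_embd // 64)."""
    return max(2, n_embd // 64)


print(f"tokens per iteration: {tokens_per_iter:,}")
print(f"padded vocab size: {pad_vocab(50257)}")
print(f"n_head for n_embd=768: {default_n_head(768)}")
```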
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in BitnetForCausalLM is torch.float32. You should run training or inference using Automatic Mixed-Precision via the `with torch.autocast(device_type='torch_device'):` decorator, or load the model with the `torch_dtype` argument. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)`
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in BitnetModel is torch.float32. You should run training or inference using Automatic Mixed-Precision via the `with torch.autocast(device_type='torch_device'):` decorator, or load the model with the `torch_dtype` argument. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)`
zzt:flash_attention_2
zzt-BitnetAttention-head_dim:64
zzt-BitnetAttention-hidden_size:768
zzt-BitnetAttention-num_heads:12
2024-05-23 06:24:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database
BitBLAS Operator created.
2024-05-23 06:25:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database
BitBLAS Operator created.
zzt-BitnetFlashAttention2
2024-05-23 06:25:29 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt-BitnetFlashAttention2 2024-05-23 06:25:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:25:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:25:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:25:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt:flash_attention_2 zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:25:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:25:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:25:52 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:25:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:25:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt-BitnetFlashAttention2 2024-05-23 06:25:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:25:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:25:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:25:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:25:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:25:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:25:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:00 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:03 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:05 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:07 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt-BitnetFlashAttention2 2024-05-23 06:26:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:26:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:26:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:14 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:14 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. 
2024-05-23 06:26:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:18 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:20 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:21 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:21 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:21 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:21 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:21 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:21 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:21 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:22 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 
2024-05-23 06:26:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:24 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:26:25 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:25 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:26:25 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:25 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. 
zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:26:26 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:26 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:26 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:26 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:27 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:27 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:28 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:28 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 
2024-05-23 06:26:28 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:28 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:28 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:29 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:34 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:34 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:34 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:34 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:34 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:34 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:34 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:35 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:36 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:36 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:36 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:36 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:36 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:37 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:37 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:37 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:38 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:38 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:38 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:38 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt-BitnetFlashAttention2 2024-05-23 06:26:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:26:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:41 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:41 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:41 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:41 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:42 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. 
2024-05-23 06:26:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:47 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:47 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:47 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:47 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:47 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:48 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:26:48 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:48 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:49 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:49 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:49 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:49 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:50 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:50 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:50 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:50 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:51 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:51 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:26:51 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:26:51 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:52 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:52 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:52 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:52 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt-BitnetFlashAttention2 2024-05-23 06:26:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:26:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:58 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:58 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:58 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:26:58 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:58 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:58 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:59 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:26:59 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:00 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:00 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:27:00 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:00 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:00 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:03 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:03 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:03 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:03 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 
2024-05-23 06:27:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:05 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:05 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:05 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:05 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:27:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:07 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:07 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:07 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:07 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:07 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:27:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt-BitnetFlashAttention2 2024-05-23 06:27:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created.
2024-05-23 06:27:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:27:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:14 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:14 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:14 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:18 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:18 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:18 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:18 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:19 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:20 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:20 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:20 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:20 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:20 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:21 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:22 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:22 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
BitBLAS Operator created. 2024-05-23 06:27:22 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:22 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:22 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:23 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:24 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:24 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:27:24 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:24 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:24 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:25 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:25 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:25 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:26 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:26 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:27:26 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:26 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:27 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt-BitnetFlashAttention2 2024-05-23 06:27:27 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:27 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:27 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:28 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:28 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:29 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:29 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:29 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:29 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:29 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:29 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:30 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:31 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:32 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:33 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:34 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:27:35 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:35 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:35 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:35 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:35 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:36 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:36 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:36 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:27:37 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:37 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:37 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:37 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:37 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:38 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:38 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:38 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:39 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:40 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created.
zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:27:41 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:41 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:41 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:42 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:42 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:42 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:42 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:43 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 
2024-05-23 06:27:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:44 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:45 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:27:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:46 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:47 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:47 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:47 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:48 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:48 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:48 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:48 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:48 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:49 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:27:49 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:49 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:50 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:50 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:50 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:50 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:51 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:51 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:51 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:51 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. 
2024-05-23 06:27:52 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:52 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:52 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:53 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:27:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:27:54 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:55 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:56 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt-BitnetFlashAttention2 2024-05-23 06:27:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:57 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:27:58 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:58 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:59 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:59 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:59 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:59 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:27:59 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:27:59 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:00 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:00 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:01 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:28:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:28:02 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:28:03 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:28:03 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:28:03 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:28:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:28:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:28:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt:flash_attention_2 zzt-BitnetAttention-head_dim:64 zzt-BitnetAttention-hidden_size:768 zzt-BitnetAttention-num_heads:12 2024-05-23 06:28:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:04 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:28:05 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:28:05 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:28:05 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:28:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:06 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:07 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:07 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:28:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:28:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:08 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
BitBLAS Operator created. 2024-05-23 06:28:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:28:09 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created.BitBLAS Operator created.
BitBLAS Operator created. 2024-05-23 06:28:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:28:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:28:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:10 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 2024-05-23 06:28:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database zzt-BitnetFlashAttention2 2024-05-23 06:28:11 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:28:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. 
zzt-BitnetFlashAttention2 zzt-BitnetFlashAttention2 2024-05-23 06:28:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database 2024-05-23 06:28:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:28:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:28:12 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:28:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. zzt-BitnetFlashAttention2 2024-05-23 06:28:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:13 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:14 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:14 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:28:14 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:15 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:16 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 
2024-05-23 06:28:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. 2024-05-23 06:28:17 [BitBLAS:INFO]: Database path /home/notebook/code/personal/80306170/AGI/BitNet/cache/bitblas does not exist, skipping loading operators from the database BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. BitBLAS Operator created. number of parameters: 123.62M BitBLAS Operator created. num decayed parameter tensors: 86, with 162,201,600 parameters num non-decayed parameter tensors: 49, with 52,992 parameters using fused AdamW: True BitBLAS Operator created. BitBLAS Operator created. number of parameters: 123.62M BitBLAS Operator created. number of parameters: 123.62M num decayed parameter tensors: 86, with 162,201,600 parameters num non-decayed parameter tensors: 49, with 52,992 parameters using fused AdamW: True number of parameters: 123.62M num decayed parameter tensors: 86, with 162,201,600 parameters num non-decayed parameter tensors: 49, with 52,992 parameters using fused AdamW: True num decayed parameter tensors: 86, with 162,201,600 parameters num non-decayed parameter tensors: 49, with 52,992 parameters using fused AdamW: True number of parameters: 123.62M number of parameters: 123.62M num decayed parameter tensors: 86, with 162,201,600 parameters num non-decayed parameter tensors: 49, with 52,992 parameters using fused AdamW: True num decayed parameter tensors: 86, with 162,201,600 parameters num non-decayed parameter tensors: 49, with 52,992 parameters number of parameters: 123.62M using fused AdamW: True num decayed parameter tensors: 86, with 162,201,600 parameters num non-decayed parameter tensors: 49, with 52,992 
parameters
using fused AdamW: True
number of parameters: 123.62M
num decayed parameter tensors: 86, with 162,201,600 parameters
num non-decayed parameter tensors: 49, with 52,992 parameters
using fused AdamW: True
wandb: Currently logged in as: robotzheng. Use `wandb login --relogin` to force relogin
wandb: wandb version 0.17.0 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.16.3
wandb: Run data is saved locally in /home/notebook/code/personal/80306170/AGI/BitNet/scaling_laws_bitnet/wandb/run-20240523_062827-qa8l01al
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run N-8.49e+07
wandb: ⭐️ View project at https://wandb.ai/robotzheng/owt-scaling
wandb: 🚀 View run at https://wandb.ai/robotzheng/owt-scaling/runs/qa8l01al
step 0: train loss nan, val loss nan
iter 0: loss nan, time 72417.82ms, mfu -100.00%
iter 10: loss nan, time 1514.87ms, mfu 11.86%
iter 20: loss nan, time 1514.67ms, mfu 11.86%
iter 30: loss nan, time 1514.11ms, mfu 11.86%
iter 40: loss nan, time 1514.20ms, mfu 11.86%
iter 50: loss nan, time 1514.77ms, mfu 11.86%
iter 60: loss nan, time 1515.29ms, mfu 11.86%
iter 70: loss nan, time 1515.12ms, mfu 11.86%
iter 80: loss nan, time 1514.51ms, mfu 11.86%
iter 90: loss nan, time 1515.22ms, mfu 11.86%
iter 100: loss nan, time 1514.65ms, mfu 11.86%
iter 110: loss nan, time 1514.52ms, mfu 11.86%
iter 120: loss nan, time 1514.85ms, mfu 11.86%
iter 130: loss nan, time 1514.39ms, mfu 11.86%
iter 140: loss nan, time 1514.85ms, mfu 11.86%
iter 150: loss nan, time 1514.49ms, mfu 11.86%

train.py:
""" This training script can be run both on a single gpu in debug mode, and also in a larger training run with distributed data parallel (ddp).
To run on a single GPU, example:
$ python train.py --batch_size=32 --compile=False

To run with DDP on 4 gpus on 1 node, example:
$ torchrun --standalone --nproc_per_node=4 train.py

To run with DDP on 4 gpus across 2 nodes, example:
"""
import os
import time
import math
import pickle
from contextlib import nullcontext

import numpy as np
import torch
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.distributed import init_process_group, destroy_process_group
import torch._dynamo
torch._dynamo.config.suppress_errors = True
torch._dynamo.config.cache_size_limit = 64

from model import GPTConfig, GPT
from modeling_bitnet import BitnetForCausalLM
from tokenization_bitnet import BitnetTokenizer
from configuration_bitnet import BitnetConfig
# -----------------------------------------------------------------------------
# default config values designed to train a gpt2 (124M) on OpenWebText
# I/O
out_dir = 'out'
eval_interval = 2000
log_interval = 1
eval_iters = 200
eval_only = False # if True, script exits right after the first eval
always_save_checkpoint = True # if True, always save a checkpoint after each eval
init_from = 'scratch' # 'scratch' or 'resume' or 'gpt2*'
# wandb logging
wandb_log = False # disabled by default
wandb_project = 'owt'
wandb_run_name = 'gpt2' # 'run' + str(time.time())
# data
dataset = 'openwebtext'
gradient_accumulation_steps = 5 * 8 # used to simulate larger batch sizes
batch_size = 1 # if gradient_accumulation_steps > 1, this is the micro-batch size
block_size = 1024
# model
n_layer = 12
n_head = 12
n_embd = 768
n_intermediate_size = 2048
dropout = 0.0 # for pretraining 0 is good, for finetuning try 0.1+
bias = False # do we use bias inside LayerNorm and Linear layers?
# adamw optimizer
learning_rate = 6e-3 # max learning rate
max_iters = 600000 # total number of training iterations
weight_decay = 1e-1
beta1 = 0.9
beta2 = 0.95
grad_clip = 1.0 # clip gradients at this value, or disable if == 0.0
# learning rate decay settings
decay_lr = True # whether to decay the learning rate
warmup_iters = 2000 # how many steps to warm up for
lr_decay_iters = 600000 # should be ~= max_iters per Chinchilla
min_lr = 0 # minimum learning rate, should be ~= learning_rate/10 per Chinchilla
# DDP settings
backend = 'nccl' # 'nccl', 'gloo', etc.
# system
device = 'cuda' # examples: 'cpu', 'cuda', 'cuda:0', 'cuda:1' etc., or try 'mps' on macbooks
dtype = 'bfloat16' if torch.cuda.is_available() and torch.cuda.is_bf16_supported() else 'float16' # 'float32', 'bfloat16', or 'float16', the latter will auto implement a GradScaler
compile = False # use PyTorch 2.0 to compile the model to be faster
# variables needed for scaling laws
scaling = "" # takes 4 values: Kaplan, Chinchilla-1, Chinchilla-2 or '' (default, when not scaling).
scale_N = False
scale_D = False
estimate_B_crit = False
N = 12 * n_layer * n_embd**2 # number of non-embedding parameters
fraction_of_data = 1.0 # fraction of OWT dataset that will be used for training
D = int(fraction_of_data * 9035582198) # number of OWT dataset tokens
wandb_run_id = "" # needed only if resuming a W&B run.
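As a quick check of the two derived quantities, here is a small sketch that evaluates them for the defaults in this script (n_layer = 12, n_embd = 768, fraction_of_data = 1.0). The helper name `non_embedding_params` and the run-name format are assumptions made for illustration; the format was chosen only because it reproduces the `N-8.49e+07` run name in the wandb log above.

```python
def non_embedding_params(n_layer, n_embd):
    """Approximate non-embedding parameter count, N = 12 * n_layer * n_embd^2."""
    return 12 * n_layer * n_embd ** 2

# Defaults from the config above.
N = non_embedding_params(n_layer=12, n_embd=768)

# Assumed naming scheme; it happens to match the run name seen in the log.
run_name = f"N-{N:.2e}"

# Number of OWT training tokens used when the full dataset is kept.
fraction_of_data = 1.0
D = int(fraction_of_data * 9035582198)

print(N, run_name, D)
```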
# -----------------------------------------------------------------------------
config_keys = [k for k,v in globals().items() if not k.startswith('_') and isinstance(v, (int, float, bool, str))]
print(f'zzt-learning_rate2:{learning_rate}')
exec(open('configurator.py').read()) # overrides from command line or config file
print(f'zzt-learning_rate3:{learning_rate}')
config = {k: globals()[k] for k in config_keys} # will be useful for logging
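For readers who have not seen this override pattern, the following is a rough sketch of what a `--key=value` override step could look like. It is not the actual `configurator.py` (which is not shown in this thread); `apply_overrides` is a hypothetical helper, and the only behavior borrowed from the log is the `Overriding: key = value` message.

```python
import ast

def apply_overrides(config, args):
    """Return a copy of config with --key=value overrides applied (hypothetical helper)."""
    updated = dict(config)
    for arg in args:
        if not arg.startswith('--') or '=' not in arg:
            continue  # this sketch ignores anything that is not --key=value
        key, raw = arg[2:].split('=', 1)
        if key not in updated:
            raise ValueError(f"Unknown config key: {key}")
        try:
            value = ast.literal_eval(raw)  # parses True, 12, 6e-3, 'text', ...
        except (SyntaxError, ValueError):
            value = raw  # not a Python literal, keep the raw string
        print(f"Overriding: {key} = {value}")
        updated[key] = value
    return updated

defaults = {'scale_N': False, 'n_layer': 12, 'n_embd': 768, 'learning_rate': 6e-3}
overridden = apply_overrides(defaults, ['--scale_N=True', '--n_layer=12', '--n_embd=768'])
```

Printing a value before and after the override step, as the `zzt-learning_rate2` / `zzt-learning_rate3` lines do, is a reasonable way to confirm which settings actually changed.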
# -----------------------------------------------------------------------------
# various inits, derived attributes, I/O setup
ddp = int(os.environ.get('RANK', -1)) != -1 # is this a ddp run?
if ddp:
    init_process_group(backend=backend)
    ddp_rank = int(os.environ['RANK'])
    ddp_local_rank = int(os.environ['LOCAL_RANK'])
    ddp_world_size = int(os.environ['WORLD_SIZE'])
    device = f'cuda:{ddp_local_rank}'
    torch.cuda.set_device(device)
    master_process = ddp_rank == 0 # this process will do logging, checkpointing etc.
    seed_offset = ddp_rank # each process gets a different seed
    # world_size number of processes will be training simultaneously, so we can scale
else:
    # if not ddp, we are running on a single gpu, and one process
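The bodies that follow the two comments above did not survive the paste. The sketch below shows one plausible shape for this setup, assuming the usual nanoGPT-style convention: the gradient accumulation steps are divided across processes under DDP, and the single-process branch sets `master_process`, `seed_offset` and `ddp_world_size` to trivial values. `ddp_settings` is a hypothetical helper that takes the environment as a plain dict so it can be tried without a GPU.

```python
def ddp_settings(env, gradient_accumulation_steps):
    """Derive per-process settings from torchrun-style environment variables."""
    ddp = int(env.get('RANK', -1)) != -1  # is this a ddp run?
    if ddp:
        rank = int(env['RANK'])
        local_rank = int(env['LOCAL_RANK'])
        world_size = int(env['WORLD_SIZE'])
        # Assumed step: split the accumulation steps evenly across processes.
        assert gradient_accumulation_steps % world_size == 0
        return {
            'ddp': True,
            'device': f'cuda:{local_rank}',
            'master_process': rank == 0,
            'seed_offset': rank,
            'ddp_world_size': world_size,
            'gradient_accumulation_steps': gradient_accumulation_steps // world_size,
        }
    # Assumed single-process defaults.
    return {
        'ddp': False,
        'device': 'cuda',
        'master_process': True,
        'seed_offset': 0,
        'ddp_world_size': 1,
        'gradient_accumulation_steps': gradient_accumulation_steps,
    }

single = ddp_settings({}, 32)
multi = ddp_settings({'RANK': '3', 'LOCAL_RANK': '3', 'WORLD_SIZE': '8'}, 32)
```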
tokens_per_iter = gradient_accumulation_steps * ddp_world_size * batch_size * block_size
print(f"tokens per iteration will be: {tokens_per_iter:,}")
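To make the arithmetic concrete, here is a small worked example. It uses batch_size = 16 and block_size = 1024 from config/scale_gpt.py and assumes 8 processes (the log prints the parameter count eight times) with the 32 accumulation steps split evenly, leaving 4 per process. Both the process count and the split are inferences, not values confirmed in this thread.

```python
def tokens_per_iter(grad_accum_per_process, world_size, batch_size, block_size):
    """Tokens consumed by one optimizer step across all processes."""
    return grad_accum_per_process * world_size * batch_size * block_size

# Assumed: 8 processes, 32 accumulation steps split evenly (4 per process).
total = tokens_per_iter(grad_accum_per_process=4, world_size=8,
                        batch_size=16, block_size=1024)

# Sequences per optimizer step, to compare with the "total batch size = 512" note.
sequences = total // 1024

print(f"tokens per iteration will be: {total:,}")
```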
if master_process:
    os.makedirs(out_dir, exist_ok=True)
torch.manual_seed(1337 + seed_offset)
torch.backends.cuda.matmul.allow_tf32 = True # allow tf32 on matmul
torch.backends.cudnn.allow_tf32 = True # allow tf32 on cudnn
device_type = 'cuda' if 'cuda' in device else 'cpu' # for later use in torch.autocast
# note: float16 data type will automatically use a GradScaler
ptdtype = {'float32': torch.float32, 'bfloat16': torch.bfloat16, 'float16': torch.float16}[dtype]
ctx = nullcontext() if device_type == 'cpu' else torch.amp.autocast(device_type=device_type, dtype=ptdtype)
# poor man's data loader
data_dir = os.path.join('data', dataset)
train_data = np.memmap(os.path.join(data_dir, 'train.bin'), dtype=np.uint16, mode='r')
if scaling == 'Kaplan' and scale_D: # use a fraction of dataset if scaling with dataset size following Kaplan et al
    # train_data = train_data[:D] zzt
val_data = np.memmap(os.path.join(data_dir, 'val.bin'), dtype=np.uint16, mode='r')
def get_batch(split):
    # print(f'zzt-batch_size:{batch_size}')
# init these up here, can override if init_from='resume' (i.e. from a checkpoint)
iter_num = 0
best_val_loss = 1e9
# attempt to derive vocab_size from the dataset
meta_path = os.path.join(data_dir, 'meta.pkl')
meta_vocab_size = None
if os.path.exists(meta_path):
    with open(meta_path, 'rb') as f:
        meta = pickle.load(f)
    meta_vocab_size = meta['vocab_size']
    print(f"found vocab_size = {meta_vocab_size} (inside {meta_path})")
# model init
model_args = dict(num_hidden_layers=n_layer, num_attention_heads=n_head, hidden_size=n_embd, intermediate_size=n_intermediate_size, max_position_embeddings=block_size, attention_bias=bias, vocab_size=None, attention_dropout=dropout) # start with model_args from command line
if init_from == 'scratch':
    # init a new model from scratch
elif init_from == 'resume':
    print(f"Resuming training from {out_dir}")
    # resume training from a checkpoint.
elif init_from.startswith('gpt2'):
    print(f"Initializing from OpenAI GPT-2 weights: {init_from}")
    # initialize from OpenAI GPT-2 weights
# crop down the model block size if desired, using model surgery
if block_size < model.config.max_position_embeddings:
    model.crop_block_size(block_size)
    model_args['max_position_embeddings'] = block_size # so that the checkpoint will have the right value
model.to(device)
# initialize a GradScaler. If enabled=False scaler is a no-op
scaler = torch.cuda.amp.GradScaler(enabled=(dtype == 'float16'))
# optimizer
optimizer = model.configure_optimizers(weight_decay, learning_rate, (beta1, beta2), device_type)
if init_from == 'resume':
    optimizer.load_state_dict(checkpoint['optimizer'])
checkpoint = None # free up memory
# compile the model
if compile:
    print("compiling the model... (takes a ~minute)")
    unoptimized_model = model
    model = torch.compile(model) # requires PyTorch 2.0
# wrap model into DDP container
if ddp:
    model = DDP(model, device_ids=[ddp_local_rank], find_unused_parameters=True)
# helps estimate an arbitrarily accurate loss over either split using many batches
@torch.no_grad()
def estimate_loss():
    out = {}
    model.eval()
    for split in ['train', 'val']:
        losses = torch.zeros(eval_iters)
        for k in range(eval_iters):
            X, Y = get_batch(split)
            with ctx:
                # print('zzt:ctx')
# learning rate decay scheduler (cosine with warmup)
def get_lr(it):
    # 1) linear warmup for warmup_iters steps
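The body of `get_lr` is cut off in the paste. As an illustration of a "cosine with warmup" schedule, here is a minimal sketch using the values from config/scale_gpt.py (learning_rate = 6e-3, warmup_iters = 3000, lr_decay_iters = 2.5e5, min_lr = 0). Steps 2 and 3 are assumptions about how the missing part works, not a copy of it, and the settings are passed as keyword arguments only to keep the example self-contained.

```python
import math

def get_lr(it, learning_rate=6e-3, min_lr=0.0,
           warmup_iters=3000, lr_decay_iters=250_000):
    """Cosine learning rate schedule with linear warmup (illustrative sketch)."""
    # 1) linear warmup for warmup_iters steps
    if it < warmup_iters:
        return learning_rate * it / warmup_iters
    # 2) assumed: past the decay horizon, stay at the minimum learning rate
    if it > lr_decay_iters:
        return min_lr
    # 3) assumed: cosine decay from learning_rate down to min_lr in between
    decay_ratio = (it - warmup_iters) / (lr_decay_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))  # goes from 1 to 0
    return min_lr + coeff * (learning_rate - min_lr)

# Learning rate at a few points of the schedule.
schedule = {it: get_lr(it) for it in (0, 1500, 3000, 126_500, 250_000)}
```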
# logging
if wandb_log and master_process:
    import wandb
    if wandb_run_id: # resume a previous run with id=wandb_run_id
        wandb.init(project=wandb_project, id=wandb_run_id, resume="must")
    else:
        wandb.init(project=wandb_project, name=wandb_run_name, config=config)
# training loop
X, Y = get_batch('train') # fetch the very first batch
t0 = time.time()
local_iter_num = 0 # number of iterations in the lifetime of this process
raw_model = model.module if ddp else model # unwrap DDP container if needed
running_mfu = -1.0
while True:

if ddp:
    destroy_process_group()
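Since the log above already reports nan at step 0 and keeps running, a guard that stops at the first non-finite loss can save GPU time while debugging. The sketch below is only a suggestion: `check_loss_finite` is a hypothetical helper, and the loss values in the example are made up. In a real loop it would be called on the Python float obtained from the loss tensor.

```python
import math

def check_loss_finite(iter_num, loss_value):
    """Raise as soon as the loss is NaN or infinite; otherwise return it unchanged."""
    if not math.isfinite(loss_value):
        raise FloatingPointError(f"non-finite loss {loss_value} at iter {iter_num}")
    return loss_value

# Example with made-up loss values; the last one mimics the nan in the log.
stopped_at = None
for step, value in enumerate([10.9, 10.7, float('nan')]):
    try:
        check_loss_finite(step, value)
    except FloatingPointError as err:
        print(f"stopping: {err}")
        stopped_at = step
        break
```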