In the Quantized Bonito Tutorial, the following error occurred in the last step:
Fetching 10 files: 100% 10/10 [00:00<00:00, 415.21it/s]
Replacing layers...: 100%|██████████| 32/32 [00:10<00:00, 3.06it/s]
Fusing layers...: 100%|██████████| 32/32 [00:00<00:00, 92.50it/s]
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:2 for open-end generation.
AssertionError                            Traceback (most recent call last)
in ()
     12 # Generate synthetic instruction tuning dataset
     13 sampling_params = {'max_new_tokens':256, 'top_p':0.95, 'temperature':0.5, 'num_return_sequences':1}
---> 14 synthetic_dataset = bonito.generate_tasks(
     15     unannotated_text,
     16     context_col="input",

18 frames
/usr/local/lib/python3.10/dist-packages/awq/modules/fused/norm.py in forward(self, x)
     17
     18     def forward(self, x):
---> 19         assert AWQ_INSTALLED, (
     20             "AWQ kernels could not be loaded. "
     21             "Please install them from https://github.com/casper-hansen/AutoAWQ_kernels"

AssertionError: AWQ kernels could not be loaded. Please install them from https://github.com/casper-hansen/AutoAWQ_kernels
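A likely fix, based on the assertion message, is to install the prebuilt AWQ kernels from the linked repository. The package name `autoawq-kernels` is an assumption taken from the casper-hansen/AutoAWQ_kernels project; the exact wheel you need depends on your CUDA and torch versions:

```shell
# Assumed fix: install the prebuilt AWQ CUDA kernels the assertion asks for,
# then reinstall AutoAWQ so it detects them on import.
pip install autoawq-kernels
pip install autoawq
```

After installing, restart the runtime before re-running the generation cell so the kernels are picked up on import.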