Closed sxk000 closed 6 months ago
Command: NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero3
After switching zero3 to zero2 it runs normally. The exact command is:
Command: NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero2
@sxk000 Yes! QLoRA is not yet compatible with zero3. Consider switching QLoRA -> LoRA, or zero3 -> zero2.
internlm2_20b_qlora_msagent_react_e3_gpu8
The config above uses QLoRA training. For full-parameter training:
1. The script needs the following changes:
#######################################################################
# PART 2 Model & Tokenizer #
#######################################################################
tokenizer = dict(
    type=AutoTokenizer.from_pretrained,
    pretrained_model_name_or_path=pretrained_model_name_or_path,
    trust_remote_code=True,
    padding_side='right')
# model = dict(
#     type=SupervisedFinetune,
#     llm=dict(
#         type=AutoModelForCausalLM.from_pretrained,
#         pretrained_model_name_or_path=pretrained_model_name_or_path,
#         trust_remote_code=True,
#         torch_dtype=torch.float16,
#         quantization_config=dict(
#             type=BitsAndBytesConfig,
#             load_in_4bit=True,
#             load_in_8bit=False,
#             llm_int8_threshold=6.0,
#             llm_int8_has_fp16_weight=False,
#             bnb_4bit_compute_dtype=torch.float16,
#             bnb_4bit_use_double_quant=True,
#             bnb_4bit_quant_type='nf4')),
#     lora=dict(
#         type=LoraConfig,
#         r=64,
#         lora_alpha=16,
#         lora_dropout=0.1,
#         bias='none',
#         task_type='CAUSAL_LM'))
model = dict(
    type=SupervisedFinetune,
    llm=dict(
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path=pretrained_model_name_or_path,
        trust_remote_code=True,
        torch_dtype=torch.float16))
2. Command: NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero3
This runs normally.
OK, thanks!
@LZHgrla Hello!
After downloading the sample data to a local folder and loading it from there for training, I get an error:
# Data
# data_path = 'damo/MSAgent-Bench'
data_path = '/apply/data/finetune/MSAgent-Bench'
Command: NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero3
Full error log: 报错日志313.log
Excerpt from the error log:
2024-03-13 09:38:48,013 - modelscope - INFO - PyTorch version 2.2.1 Found.
2024-03-13 09:38:48,013 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-03-13 09:38:48,038 - modelscope - INFO - Loading done! Current index file version is 1.13.0, with md5 53769789b84c28d4871ea93addb53e8a and a total number of 972 components indexed
03/13 09:38:48 - mmengine - INFO - xtuner_dataset_timeout = 0:30:00
Map (num_proc=32): 0%| | 0/598185 [00:00<?, ? examples/s]
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 623, in _write_generator_to_queue
for i, result in enumerate(func(**kwargs)):
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3458, in _map_single
example = apply_function_on_filtered_inputs(example, i, offset=offset)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3361, in apply_function_on_filtered_inputs
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/xtuner/dataset/map_fns/dataset_map_fns/msagent_map_fn.py", line 54, in msagent_react_map_fn
text = eval(example['conversations'])
TypeError: eval() arg 1 must be a string, bytes or code object
"""
The above exception was the direct cause of the following exception:
How should I resolve this?
Thanks!
@sxk000 Hi
This PR will fix the issue:
470
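The traceback above shows eval() receiving something that is not a string, which suggests that a locally loaded copy of the dataset may already hold `conversations` as a parsed list. A minimal sketch of a tolerant parser follows. `parse_conversations` is a hypothetical helper, not the code in the PR, and it assumes the string form is a JSON or Python literal.

```python
import ast
import json


def parse_conversations(raw):
    """Return the conversation turns, whatever form `raw` takes.

    Hypothetical helper: accepts an already-parsed list, or a string
    (or bytes) holding a JSON or Python literal.
    """
    if isinstance(raw, list):
        # Already parsed, e.g. loaded from a local JSON file.
        return raw
    if isinstance(raw, (str, bytes)):
        text = raw.decode() if isinstance(raw, bytes) else raw
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            # Fall back to a Python literal; literal_eval never runs code.
            return ast.literal_eval(text)
    raise TypeError(f'unsupported conversations type: {type(raw).__name__}')
```

Using `ast.literal_eval` instead of `eval` also avoids executing arbitrary code that might be embedded in a data file.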
Hello!
After applying the fix from the PR above, I get the following error:
after_run:
(BELOW_NORMAL) LoggerHook
--------------------
2024-03-13 14:10:37,942 - modelscope - INFO - PyTorch version 2.2.1 Found.
2024-03-13 14:10:37,942 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-03-13 14:10:38,078 - modelscope - INFO - Loading done! Current index file version is 1.13.0, with md5 53769789b84c28d4871ea93addb53e8a and a total number of 972 components indexed
03/13 14:10:38 - mmengine - INFO - xtuner_dataset_timeout = 0:30:00
Map (num_proc=32): 2%|█▎ | 9450/598185 [00:01<01:31, 6446.07 examples/s]
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 623, in _write_generator_to_queue
for i, result in enumerate(func(**kwargs)):
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3458, in _map_single
example = apply_function_on_filtered_inputs(example, i, offset=offset)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3361, in apply_function_on_filtered_inputs
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/xtuner/dataset/map_fns/dataset_map_fns/msagent_map_fn.py", line 67, in msagent_react_map_fn
api_dict[obj['name']] = obj['description']
TypeError: unhashable type: 'list'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
Full error log file: 报错日志313.log
How should I resolve this?
One moment, let me check @sxk000
An earlier PR added extra handling for dirty data, which in turn caused this error.
I have updated the code in https://github.com/InternLM/xtuner/pull/470, which should resolve the problem.
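The second traceback fails at `api_dict[obj['name']] = obj['description']` because some `name` value is a list, and a list cannot be a dict key. As a rough illustration of skipping such dirty entries (the helper name `build_api_dict` and the skip policy are assumptions, not the logic merged in the PR):

```python
def build_api_dict(api_list):
    """Map API name -> description, skipping malformed entries.

    Hypothetical helper: an entry is kept only if it is a dict whose
    'name' is a string and which has a 'description' key.
    """
    api_dict = {}
    for obj in api_list:
        if not isinstance(obj, dict):
            continue
        name = obj.get('name')
        if not isinstance(name, str) or 'description' not in obj:
            # e.g. name is a list, which is unhashable
            continue
        api_dict[name] = obj['description']
    return api_dict
```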
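@LZHgrla Hello!
With the updated code the script now runs!
But when I swap in my own data, the script fails:
Part of my own training data (10 samples): train.json
Full error log: 报错日志313.log
Excerpt from the error log: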
return cls(*args, begin=begin, end=end, by_epoch=by_epoch, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py", line 787, in build_iter_from_epoch
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/lr_scheduler.py", line 20, in __init__
return cls(*args, begin=begin, end=end, by_epoch=by_epoch, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/lr_scheduler.py", line 20, in __init__
PARAM_SCHEDULERS.build(
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
super().__init__(optimizer, 'lr', *args, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py", line 759, in __init__
super().__init__(optimizer, 'lr', *args, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py", line 759, in __init__
return cls(*args, begin=begin, end=end, by_epoch=by_epoch, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/lr_scheduler.py", line 20, in __init__
return cls(*args, begin=begin, end=end, by_epoch=by_epoch, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/lr_scheduler.py", line 20, in __init__
super().__init__(optimizer, 'lr', *args, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py", line 759, in __init__
return cls(*args, begin=begin, end=end, by_epoch=by_epoch, **kwargs) return self.build_func(cfg, *args, **kwargs, registry=self)
super().__init__(optimizer, 'lr', *args, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/lr_scheduler.py", line 20, in __init__
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 294, in build_scheduler_from_cfg
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py", line 759, in __init__
super().__init__(
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py", line 68, in __init__
super().__init__(
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py", line 68, in __init__
super().__init__(optimizer, 'lr', *args, **kwargs)
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/optim/scheduler/param_scheduler.py", line 759, in __init__
raise ValueError('end should be larger than begin, but got'
ValueErrorraise ValueError('end should be larger than begin, but got':
end should be larger than begin, but got begin=0, end=0
ValueErrorreturn scheduler_cls.build_iter_from_epoch( # type: ignore:
end should be larger than begin, but got begin=0, end=0
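How should I resolve this?
Thanks a lot!
@sxk000 The dataset is too small to cover the warmup iterations, which is why this error is raised. Adding more data fixes it, or you can disable warmup as in the diff below: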
 param_scheduler = [
-    dict(
-        type=LinearLR,
-        start_factor=1e-5,
-        by_epoch=True,
-        begin=0,
-        end=warmup_ratio * max_epochs,
-        convert_to_iter_based=True),
     dict(
         type=CosineAnnealingLR,
         eta_min=0.0,
         by_epoch=True,
-        begin=warmup_ratio * max_epochs,
+        begin=0,
         end=max_epochs,
         convert_to_iter_based=True)
 ]
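To see why ten samples can produce `begin=0, end=0`, here is a back-of-the-envelope sketch. It assumes the epoch-based `end` is multiplied by the iterations per epoch and truncated to an integer. The per-device batch size of 1 and `max_epochs = 3` are illustrative values, not taken from the reporter's config.

```python
import math


def warmup_end_iter(num_samples, world_size, batch_size, max_epochs,
                    warmup_ratio):
    """Estimate the iteration at which linear warmup would end.

    Assumption: end_iter = int(warmup_ratio * max_epochs * iters_per_epoch).
    """
    iters_per_epoch = math.ceil(num_samples / (world_size * batch_size))
    return int(warmup_ratio * max_epochs * iters_per_epoch)


# 10 samples, 8 GPUs, batch size 1 per device, 3 epochs, 3% warmup.
print(warmup_end_iter(10, 8, 1, 3, 0.03))  # 0, so the warmup range is empty
```

With so few iterations per epoch the warmup window rounds down to zero iterations, which matches the `end should be larger than begin` message.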
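With the change above it now runs! Thanks again for your patient help!
@LZHgrla
Hello!
When I run the script with my newly prepared data, I get a new error: KeyError: 'Column length not in the dataset. Current columns in the dataset: []'
Sample of my data: 数据样例.json
Error log: 报错日志.log
Excerpt from the error log: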
Traceback (most recent call last):
File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/xtuner/tools/train.py", line 307, in
How should I resolve this?
Thanks a lot!
@sxk000 The samples may be too long and exceed the max_length limit, so all of them were dropped. Try increasing max_length in the config.
max_length
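I have already adjusted this: max_length = 8192
My samples average about 3,000 characters, so they do not exceed the limit.
@sxk000
Trace through this code to see at what point the dataset gets emptied.
Hello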
print('---------- 1 ----------')
# Extract the useful data for training from the original dataset.
if dataset_map_fn is not None:
    print('---------- 2 ----------')
    dataset = map_dataset(dataset, dataset_map_fn, map_num_proc)
    print('---------- 3 ----------')
print('---------- 4 ----------')
# Add prompt template, such as <|System|>: xxx <|User|>: xxx <|Bot|>: xxx
if template_map_fn is not None:
    print('---------- 5 ----------')
    dataset = add_template_to_dataset(dataset, template_map_fn,
                                      map_num_proc)
    print('---------- 6 ----------')
print('---------- 7 ----------')
for old, new in rename_maps:
    print('---------- 8 ----------')
    dataset = dataset.rename_column(old, new)
    print('---------- 9 ----------')
print('---------- 10 ----------')
Steps 8 and 9 were not reached; all the others were.
@sxk000 Could you add print(dataset) calls
and check the output?
Here is the code with the tracing added:
print('---------- 1 ----------')
print(dataset)
# Extract the useful data for training from the original dataset.
if dataset_map_fn is not None:
    print('---------- 2 ----------')
    dataset = map_dataset(dataset, dataset_map_fn, map_num_proc)
    print(dataset)
    print('---------- 3 ----------')
print('---------- 4 ----------')
# Add prompt template, such as <|System|>: xxx <|User|>: xxx <|Bot|>: xxx
if template_map_fn is not None:
    print('---------- 5 ----------')
    dataset = add_template_to_dataset(dataset, template_map_fn,
                                      map_num_proc)
    print(dataset)
    print('---------- 6 ----------')
print('---------- 7 ----------')
for old, new in rename_maps:
    print('---------- 8 ----------')
    dataset = dataset.rename_column(old, new)
    print(dataset)
    print('---------- 9 ----------')
print(dataset)
print('---------- 10 ----------')
Here is the output:
---------- 1 ----------
Dataset({
features: ['conversations'],
num_rows: 100
})
---------- 2 ----------
Dataset({
features: ['conversations', 'conversation'],
num_rows: 100
})
---------- 3 ----------
---------- 4 ----------
---------- 5 ----------
Dataset({
features: ['conversations', 'conversation'],
num_rows: 0
})
---------- 6 ----------
---------- 7 ----------
Dataset({
features: ['conversations', 'conversation'],
num_rows: 0
})
---------- 10 ----------
It is most likely L61 of add_template_to_dataset that filtered out all the data.
Please check why no sample passes the len(example['conversation']) > 0 check.
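The effect described here can be reproduced without any library. The sketch below imitates the map-then-filter sequence on plain lists. `fake_map_fn` and `keep_non_empty` are hypothetical stand-ins, not xtuner functions, and they assume the map function falls back to an empty conversation when a sample cannot be parsed.

```python
def fake_map_fn(example):
    """Stand-in map function (hypothetical).

    Returns an empty conversation when the sample is not a non-empty list.
    """
    turns = example.get('conversations')
    if not isinstance(turns, list) or not turns:
        return {'conversation': []}
    return {'conversation': [{'input': t['value'], 'output': ''}
                             for t in turns if t.get('from') == 'user']}


def keep_non_empty(examples):
    """Keep only samples whose 'conversation' list is non-empty."""
    return [ex for ex in examples if len(ex['conversation']) > 0]


samples = [{'conversations': 'not a list'}, {'conversations': None}]
mapped = [{**ex, **fake_map_fn(ex)} for ex in samples]
print(len(mapped), len(keep_non_empty(mapped)))  # 2 0
```

The columns survive the map step, but every row is dropped by the filter, which is the same shape as the `num_rows: 0` output shown above.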
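That is indeed where the data gets dropped. With the msagent-bench sample data everything works, but with my own data it fails, so there is probably a problem with how my CoT corpus is organized. Do you have a data format template, or could you help check what is wrong with our data? Thanks a lot!
msagent-bench sample data: 一条msagent-bench样例数据.json
Log of a normal run with one msagent-bench sample: 一条msagent-bench正常日志.log
Sample of my own data: 一条自己整理的数据.json
Error log from running my own data: 一条自己整理的数据报错日志.log
Excerpt from that error log: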
num_proc must be <= 1. Reducing num_proc to 1 for dataset of size 1.
Map: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 208.43 examples/s]
num_proc must be <= 1. Reducing num_proc to 1 for dataset of size 1.
Filter: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 417.43 examples/s]
Dataset({
features: ['conversations', 'conversation'],
num_rows: 0
})
Traceback (most recent call last):
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/xtuner/tools/train.py", line 307, in <module>
main()
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/xtuner/tools/train.py", line 303, in main
runner.train()
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1160, in train
self._train_loop = self.build_train_loop(
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 958, in build_train_loop
loop = LOOPS.build(
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/xtuner/engine/runner/loops.py", line 32, in __init__
dataloader = runner.build_dataloader(
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 824, in build_dataloader
dataset = DATASETS.build(dataset_cfg)
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/xtuner/dataset/modelscope.py", line 16, in process_ms_dataset
return process_hf_dataset(dataset, *args, **kwargs)
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/xtuner/dataset/huggingface.py", line 330, in process_hf_dataset
dataset = process(**kwargs)
File "/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/xtuner/dataset/huggingface.py", line 201, in process
print(dataset['conversation'][0])
IndexError: list index out of range
[rank1]:[E ProcessGroupGloo.cpp:144] Rank 1 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank.
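Thanks!
@sxk000 Just design the data format to follow msagent-bench.
Most likely some error was triggered while msagent_map_fn processed the custom data, leading to return {'conversation': []}
Please trace the execution of this line carefully: https://github.com/InternLM/xtuner/blob/a115e55e837b806708b1eb3ebb774fa5f943cc5d/xtuner/dataset/huggingface.py#L50
You can set num_proc to 1 to make debugging easier.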
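One way to follow this advice is to call the map function directly on a few samples in the main process, so that exceptions and empty results are visible immediately. The sketch below is a generic debugging aid. `debug_map_fn` is a hypothetical helper, and `map_fn` stands for whichever map function the config uses.

```python
import traceback


def debug_map_fn(map_fn, examples, limit=5):
    """Run `map_fn` on a few samples in-process and report problems.

    Hypothetical helper: `map_fn` takes one example dict and returns a
    dict with a 'conversation' list. Returns (index, reason) pairs.
    """
    problems = []
    for idx, example in enumerate(examples[:limit]):
        try:
            result = map_fn(example)
        except Exception:
            # Show the full traceback instead of letting a worker hide it.
            traceback.print_exc()
            problems.append((idx, 'raised'))
            continue
        if not result.get('conversation'):
            problems.append((idx, 'empty'))
    return problems
```

An `'empty'` entry points at a sample for which the map function silently fell back to an empty conversation, which is the case worth stepping through with a debugger.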
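I got this msagent_react fine-tuning script working with my own data; it was a data format problem. Here are the script, a data sample, and the command:
Script (rename .txt to .py): internlm2_20b_qlora_msagent_react_e3_gpu8.txt
One sample of my own data: sft.json
Command: NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero3
Thanks!
Hello!
Recently, when fine-tuning the model with CoT-format data, training finishes without printing any loss logs and without any error. What could be the problem?
Full-parameter fine-tuning command: NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero3
Training parameters: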
use_varlen_attn = False
prompt_template = PROMPT_TEMPLATE.default
max_length = 8192
pack_to_max_length = False
# Scheduler & Optimizer
batch_size = 4 # per_device
accumulative_counts = 1
dataloader_num_workers = 2
max_epochs = 10
optim_type = AdamW
lr = 2e-5
betas = (0.9, 0.999)
weight_decay = 0
max_norm = 1 # grad clip
warmup_ratio = 0.03
# Save
save_steps = 500
save_total_limit = 1 # Maximum checkpoints to keep (-1 means unlimited)
# Evaluate the generation performance during the training
evaluation_freq = 500
One sample of my own CoT data: COT语料样例.json
Full training log: cot训练日志.txt
Log excerpt:
NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero3 > log_sft/425sft-329pt-.log
[2024-04-25 14:52:22,496] torch.distributed.run: [WARNING]
[2024-04-25 14:52:22,496] torch.distributed.run: [WARNING] *****************************************
[2024-04-25 14:52:22,496] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[2024-04-25 14:52:22,496] torch.distributed.run: [WARNING] *****************************************
/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/utils/dl_utils/setup_env.py:56: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
2024-04-25 14:52:30,789 - modelscope - INFO - PyTorch version 2.2.1 Found.
2024-04-25 14:52:30,790 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-04-25 14:52:30,815 - modelscope - INFO - Loading done! Current index file version is 1.13.1, with md5 c03f3a22f842d6202e8e0127a31e0e2b and a total number of 972 components indexed
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:32<00:00, 1.54s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:32<00:00, 1.54s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:32<00:00, 1.54s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:32<00:00, 1.54s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:32<00:00, 1.54s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:32<00:00, 1.54s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:32<00:00, 1.54s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:45<00:00, 2.17s/it]
/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/torch/nn/modules/module.py:1877: UserWarning: Positional args are being deprecated, use kwargs instead. Refer to https://pytorch.org/docs/master/generated/torch.nn.Module.html#torch.nn.Module.state_dict for details.
warnings.warn(
Model files after training:
ll work_dirs/internlm2_20b_qlora_msagent_react_e3_gpu8/
total 48
drwxr-xr-x 3 root root 49 Apr 25 14:52 20240425_145227
-rw-r--r-- 1 root root 8677 Apr 25 14:52 internlm2_20b_qlora_msagent_react_e3_gpu8.py
drwxr-xr-x 2 root root 4096 Apr 25 15:01 iter_40.pth
-rw-r--r-- 1 root root 85 Apr 25 15:03 last_checkpoint
-rwxr--r-- 1 root root 25314 Apr 25 15:03 zero_to_fp32.py
Is this the right way to train CoT capability? Is this training run normal? Why is the loss log no longer printed during training?
Thanks!
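@LZHgrla Could you take a look when you have a moment? Thanks!
@sxk000 It looks like the small dataset leads to very few total iterations, and the logging interval exceeds the total number of iterations, so nothing is logged.
You can check the interval parameter in the config you are using.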
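The hypothesis above can be checked with simple arithmetic. The sketch below assumes one log line every `interval` iterations and ceil(samples / (GPUs × per-device batch)) iterations per epoch. The sample count of 100 is illustrative, not a figure from this thread.

```python
import math


def logged_iters(num_samples, world_size, batch_size, max_epochs, interval):
    """List the iterations at which a log line would be expected.

    Assumption: logging happens every `interval` iterations, counted
    from 1, until the last training iteration.
    """
    iters_per_epoch = math.ceil(num_samples / (world_size * batch_size))
    total_iters = iters_per_epoch * max_epochs
    return list(range(interval, total_iters + 1, interval))


# 100 samples, 8 GPUs, batch size 4 per device, 10 epochs.
print(logged_iters(100, 8, 4, 10, interval=10))  # [10, 20, 30, 40]
print(logged_iters(100, 8, 4, 10, interval=50))  # []
```

If the second case applies, lowering the logging interval would make loss lines appear; if the first case applies, the cause lies elsewhere.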
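Hello, thanks for the reply!
It is not about the logging interval; the interval parameter in my config matches what you described. The problem is that no per-batch loss is printed during training, as in the figure below:
@sxk000 Your training command looks problematic: QLoRA does not support DeepSpeed zero3 training, so please use zero2. As for the training log (the loss you mention) not being printed, please post your config so we can take a look.
@LZHgrla I am using full-parameter training. Config file: internlm2_20b_qlora_msagent_react_e3_gpu8.txt. It seems .py files cannot be uploaded here, so please rename the .txt file to .py.
@sxk000 Could you post the complete training log?
You can wrap it in ``` and paste it directly into the post.
Pasting it exceeds the maximum character limit; I tried that before. Full training log: cot训练日志.txt One CoT sample I prepared: COT语料样例.json
One CoT sample I prepared: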
{
"conversations": [
{
"from": "system",
"value": "你有多种能力,可以通过插件集成的模型api来回复用户的问题,还能解答用户使用模型遇到的问题和模型知识相关问答。\n目前支持的插件信息如下,请自行判断是否需要调用插件来解决当前用户问题。\n若需要调用插件,则需要将插件调用请求按照json格式给出,必须包含plugin_name、parameters字段,对于api的调用过程中,在思考及调用之前使用作为标志,在思考和调用结束后使用,对于调用过程中用于向api传递参数的json字段,需要使用```json作为开始的标志,```作为结束的标志。\n1. {\"plugin_name\":\"ST-0000002801_ID-0000002387\",\"description\":\"针对用户输入的不锈钢工业焊接圆管市场价格查询信息,识别里面的元素,包括 品种、标准、材质、规格、指标、地区等\",\"parameters\":[{\"name\":\"date\",\"description\":\"时间或日期\"},{\"name\":\"breed_name\",\"description\":\"品种\",\"required\":true},{\"name\":\"standard_name\",\"description\":\"标准\"},{\"name\":\"mq_name\",\"description\":\"材质\"},{\"name\":\"sc_name\",\"description\":\"规格\"},{\"name\":\"metric_name\",\"description\":\"价格类型\",\"required\":true},{\"name\":\"area_name\",\"description\":\"地区\"}]}"
},
{
"from": "user",
"value": "我想了解一下不锈钢工业焊接圆管的价格"
},
{
"from": "assistant",
"value": "您好,您想了解不锈钢工业焊接圆管的价格,请问您需要哪种规格的圆管呢?需要调用ST-0000002801_ID-0000002387插件来查询不锈钢工业焊接圆管的价格\\n\\n调用请求如下:\\n{\"plugin_name\": \"ST-0000002801_ID-0000002387\", \"parameters\": {\"date\": \"2024年\", \"breed_name\": \"不锈钢工业焊接圆管\", \"metric_name\": \"市场价格\"}}"
},
{
"from": "user",
"value": "Φ19*1的"
},
{
"from": "assistant",
"value": "明白了。您需要了解的是哪个地区的市场价格呢?需要调用ST-0000002801_ID-0000002387插件来查询Φ19*1的不锈钢工业焊接圆管的市场价格是多少\\n\\n调用请求如下:\\n{\"plugin_name\": \"ST-0000002801_ID-0000002387\", \"parameters\": {\"date\": \"2024年\", \"sc_name\": \"Φ19*1\", \"breed_name\": \"不锈钢工业焊接圆管\", \"metric_name\": \"市场价格\"}}"
},
{
"from": "user",
"value": "太原市的"
},
{
"from": "assistant",
"value": "好的,您想了解的是太原维太新材科技有限公司的444毛坯不锈钢工业焊接圆管,Φ19*1在太原市的市场价格吗?需要调用ST-0000002801_ID-0000002387插件来查询太原市的工业焊接圆管市场价格是多少\\n\\n调用请求如下:\\n{\"plugin_name\": \"ST-0000002801_ID-0000002387\", \"parameters\": {\"date\": \"2024年\", \"area_name\": \"太原市\", \"breed_name\": \"工业焊接圆管\", \"metric_name\": \"市场价格\"}}"
},
{
"from": "user",
"value": "是的,1981年7月5日的价格"
},
{
"from": "assistant",
"value": "好的,您需要调用ST-0000002801_ID-0000002387插件来查询1981年7月5日太原维太新材科技有限公司444毛坯不锈钢工业焊接圆管,Φ19*1,在太原市的市场价格。调用请求如下:需要调用ST-0000002801_ID-0000002387插件来查询1981年7月5日的价格\\n\\n调用请求如下:\\n{\"plugin_name\": \"ST-0000002801_ID-0000002387\", \"parameters\": {\"date\": \"1981年7月5日\", \"breed_name\": \"不锈钢工业焊接圆管\", \"metric_name\": \"市场价格\"}}"
}
]
}
@LZHgrla Full log:
NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero3 > log_sft/425sft-329pt-.log
[2024-04-25 14:52:22,496] torch.distributed.run: [WARNING]
[2024-04-25 14:52:22,496] torch.distributed.run: [WARNING] *****************************************
[2024-04-25 14:52:22,496] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[2024-04-25 14:52:22,496] torch.distributed.run: [WARNING] *****************************************
/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/mmengine/utils/dl_utils/setup_env.py:56: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
2024-04-25 14:52:30,789 - modelscope - INFO - PyTorch version 2.2.1 Found.
2024-04-25 14:52:30,790 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-04-25 14:52:30,815 - modelscope - INFO - Loading done! Current index file version is 1.13.1, with md5 c03f3a22f842d6202e8e0127a31e0e2b and a total number of 972 components indexed
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:32<00:00, 1.54s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:45<00:00, 2.17s/it]
/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable - Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
/root/miniconda3/envs/p310xtuner/lib/python3.10/site-packages/torch/nn/modules/module.py:1877: UserWarning: Positional args are being deprecated, use kwargs instead. Refer to https://pytorch.org/docs/master/generated/torch.nn.Module.html#torch.nn.Module.state_dict for details.
warnings.warn(
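The repeated `huggingface/tokenizers` message in the log above is a warning, not an error. As the message itself says, it can be silenced by setting `TOKENIZERS_PARALLELISM` explicitly. A minimal sketch, assuming the line is placed near the top of the training config or entry script (that placement is an assumption, not something stated in this thread):

```python
import os

# Set this early, before any tokenizer is used and before worker
# processes are forked. "false" turns off the tokenizers thread pool.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

print(os.environ["TOKENIZERS_PARALLELISM"])
```

Exporting the same variable in the shell before launching `xtuner train` should have the same effect.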
@sxk000 The log should have been saved to log_sft/425sft-329pt-.log, right?
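Yes, but that file was overwritten by a later run, so it no longer has the original content. I will run it again next week and send you the complete log. Thank you very much!
@LZHgrla I tried a few more runs today. It looks like the loss simply was not being printed to the console in time during training; it is present in the background log. It used to be printed promptly, so I mistakenly assumed something had gone wrong with training. Training is still in progress and no problems have appeared so far!
Thanks!
A CoT sample I prepared myself: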
{ "conversations": [ { "from": "system", "value": "你有多种能力,可以通过插件集成的模型api来回复用户的问题,还能解答用户使用模型遇到的问题和模型知识相关问答。\n目前支持的插件信息如下,请自行判断是否需要调用插件来解决当前用户问题。\n若需要调用插件,则需要将插件调用请求按照json格式给出,必须包含plugin_name、parameters字段,对于api的调用过程中,在思考及调用之前使用作为标志,在思考和调用结束后使用,对于调用过程中用于向api传递参数的json字段,需要使用```json作为开始的标志,```作为结束的标志。\n1. {\"plugin_name\":\"ST-0000002801_ID-0000002387\",\"description\":\"针对用户输入的不锈钢工业焊接圆管市场价格查询信息,识别里面的元素,包括 品种、标准、材质、规格、指标、地区等\",\"parameters\":[{\"name\":\"date\",\"description\":\"时间或日期\"},{\"name\":\"breed_name\",\"description\":\"品种\",\"required\":true},{\"name\":\"standard_name\",\"description\":\"标准\"},{\"name\":\"mq_name\",\"description\":\"材质\"},{\"name\":\"sc_name\",\"description\":\"规格\"},{\"name\":\"metric_name\",\"description\":\"价格类型\",\"required\":true},{\"name\":\"area_name\",\"description\":\"地区\"}]}" }, { "from": "user", "value": "我想了解一下不锈钢工业焊接圆管的价格" }, { "from": "assistant", "value": "您好,您想了解不锈钢工业焊接圆管的价格,请问您需要哪种规格的圆管呢?需要调用ST-0000002801_ID-0000002387插件来查询不锈钢工业焊接圆管的价格\\n\\n调用请求如下:\\n{\"plugin_name\": \"ST-0000002801_ID-0000002387\", \"parameters\": {\"date\": \"2024年\", \"breed_name\": \"不锈钢工业焊接圆管\", \"metric_name\": \"市场价格\"}}" }, { "from": "user", "value": "Φ19*1的" }, { "from": "assistant", "value": "明白了。您需要了解的是哪个地区的市场价格呢?需要调用ST-0000002801_ID-0000002387插件来查询Φ19*1的不锈钢工业焊接圆管的市场价格是多少\\n\\n调用请求如下:\\n{\"plugin_name\": \"ST-0000002801_ID-0000002387\", \"parameters\": {\"date\": \"2024年\", \"sc_name\": \"Φ19*1\", \"breed_name\": \"不锈钢工业焊接圆管\", \"metric_name\": \"市场价格\"}}" }, { "from": "user", "value": "太原市的" }, { "from": "assistant", "value": "好的,您想了解的是太原维太新材科技有限公司的444毛坯不锈钢工业焊接圆管,Φ19*1在太原市的市场价格吗?需要调用ST-0000002801_ID-0000002387插件来查询太原市的工业焊接圆管市场价格是多少\\n\\n调用请求如下:\\n{\"plugin_name\": \"ST-0000002801_ID-0000002387\", \"parameters\": {\"date\": \"2024年\", \"area_name\": \"太原市\", \"breed_name\": \"工业焊接圆管\", \"metric_name\": \"市场价格\"}}" }, { "from": "user", "value": "是的,1981年7月5日的价格" }, { "from": "assistant", "value": 
"好的,您需要调用ST-0000002801_ID-0000002387插件来查询1981年7月5日太原维太新材科技有限公司444毛坯不锈钢工业焊接圆管,Φ19*1,在太原市的市场价格。调用请求如下:需要调用ST-0000002801_ID-0000002387插件来查询1981年7月5日的价格\\n\\n调用请求如下:\\n{\"plugin_name\": \"ST-0000002801_ID-0000002387\", \"parameters\": {\"date\": \"1981年7月5日\", \"breed_name\": \"不锈钢工业焊接圆管\", \"metric_name\": \"市场价格\"}}" } ] }
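@LZHgrla Hello! I have three questions:
1. Is this the right format for multi-turn (contextual) samples, or should a different format be used?
2. If this format is right, is the loss computed for every assistant turn during training, or only for the last assistant turn, with the preceding turns excluded from the loss and serving only as context?
3. If this format is right, and the plugin name (plugin_name) changes at some turn in the middle, the corresponding system prompt changes with it. Should multiple system entries be written in that case? In the example below, the first turn asks about plugin_name_1 and the second turn asks about plugin_name_2.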
{
"conversations": [
{
"from": "system",
"value": "你有多种能力1. {\"plugin_name\":\"plugin_name_1\",\"description\":\"针对用户输入的不锈钢工业焊接圆管市场价格查询信息,识别里面的元素,包括 品种、标准、材质、规格、指标、地区等\",\"parameters\":[{\"name\":\"date\",\"description\":\"时间或日期\"},{\"name\":\"breed_name\",\"description\":\"品种\",\"required\":true},{\"name\":\"standard_name\",\"description\":\"标准\"},{\"name\":\"mq_name\",\"description\":\"材质\"},{\"name\":\"sc_name\",\"description\":\"规格\"},{\"name\":\"metric_name\",\"description\":\"价格类型\",\"required\":true},{\"name\":\"area_name\",\"description\":\"地区\"}]}"
},
{
"from": "user",
"value": "我想了解一下不锈钢工业焊接圆管的价格"
},
{
"from": "assistant",
"value": "您好,您想了解不锈钢工业焊接圆管的价格,请问您需要哪种规格的圆管呢?需要调用plugin_name_1插件来查询不锈钢工业焊接圆管的价格\\n\\n调用请求如下:\\n{\"plugin_name\": \"plugin_name_1\", \"parameters\": {\"date\": \"2024年\", \"breed_name\": \"不锈钢工业焊接圆管\", \"metric_name\": \"市场价格\"}}"
},
{
"from": "system",
"value": "你有多种能力1. {\"plugin_name\":\"plugin_name_2\",\"description\":\"针对用户输入信息,识别地区、日期\",\"parameters\":[{\"name\":\"area_name\",\"description\":\"地区\"},{\"name\":\"date\",\"description\":\"时间\"}]}"
},
{
"from": "user",
"value": "今天上海天气怎么样?"
},
{
"from": "assistant",
"value": "明白了。您需要了解的是哪个地区的市场价格呢?需要调用plugin_name_2插件来查询Φ19*1的不锈钢工业焊接圆管的市场价格是多少\\n\\n调用请求如下:\\n{\"plugin_name\": \"plugin_name_2\", \"parameters\": {\"date\": \"今天\", \"area_name\": \"上海\"}}"
}
]
}
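One way to experiment with a system prompt that changes mid-conversation is to regroup the flat `from`/`value` list into one record per turn, each carrying its own optional `system` text. The sketch below is illustrative only: `to_turn_records` is a hypothetical helper, and the target layout (`conversation` holding `system` / `input` / `output` records) is assumed to match XTuner's multi-turn dataset format, which should be checked against the XTuner documentation. Whether a per-turn `system` field is honored for every turn is likewise an assumption.

```python
def to_turn_records(sample):
    """Group {"conversations": [{"from", "value"}, ...]} into per-turn records."""
    turns, pending = [], {}
    for msg in sample["conversations"]:
        role, text = msg["from"], msg["value"]
        if role == "system":
            pending["system"] = text      # applies to the turn that follows
        elif role == "user":
            pending["input"] = text
        elif role == "assistant":
            if "input" not in pending:
                raise ValueError("assistant message without a preceding user message")
            pending["output"] = text
            turns.append(pending)         # one complete turn
            pending = {}
        else:
            raise ValueError(f"unknown role: {role!r}")
    if pending:
        raise ValueError("conversation ends with an unanswered message")
    return {"conversation": turns}


# Toy data standing in for the two-plugin example above.
sample = {"conversations": [
    {"from": "system", "value": "tools: plugin_name_1"},
    {"from": "user", "value": "q1"},
    {"from": "assistant", "value": "a1"},
    {"from": "system", "value": "tools: plugin_name_2"},
    {"from": "user", "value": "q2"},
    {"from": "assistant", "value": "a2"},
]}
result = to_turn_records(sample)
```

Each record then pairs an answer with the plugin description that was in effect when the question was asked.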
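@HIT-cwh OK, I will give it a try. Thanks for the answer!
First of all, thanks to Shanghai AI Laboratory and its members for sharing the InternLM models, code framework, and technical experience!
The environment was installed following the steps here: https://github.com/InternLM/xtuner/issues/447#issue-2170980022
Command: NPROC_PER_NODE=8 xtuner train internlm2_20b_qlora_msagent_react_e3_gpu8.py --deepspeed deepspeed_zero3
Model and data used: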
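On the question of which assistant turns contribute to the loss: a common convention in supervised fine-tuning is to supervise the tokens of every assistant turn and mask everything else with the label `-100`, which PyTorch's `CrossEntropyLoss` ignores by default. The sketch below illustrates that convention only. `build_labels` and `toy_tokenize` are hypothetical names, chat-template tokens are left out, and whether XTuner follows exactly this rule should be confirmed in its dataset-processing code.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(turns, tokenize):
    """turns: list of {"from": role, "value": text}; tokenize: str -> list[int]."""
    input_ids, labels = [], []
    for turn in turns:
        ids = tokenize(turn["value"])
        input_ids.extend(ids)
        if turn["from"] == "assistant":
            labels.extend(ids)                         # supervised tokens
        else:
            labels.extend([IGNORE_INDEX] * len(ids))   # context only, no loss
    return input_ids, labels


def toy_tokenize(text):
    # Stand-in tokenizer for illustration: one id per character.
    return [ord(ch) for ch in text]


sample = [
    {"from": "system", "value": "sys"},
    {"from": "user", "value": "hi"},
    {"from": "assistant", "value": "yo"},
    {"from": "user", "value": "a"},
    {"from": "assistant", "value": "b"},
]
input_ids, labels = build_labels(sample, toy_tokenize)
```

Under this convention both assistant turns are supervised, while the system and user text only provide context.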
# Model
pretrained_model_name_or_path = '/apply/model/original/internlm2-20b'
use_varlen_attn = False
# Data
data_path = 'damo/MSAgent-Bench'
Full error log: 311pt.log
Excerpt from the error log:
03/12 13:57:29 - mmengine - WARNING - Dataset Dataset has no metainfo. `dataset_meta` in visualizer will be None.
quantization_config convert to <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>
`low_cpu_mem_usage` was None, now set to True since model is quantized.
[2024-03-12 13:57:34,036] [INFO] [partition_parameters.py:343:__exit__] finished initializing model - num_params = 339, num_elems = 19.86B
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/xtuner/tools/train.py", line 307, in <module>
    main()
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/xtuner/tools/train.py", line 303, in main
    runner.train()
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1182, in train
    self.strategy.prepare(
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/_strategy/deepspeed.py", line 381, in prepare
    model = self.build_model(model)
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/_strategy/base.py", line 306, in build_model
    model = MODELS.build(model)
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 232, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args) # type: ignore
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/xtuner/model/sft.py", line 27, in __init__
    self.llm = self._build_from_cfg_or_module(llm)
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/xtuner/model/sft.py", line 91, in _build_from_cfg_or_module
    return BUILDER.build(cfg_or_mod)
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args) # type: ignore
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 556, in from_pretrained
    return model_class.from_pretrained(
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3502, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3926, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 805, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/root/miniconda3/envs/p310xtuner3/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 345, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([92544, 6144]) in "weight" (which has shape torch.Size([0])), this look incorrect.
How can this problem be solved?
Thanks!
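The traceback above comes from a run that combines a 4-bit quantized model (`quantization_config`) with `--deepspeed deepspeed_zero3`. Elsewhere in this thread a maintainer states that QLoRA and zero3 are currently incompatible and suggests either QLoRA -> LoRA or zero3 -> zero2. The sketch below shows how that combination could be caught before launching. `check_qlora_zero3` is a hypothetical helper, not part of XTuner, and it assumes the config keeps the `model = dict(llm=dict(quantization_config=...))` layout shown in this thread.

```python
def check_qlora_zero3(model_cfg, deepspeed_name):
    """Return a hint string if a quantized model is paired with ZeRO-3, else None."""
    llm_cfg = model_cfg.get("llm", {})
    quantized = llm_cfg.get("quantization_config") is not None
    uses_zero3 = "zero3" in deepspeed_name
    if quantized and uses_zero3:
        return ("quantization_config is set together with ZeRO-3; "
                "use deepspeed_zero2, or drop quantization_config (QLoRA -> LoRA)")
    return None


# Plain dicts stand in for the real config objects in this illustration.
qlora_cfg = {"llm": {"quantization_config": {"load_in_4bit": True}}, "lora": {"r": 64}}
lora_cfg = {"llm": {}, "lora": {"r": 64}}

hint_bad = check_qlora_zero3(qlora_cfg, "deepspeed_zero3")
hint_zero2 = check_qlora_zero3(qlora_cfg, "deepspeed_zero2")
hint_lora = check_qlora_zero3(lora_cfg, "deepspeed_zero3")
```

Only the first combination produces a hint; the other two correspond to the workarounds suggested by the maintainer.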