Open eljandoubi opened 3 weeks ago
Hey! is there a reason why you are using trust_remote_code = True? This would use the online code, not the transformers native one!
@ArthurZucker I set trust_remote_code = True to bypass the warnings, but it had no effect on the error.
System Info
transformers==4.46.0 accelerate==1.0.1 sentencepiece==0.2.0 deepspeed==0.15.3
Who can help?
@muellerz @SunMarc @ArthurZucker @amyeroberts @qubvel
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
accelerate config:
compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  deepspeed_multinode_launcher: standard
  gradient_accumulation_steps: auto
  gradient_clipping: 1.0
  offload_optimizer_device: nvme
  offload_param_device: nvme
  zero3_init_flag: true
  zero3_save_16bit_model: false
  zero_stage: 3
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_process_ip: 0.0.0.0
main_process_port: 0
main_training_function: main
mixed_precision: bf16
num_machines: 3
num_processes: 24
rdzv_backend: c10d
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
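For readers skimming the config, here is a minimal sketch that restates its key settings as a plain Python dict and derives two facts from it: how many processes run on each machine, and which ZeRO states are set to offload to NVMe. The helper names `processes_per_machine` and `nvme_offload_targets` are hypothetical and are not part of accelerate or DeepSpeed. The sketch only reads the values listed above and does not launch anything.

```python
# Key settings transcribed from the accelerate config above (subset only).
config = {
    "distributed_type": "DEEPSPEED",
    "mixed_precision": "bf16",
    "num_machines": 3,
    "num_processes": 24,  # total across all machines
    "deepspeed_config": {
        "zero_stage": 3,
        "offload_optimizer_device": "nvme",
        "offload_param_device": "nvme",
        "zero3_init_flag": True,
        "zero3_save_16bit_model": False,
    },
}


def processes_per_machine(cfg: dict) -> int:
    """Hypothetical helper: split the total process count evenly across machines."""
    total, machines = cfg["num_processes"], cfg["num_machines"]
    if total % machines != 0:
        raise ValueError(
            f"{total} processes do not divide evenly over {machines} machines"
        )
    return total // machines


def nvme_offload_targets(cfg: dict) -> list:
    """Hypothetical helper: list the offload keys whose device is set to NVMe."""
    ds = cfg["deepspeed_config"]
    keys = ("offload_optimizer_device", "offload_param_device")
    return [key for key in keys if ds.get(key) == "nvme"]


if __name__ == "__main__":
    print(processes_per_machine(config))
    print(nvme_offload_targets(config))
```

With the values above, both the optimizer state and the parameters are configured for NVMe offload under ZeRO stage 3, and the 24 processes divide evenly across the 3 machines.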
================================================
I launch the code using
Expected behavior
Have an NVMe-offloaded model.