qinghew closed this issue 1 year ago
Are you using torch.compile or something similar? What changes, if any, did you make? Thanks.
No changes. torch.compile seems to run automatically. I solved this problem by adding these two lines to train.py:
import torch._dynamo.config
torch._dynamo.config.suppress_errors = True
These two lines suppress the error, but I'm not sure whether this causes a performance drop.
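A minimal sketch of how the workaround could be wrapped, assuming it is called near the top of train.py before the model's first forward pass. The helper name `enable_dynamo_fallback` is hypothetical; only `suppress_errors` and `verbose` come from the error message in this thread. As that message says, a suppressed error falls back to eager, so any slowdown should come from the failing code running uncompiled rather than from the flag itself.

```python
def enable_dynamo_fallback(verbose: bool = False) -> bool:
    """Ask TorchDynamo to fall back to eager execution instead of raising.

    Returns True if the setting was applied, or False when torch._dynamo
    cannot be imported (torch older than 2.0, or torch not installed).
    """
    try:
        import torch._dynamo.config as dynamo_config
    except ImportError:
        return False
    # Same effect as the two lines added to train.py in this thread.
    dynamo_config.suppress_errors = True
    if verbose:
        # Optional: the error message suggests this for more detail.
        dynamo_config.verbose = True
    return True


if __name__ == "__main__":
    print("dynamo fallback enabled:", enable_dynamo_fallback())
```

Calling the helper once at startup is enough, since the setting is a module-level configuration value.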
I see. We trained our model before torch 2.0 came out. We will check it later. Thanks.
Thanks. I tried to use the torch version from cog.yaml and ran pip install xformers==0.0.16, which is available.
The author describes the environment both in cog.yaml and in the README, and the two do not fully match. Since xformers requires torch version 2.0 or higher, the torch version in cog.yaml cannot be used. Here is the environment setup for training (inference was not a problem):
conda install cudatoolkit=11.6
pip install torch==2.0.1 torchvision torchaudio
pip install scipy==1.9.3 transformers==4.29.2 accelerate==0.19.0 clip==0.2.0 diffusers==0.16.1 xformers triton gradio datasets evaluate
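Since the mismatch between cog.yaml and the README is the source of confusion here, it may help to print what is actually installed before training. This is a small sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package list is simply taken from the pip commands above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Names taken from the pip install commands in this thread.
    names = ["torch", "torchvision", "xformers", "diffusers", "transformers", "accelerate"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing this output against the pinned versions makes it easier to see whether the environment came from cog.yaml, the README, or a mix of both.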
But I get this error:
[2023-07-22 13:08:03,869] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward
Traceback (most recent call last):
File "/home/qinghewang/codes/multi_subject/fastcomposer/fastcomposer-main/fastcomposer/train.py", line 456, in
train()
File "/home/qinghewang/codes/multi_subject/fastcomposer/fastcomposer-main/fastcomposer/train.py", line 357, in train
return_dict = model(batch, noise_scheduler) # batch["pixel_values"].shape torch.Size([16, 3, 512, 512])
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 82, in forward
return self.dynamo_ctx(self._orig_mod.forward)(*args, **kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 209, in _fn
return fn(*args, **kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/accelerate/utils/operations.py", line 521, in forward
return model_forward(*args, **kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/accelerate/utils/operations.py", line 509, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/home/qinghewang/codes/multi_subject/fastcomposer/fastcomposer-main/fastcomposer/model.py", line 504, in forward
vae_dtype = self.vae.parameters().__next__().dtype
File "/home/qinghewang/codes/multi_subject/fastcomposer/fastcomposer-main/fastcomposer/model.py", line 507, in
latents = self.vae.encode(vae_input).latent_dist.sample()
File "/home/qinghewang/codes/multi_subject/fastcomposer/fastcomposer-main/fastcomposer/model.py", line 537, in
encoder_hidden_states = self.postfuse_module(
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 337, in catch_errors
return callback(frame, cache_size, hooks)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 404, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 104, in _fn
return fn(*args, **kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 262, in _convert_frame_assert
return _compile(
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
r = func(*args, **kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 324, in _compile
out_code = transform_code_object(code, transform)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 445, in transform_code_object
transformations(instructions, code_options)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 311, in transform
tracer.run()
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1726, in run
super().run()
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 576, in run
and self.step()
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 540, in step
getattr(self, inst.opname)(inst)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 342, in wrapper
return inner_fn(self, inst)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 965, in CALL_FUNCTION
self.call_function(fn, args, {})
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 474, in call_function
self.push(fn.call_function(self, args, kwargs))
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 259, in call_function
return super().call_function(tx, args, kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 92, in call_function
return tx.inline_user_function_return(
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 510, in inline_user_function_return
result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1806, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1834, in inline_call_
sub_locals, closure_cells = func.bind_args(parent, args, kwargs)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 159, in bind_args
[
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 160, in
wrap(val=arg, source=source)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 59, in wrap_bound_arg
assert isinstance(val, VariableTracker), typestr(val)
AssertionError: builtin_function_or_method
from user code:
File "/home/qinghewang/codes/multi_subject/fastcomposer/fastcomposer-main/fastcomposer/model.py", line 185, in forward
text_object_embeds = fuse_object_embeddings(
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting: torch._dynamo.config.suppress_errors = True
Global step: 0: 0%| | 0/150000 [00:50<?, ?it/s]
Traceback (most recent call last):
File "/home/qinghewang/anaconda3/envs/fastcomposer/bin/accelerate", line 8, in
sys.exit(main())
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/accelerate/commands/launch.py", line 918, in launch_command
simple_launcher(args)
File "/home/qinghewang/anaconda3/envs/fastcomposer/lib/python3.10/site-packages/accelerate/commands/launch.py", line 580, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/qinghewang/anaconda3/envs/fastcomposer/bin/python', 'fastcomposer/train.py', '--pretrained_model_name_or_path', 'runwayml/stable-diffusion-v1-5', '--dataset_name', '/home/qinghewang/codes/multi_subject/fastcomposer/ffhq_wild_files', '--logging_dir', 'logs/stable-diffusion-v1-5/ffhq/postfuse-localize-ffhq-1_5-1e-5', '--output_dir', 'models/stable-diffusion-v1-5/ffhq/postfuse-localize-ffhq-1_5-1e-5', '--max_train_steps', '150000', '--num_train_epochs', '150000', '--train_batch_size', '16', '--learning_rate', '1e-5', '--unet_lr_scale', '1.0', '--checkpointing_steps', '200', '--mixed_precision', 'bf16', '--allow_tf32', '--keep_only_last_checkpoint', '--keep_interval', '10000', '--seed', '42', '--image_encoder_type', 'clip', '--image_encoder_name_or_path', 'openai/clip-vit-large-patch14', '--num_image_tokens', '1', '--max_num_objects', '4', '--train_resolution', '512', '--object_resolution', '224', '--text_image_linking', 'postfuse', '--object_appear_prob', '0.9', '--uncondition_prob', '0.1', '--object_background_processor', 'random', '--disable_flashattention', '--train_image_encoder', '--image_encoder_trainable_layers', '2', '--object_types', 'person', '--mask_loss', '--mask_loss_prob', '0.5', '--object_localization', '--object_localization_weight', '1e-3', '--object_localization_loss', 'balanced_l1', '--resume_from_checkpoint', 'latest', '--report_to', 'wandb']' returned non-zero exit status 1.
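The `eval_frame.py` frame near the top of the first traceback calls `self._orig_mod.forward`, which suggests the model object had been wrapped by torch.compile by the time train.py called it. A rough way to confirm this at runtime is to look for that attribute. The helper names below are hypothetical, and relying on `_orig_mod` is an assumption based on this traceback: it is a private attribute and may differ in other torch versions.

```python
def is_dynamo_wrapped(model) -> bool:
    """Return True if the object looks like a torch.compile wrapper.

    Assumption: the wrapper keeps the original module in `_orig_mod`,
    as the eval_frame.py frame in the traceback indicates.
    """
    return hasattr(model, "_orig_mod")


def unwrap_dynamo(model):
    """Return the original module if wrapped, otherwise the object itself."""
    return getattr(model, "_orig_mod", model)


if __name__ == "__main__":
    class Plain:
        """Stand-in for an uncompiled model."""

    class Wrapped:
        """Stand-in for a compiled wrapper holding the original model."""
        def __init__(self, inner):
            self._orig_mod = inner

    plain = Plain()
    print(is_dynamo_wrapped(plain))           # False
    print(is_dynamo_wrapped(Wrapped(plain)))  # True
```

Printing `is_dynamo_wrapped(model)` just before the forward call in train.py would show whether compilation was switched on somewhere outside the training script.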