The pipelines are not meant to be used for training; they are intended for inference only, so gradient tracking cannot be done unless you modify the code to suit your needs. Instead, you will have to use each modeling component directly and write the training loop yourself. You can see an example of training here.
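For reference, loading the individual modeling components looks roughly like this. This is a minimal sketch that assumes the standard diffusers subfolder layout of the THUDM/CogVideoX-2b repository:

```python
# Minimal sketch of loading the components individually (the subfolder names are
# assumed to match the standard layout of the THUDM/CogVideoX-2b repository).
from transformers import AutoTokenizer, T5EncoderModel
from diffusers import AutoencoderKLCogVideoX, CogVideoXTransformer3DModel

model_id = "THUDM/CogVideoX-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = T5EncoderModel.from_pretrained(model_id, subfolder="text_encoder")
transformer = CogVideoXTransformer3DModel.from_pretrained(model_id, subfolder="transformer")
vae = AutoencoderKLCogVideoX.from_pretrained(model_id, subfolder="vae")
```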
@a-r-r-o-w Thank you for your prompt reply and the training code. I noticed that the provided training code requires independent modules, including T5EncoderModel, CogVideoXTransformer3DModel, and AutoencoderKLCogVideoX.
This approach seems somewhat cumbersome, since we do not need to train the model or update its parameters; we only need to access the gradients.
Would simply removing the torch.no_grad() decorator from lines 478-485 in the local pipeline_cogvideox.py resolve the issue efficiently?
Thank you very much!
Yes, removing the torch.no_grad() decorator would make it possible to access gradients. The models are in .eval() mode by default, so layers like dropout will not take effect.
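As a concrete illustration of that suggestion, the sketch below assumes the @torch.no_grad() decorator has been removed from __call__ in a local copy of pipeline_cogvideox.py. The hooked block, the prompt, and the scalar objective are purely illustrative, and only a couple of denoising steps are used because the retained autograd graph grows with every step:

```python
import torch
from diffusers import CogVideoXPipeline

# Assumes @torch.no_grad() has been removed from __call__ in a local copy of
# pipeline_cogvideox.py; otherwise no autograd graph is recorded at all.
pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b", torch_dtype=torch.float16)
pipe.to("cuda")

captured = {}

def save_grad(module, grad_input, grad_output):
    # grad_output holds the gradients w.r.t. this module's outputs during backward()
    captured["block_0"] = grad_output[0].detach().cpu()

# Hook the first transformer block; any intermediate module can be hooked the same way.
handle = pipe.transformer.transformer_blocks[0].register_full_backward_hook(save_grad)

# Use very few steps: the whole denoising loop stays in the autograd graph,
# so memory grows quickly with the number of steps.
latents = pipe(
    prompt="a panda playing a guitar",  # illustrative prompt
    num_inference_steps=2,
    guidance_scale=6,
    output_type="latent",               # skip VAE decode so the output stays a tensor
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames

# Illustrative scalar objective; replace with whatever quantity you actually care about.
latents.float().mean().backward()
print(captured["block_0"].shape)

handle.remove()
```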
Hi @lovelyczli, I believe this should be answered by the comment above, so I am marking this as closed. Please feel free to re-open if there's anything else we can help with.
Describe the bug
When generating videos using the CogVideoXPipeline model, we need to access the gradients of intermediate tensors. However, we do not require additional training or parameter updates for the model.
We tried using register_forward_hook to capture the gradients, but this approach failed because the CogVideoXPipeline disables gradient calculations. Specifically, in pipelines/cogvideo/pipeline_cogvideox.py at line 478, gradient tracking is turned off with @torch.no_grad().
How can we resolve this issue and retrieve the gradients without modifying the model’s parameters or performing extra training?
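For context, the failure of the hook-based approach comes down to how torch.no_grad() works: no autograd graph is recorded inside it, so intermediate tensors carry no gradient and backward hooks never fire. A small self-contained illustration, unrelated to CogVideoX itself:

```python
import torch

x = torch.randn(3, requires_grad=True)

with torch.no_grad():
    y = x * 2
print(y.requires_grad)  # False: nothing is recorded, so hooks on the backward pass never fire

y = x * 2
print(y.requires_grad)  # True: outside no_grad the graph exists and gradients can be captured
```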
Reproduction
Sample Code

```python
import torch
from diffusers import CogVideoXPipeline

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16,
)

video = pipe(
    prompt=prompt,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]
```
Pipeline Code Reference: pipelines/cogvideo/pipeline_cogvideox.py at line 478

```python
@torch.no_grad()
@replace_example_docstring(EXAMPLE_DOC_STRING)
def __call__(
    self,
    prompt: Optional[Union[str, List[str]]] = None,
    negative_prompt: Optional[Union[str, List[str]]] = None,
    height: int = 480,
    width: int = 720,
```
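If editing the file is undesirable, one possible workaround (a sketch relying on an implementation detail, not a stable API) is that torch.no_grad() decorates __call__ via functools.wraps in recent PyTorch releases, so the undecorated function remains reachable through __wrapped__:

```python
import torch
from diffusers import CogVideoXPipeline

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-2b", torch_dtype=torch.float16)
pipe.to("cuda")

# torch.no_grad() wraps __call__ with functools.wraps, so the undecorated function
# is exposed as __wrapped__ (an implementation detail; a local copy of the pipeline
# file with the decorator removed is the more robust option).
unwrapped_call = type(pipe).__call__.__wrapped__

output = unwrapped_call(
    pipe,
    prompt="a panda playing a guitar",  # illustrative prompt
    num_inference_steps=2,              # keep the retained autograd graph small
    guidance_scale=6,
    output_type="latent",
    generator=torch.Generator(device="cuda").manual_seed(42),
)
print(output.frames.requires_grad)  # expected True if gradient tracking is now active
```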
Logs
No response
System Info
Diffusers version: 0.30.3
Who can help?
No response