-
### Problem Description
Even on Real World Llama 2 70B Training Shapes, TE Linear FP8 is 1.5 to 2x slower than AMP BF16 Linear. Do you have any suggestions or magic env flags on how to improve perf…
-
ApplyPulidFlux
No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(1, 577, 16, 64) (torch.bfloat16)
key : shape=(1, 577, 16, 64) (torch.bfloat16)
value : shape=(…
-
**Describe the bug**
CUDA out of memory when using the whisper model. For example, when using whisper-tiny.en on a small dataset like an4, even a GPU such as the A6000 (48GB) will encounter a CUDA out of memory issue…
-
### **Description:**
When running simple_rl_train.py, I encounter an error where the environment (Carla) seems to attempt a reset multiple times, causing an AttributeError due to obs_space being Non…
-
### Question
In salience_detr.py, I added
`self.Re_Weight = WeightRefactor(sample_num)` to the `class SalienceDETR(DNDETRDetector):` class. Training works with the resnet50 backbone, but swin_l raises an error:
`[2024-11-11 20:07:56 det.models.…
-
This project seems really interesting! My childhood self would've loved to analyze beetle shapes. I read a little bit about generalized Procrustes analyses and think it is cool that you can perform th…
-
RuntimeError: Error(s) in loading state_dict for ImageProjModel:
size mismatch for proj.weight: copying a param with shape torch.Size([65536, 768]) from checkpoint, the shape in current model…
-
Error(s) in loading state_dict for ImageProjModel:
size mismatch for proj.weight: copying a param with shape torch.Size([65536, 768]) from checkpoint, the shape in current model is torch.Size([16384,…
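
A size mismatch like the one above is usually easier to reason about once every differing parameter is listed side by side. The following is a minimal sketch, not part of the original report: `find_shape_mismatches` is a hypothetical helper, and the example shapes are illustrative (the second dimension of the model shape and the `norm.weight` entry are assumptions, since the report is truncated). With PyTorch, the shape dictionaries could be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every parameter
    present in both mappings whose shapes differ."""
    mismatches = []
    for name in sorted(checkpoint_shapes.keys() & model_shapes.keys()):
        if checkpoint_shapes[name] != model_shapes[name]:
            mismatches.append((name, checkpoint_shapes[name], model_shapes[name]))
    return mismatches


# Illustrative values only: the model-side second dimension (768) and the
# "norm.weight" entry are assumptions, not taken from the report.
checkpoint_shapes = {"proj.weight": (65536, 768), "norm.weight": (4096,)}
model_shapes = {"proj.weight": (16384, 768), "norm.weight": (4096,)}

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

A mismatch confined to one projection layer often suggests that the checkpoint and the model were configured with different hyperparameters, though the report alone does not confirm which one.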
-
I have made the following changes:
1. https://github.com/WZH0120/SAM2-UNet/blob/eb1c38d870358cbdd769c9721062f7bb888ef9b5/train.py#L15
2. edit the yaml https://github.com/WZH0120/SAM2-UNet/blob/eb1c3…
-
Error(s) in loading state_dict for DiffusionSceneLayout_DDPM:
Missing key(s) in state_dict: "diffusion.model.final_conv.weight", "diffusion.model.final_conv.bias", "fc_arrange_condition.0.weight", "…
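
To narrow down a missing-key error like this, it can help to diff the key sets of the checkpoint and the model before loading. Below is a minimal sketch under stated assumptions: `diff_state_dict_keys` is a hypothetical helper, and the key `diffusion.model.init_conv.weight` is invented for illustration; only the three missing keys come from the error message. In PyTorch, `load_state_dict(..., strict=False)` returns the same two lists as `missing_keys` and `unexpected_keys`.

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    checkpoint_keys: Iterable[str],
    model_keys: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected): keys the model expects but the checkpoint
    lacks, and keys the checkpoint has but the model does not define."""
    ckpt = set(checkpoint_keys)
    model = set(model_keys)
    return sorted(model - ckpt), sorted(ckpt - model)


# "diffusion.model.init_conv.weight" is a hypothetical shared key used only
# to make the example non-trivial; the other three appear in the error above.
model_keys = [
    "diffusion.model.init_conv.weight",
    "diffusion.model.final_conv.weight",
    "diffusion.model.final_conv.bias",
    "fc_arrange_condition.0.weight",
]
checkpoint_keys = ["diffusion.model.init_conv.weight"]

missing, unexpected = diff_state_dict_keys(checkpoint_keys, model_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

Keys missing under a shared prefix can indicate that the checkpoint was saved from a differently configured model, but that is a guess to verify against the training config.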