-
Currently, in the transformer decoder, we do:
```
h = x + self.attention.forward(self.attention_norm(x), start_pos, freqs_cis, mask)
out = h + self.feed_forward.forward(self.ffn_norm(h))
```
We have c…
-
### Prerequisite
- [X] I have searched [Issues](https://github.com/open-mmlab/mmrotate/issues) and [Discussions](https://github.com/open-mmlab/mmrotate/discussions) but cannot get the expected help.
…
-
I have trained the net and am now trying to train the GAN. Net training ran to completion, and GAN training works until it is interrupted (Colab). When I try to resume net training, it works well. When trying to res…
-
**Describe the bug**
My model uses DeepSpeed `PipelineModule(num_stages=4)` to split into 4 parts, and my `deepspeed.moe.layer.MoE` is set only in the pipeline stage1 layer. When my model runs `train_batch`, t…
-
Why do I get this error when I use your model weights? RuntimeError: Error(s) in loading state_dict for CustomResNet18:
size mismatch for conv1.weight: copying a param with shape torch.Size([64, 3, 7, 7]) from checkpoint, the shape in cu…
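
A size mismatch like this can be narrowed down by comparing parameter shapes before loading. The sketch below is only an illustration, not the repository's method: `split_by_shape` is a hypothetical helper, and the current-model shape `(64, 1, 7, 7)` is a made-up placeholder because the original message is cut off. It works on any objects exposing a `.shape` attribute, so stand-ins are used here instead of real tensors.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Mapping, Tuple


def split_by_shape(
    checkpoint: Mapping[str, Any], model_state: Mapping[str, Any]
) -> Tuple[Dict[str, Any], List[str]]:
    """Hypothetical helper: keep checkpoint entries whose shape matches the model."""
    compatible: Dict[str, Any] = {}
    skipped: List[str] = []
    for name, value in checkpoint.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            compatible[name] = value
        else:
            # Missing in the model, or present with a different shape.
            skipped.append(name)
    return compatible, skipped


# Stand-ins for tensors; only the .shape attribute matters for this check.
checkpoint = {
    "conv1.weight": SimpleNamespace(shape=(64, 3, 7, 7)),  # shape from the error message
    "fc.bias": SimpleNamespace(shape=(10,)),               # invented entry
}
model_state = {
    "conv1.weight": SimpleNamespace(shape=(64, 1, 7, 7)),  # placeholder, not from the issue
    "fc.bias": SimpleNamespace(shape=(10,)),
}

compatible, skipped = split_by_shape(checkpoint, model_state)
print("skipped:", skipped)
```

With PyTorch, the same comparison could be run on `checkpoint` and `model.state_dict()`, and the filtered dictionary passed to `model.load_state_dict(compatible, strict=False)`. The skipped layers would then keep their initial weights, which may or may not be acceptable for the use case.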
-
I am working with TF version 2.4.1.
Here is how I fine-tune the saved ResNet checkpoint:
```
###Model ##############################
def create_model():
    baseModel = tf.keras.models.load_model(…
-
RuntimeError: Error(s) in loading state_dict for EMSANet:
Missing key(s) in state_dict: "encoder.backbone_rgb.conv1.weight", "encoder.backbone_rgb.norm1.weight", "encoder.backbone_rgb.norm1.bias", "…
-
Hello, I get an error when running demo.py:
initialize network with normal
Traceback (most recent call last):
  File "D:\DMINet-main\demo.py", line 64, in <module>
    model.load_checkpoint(args.checkpoint_name)
  File "D:\DMINet-…
-
**Description**
It would be useful if a feature info click on a single feature could show the results from multiple layers at the same time.
**Describe the solution you'd like*…
-
### System Info
Python 3.10.12
transformers-4.36.2
### Who can help?
@stevhliu @NielsRogge
### Information
- [X] The official example scripts
- [x] My own modified scripts
### Task…