Closed · sanwei111 closed this issue 2 months ago
Hello, in the file transformer_multibranch_v2, the TransformerEncoderLayer class contains the following code:

```python
if args.encoder_branch_type is None:  # default=None????
    self.self_attn = MultiheadAttention(
        self.embed_dim, args.encoder_attention_heads,
        dropout=args.attention_dropout, self_attention=True,
    )
else:
    layers = []
    embed_dims = []
    heads = []
    num_types = len(args.encoder_branch_type)
```

I just wonder: is `args.encoder_branch_type` supposed to equal `True`?
Hi, `args.encoder_branch_type` is a list containing the encoder branch types defined in your training yml file. In my case, I set `encoder_branch_type` in the training yml as `encoder-branch-type: [attn:1:32:4, dynamic:default:32:4]`, where `32` represents the embedding dimension and `4` stands for the number of attention heads. Hope this helps!
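For reference, here is a minimal standalone sketch (the variable names are mine, not the repo's) of how one entry of that list decomposes, following the `layer_type.split(':')` indexing used in the encoder layer's parsing loop; the second field is the kernel-size / variant setting seen in the configs:

```python
# Hypothetical parsing sketch: mirrors the colon-separated layout
# "type:kernel:embed_dim:heads" used by the branch-type entries.
layer_type = "attn:1:32:4"
fields = layer_type.split(":")

branch_kind = fields[0]     # "attn" -> attention branch, "dynamic" -> dynamic-conv branch
kernel_cfg = fields[1]      # kernel / variant setting ("1", "default", ...)
embed_dim = int(fields[2])  # per-branch embedding dimension, e.g. 32
num_heads = int(fields[3])  # per-branch attention head count, e.g. 4

print(branch_kind, kernel_cfg, embed_dim, num_heads)  # attn 1 32 4
```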
Thanks! What's the meaning of `[attn:1:32:4, dynamic:default:32:4]`? Could you show some details about the list?
As I mentioned in my last reply, `args.encoder_branch_type` should not be a boolean value; instead, it should be a list recording the branch types of your encoder. As for `32` and `4`, they represent the params `embed_dim` and `num_heads` used when initializing the `MultiheadAttention` and `DynamicconvLayer` modules.

https://github.com/mit-han-lab/lite-transformer/blob/de9631cbbbb9c42dce3616a1e95fb59a89ab696e/configs/cnndm/attention/multibranch_v2/embed496.yml#L36

You can find more details on these two params in the `get_layer` method of the `TransformerEncoderLayer` module.

https://github.com/mit-han-lab/lite-transformer/blob/de9631cbbbb9c42dce3616a1e95fb59a89ab696e/fairseq/models/transformer_multibranch_v2.py#L617-L645

Find more details about the `MultiheadAttention` module at
https://github.com/mit-han-lab/lite-transformer/blob/de9631cbbbb9c42dce3616a1e95fb59a89ab696e/fairseq/modules/multihead_attention.py#L15-L27
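To make the roles of those two params concrete, here is a small runnable sketch using `torch.nn.MultiheadAttention` as a stand-in for the fairseq module linked above (it takes the same leading `embed_dim` and `num_heads` arguments); note that `embed_dim` must be divisible by the head count, which `32` and `4` satisfy:

```python
import torch
import torch.nn as nn

# Stand-in for fairseq's MultiheadAttention: embed_dim=32, num_heads=4,
# matching the "...:32:4" fields of a branch-type entry.
attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, dropout=0.1)

x = torch.randn(10, 2, 32)    # (seq_len, batch, embed_dim)
out, weights = attn(x, x, x)  # self-attention: query = key = value
print(out.shape)              # torch.Size([10, 2, 32])
```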
Thanks a lot! One more thing. The code below appears in the encoder layer class:

```python
# Build one branch per entry in the branch-type list.
for layer_type in args.encoder_branch_type:
    embed_dims.append(int(layer_type.split(':')[2]))  # per-branch embedding dim
    heads.append(int(layer_type.split(':')[3]))       # per-branch head count
    layers.append(self.get_layer(args, index, embed_dims[-1], heads[-1], layer_type))
self.self_attn = MultiBranch(layers, embed_dims)      # combine the branches
```

As you suggested, I set `args.encoder_branch_type` to `[attn:1:160:4, lightweight:default:160:4]`, but it leads to some errors. How should I understand this?
Thank you for your interest in our project. Unfortunately, this repository is no longer actively maintained, so we will be closing this issue. If you have any further questions, please feel free to email us. Thank you again!