Closed — zqevans closed this issue 8 months ago
@zqevans oo, you are the first to use this feature! i moved some logic around, and now you can pass in `prepend_mask`
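rough usage sketch (sizes below are made up, and the prepended embeddings should be at the model dimension):

```python
import torch
from x_transformers import TransformerWrapper, Decoder

# hypothetical sizes, purely for illustration
model = TransformerWrapper(
    num_tokens = 256,
    max_seq_len = 1024,
    attn_layers = Decoder(dim = 512, depth = 6, heads = 8)
)

tokens = torch.randint(0, 256, (2, 100))     # (batch, seq)

# embeddings to prepend, at the model dimension (512 here)
phoneme_embeds = torch.randn(2, 32, 512)

# boolean mask over prepended positions: True = attend, False = padding
prepend_mask = torch.ones(2, 32).bool()
prepend_mask[1, 24:] = False                 # second example has 8 padded prepend positions

logits = model(
    tokens,
    prepend_embeds = phoneme_embeds,
    prepend_mask = prepend_mask
)
```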
Amazing, thanks!
Oh, forgot to mention I'm using the `ContinuousTransformerWrapper` for this as I'm hoping to do it using latent diffusion. Could you implement it there as well?
@zqevans yup, you got it! https://github.com/lucidrains/x-transformers/commit/3039bccfcaa96d66f73f739ee1c5e62c612d82b0
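for the continuous wrapper, something like this should work (again, hypothetical sizes; prepended embeddings at the model dimension):

```python
import torch
from x_transformers import ContinuousTransformerWrapper, Decoder

# hypothetical latent setup, sizes made up
model = ContinuousTransformerWrapper(
    dim_in = 64,
    dim_out = 64,
    max_seq_len = 1024,
    attn_layers = Decoder(dim = 512, depth = 6, heads = 8)
)

latents = torch.randn(2, 100, 64)            # continuous inputs (batch, seq, dim_in)
phoneme_embeds = torch.randn(2, 32, 512)     # prepended conditioning, at the model dimension
prepend_mask = torch.ones(2, 32).bool()
prepend_mask[1, 24:] = False                 # mask out padded prepend positions

out = model(
    latents,
    prepend_embeds = phoneme_embeds,
    prepend_mask = prepend_mask
)
```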
What a legend. Thanks again!
Getting the error `name 'b' is not defined` for the continuous one. Looks like the batch size variable is called `batch` in that version.
@zqevans oops, should be good now 🤞
I'm looking to implement something like VALL-E with phoneme embeddings prepended to the transformer input using `prepend_embeds`. I would want to mask out padded tokens in this case. Looking at the implementation, it's not clear to me how I would mask out prepended embeddings. Does a `prepend_embeds_mask` make sense to add? Should I be prepending this to the transformer input myself and using the normal `mask` input to create the attention mask?
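Concretely, what I'd like to do is roughly the following (made-up shapes, just to illustrate building a mask over padded prepend positions from variable phoneme lengths):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# two phoneme embedding sequences of different lengths (dims made up)
phonemes = [torch.randn(18, 512), torch.randn(25, 512)]
lengths = torch.tensor([p.shape[0] for p in phonemes])

# pad to a common length and build a boolean mask (True = real, False = padding)
prepend_embeds = pad_sequence(phonemes, batch_first = True)                       # (2, 25, 512)
prepend_mask = torch.arange(prepend_embeds.shape[1])[None, :] < lengths[:, None]  # (2, 25)

# then something like: model(x, prepend_embeds = prepend_embeds, prepend_mask = prepend_mask)
```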