-
It would be convenient to allow the encoder [output_size](https://github.com/CUNY-CL/yoyodyne/blob/master/yoyodyne/models/modules/lstm.py#L99) to be different from the TransformerDecoder embedding siz…
-
Hi! Great work.
I see there's a "force image generation" option in the Gradio demo.
How can I implement this in code? Can anyone enlighten me?
Thanks.
-
Hi, I have a question regarding the image generation process, specifically the `generate_image` function at https://github.com/baaivision/Emu/blob/main/models/modeling_emu.py#L185
According to this…
-
### Description
I am trying to pass `PRNGKey`s to a function, which is integrated by `odeint`.
Here is a simplified example reproducing the problem:
```python
from functools import partial
im…
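A minimal sketch of a common workaround, assuming the failure comes from `odeint` requiring its extra arguments to be float arrays (so a uint32 `PRNGKey` cannot be threaded through `*args`): close over the key instead of passing it as an argument. The `dynamics` function and shapes here are illustrative, not from the original report.

```python
import jax
import jax.numpy as jnp
from jax.experimental.ode import odeint

def dynamics(y, t, key):
    # Hypothetical dynamics; real code would draw randomness from `key`
    # via jax.random. Here the key is unused, to keep the sketch minimal.
    del key
    return -y

y0 = jnp.ones(3)
ts = jnp.linspace(0.0, 1.0, 5)
key = jax.random.PRNGKey(0)

# Passing `key` through odeint's *args would hand a uint32 array to the
# solver; closing over it keeps the extra-args path float-only.
sol = odeint(lambda y, t: dynamics(y, t, key), y0, ts)
```

The closure trick works whenever the key is fixed over the integration; if the key must change per call, wrapping the lambda inside a jitted function that takes `key` as an ordinary argument achieves the same effect.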
-
While running the test.ipynb file, we run into this error:
`'CLIPTextTransformer' object has no attribute '_build_causal_attention_mask'`
We followed the same installation process.
-
Hello! Following your code, I applied the Attentioner Manager (originally for GPT-2) to Llama and obtained saliency scores. Each layer's score has shape [1, 1, seq_len, seq_len]; some of the specific values are shown below:
I would like to know the exact meaning of these per-layer saliency scores.
My code is as follows:
```
class LlamaAttentionManager(AttentionerManagerBase):
…
-
Hi, I would like to ask why the attention mask is not used in the prefill stage.
I want to output the attention score matrix in the prefill stage. Is the code below correct?
```
if spec: # s…
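Not from the repo, but a hedged sketch of how prefill attention scores are typically computed (names are illustrative): even in the prefill stage a causal mask is normally applied, since token `i` must not attend to tokens `j > i`.

```python
import torch

def prefill_attn_scores(q, k):
    # q, k: [batch, heads, seq_len, head_dim]
    scale = q.shape[-1] ** -0.5
    scores = (q @ k.transpose(-2, -1)) * scale
    # Causal mask: position i may only attend to positions j <= i.
    seq = q.shape[-2]
    mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1)

q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
probs = prefill_attn_scores(q, k)
```

If an implementation skips the mask in prefill, each row of the score matrix mixes in future tokens, which is usually only acceptable for non-causal (encoder-style) models.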
-
### Your current environment
Referring to issue #5181, "The Offline Inference Embedding Example Fails": the method `LLM.encode()` only works for embedding models. Is there any way to get the ou…
-
### 🐛 Describe the bug
When I set `dropout_p=0.0`, the result is different, but with `dropout_p=-1` the result is the same. Maybe the op `scaled_dot_product_attention` has a bug. Please fix it, thank…
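As a sanity-check sketch (not the reporter's code): with `dropout_p=0.0`, `scaled_dot_product_attention` applies no dropout and should match a plain softmax-attention reference, so a mismatch usually points at scaling, masking, or dtype differences rather than the dropout path itself.

```python
import torch
import torch.nn.functional as F

def manual_attention(q, k, v):
    # Reference: softmax(q k^T / sqrt(d)) v, with no dropout.
    scale = q.shape[-1] ** -0.5
    attn = torch.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)
    return attn @ v

q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

# dropout_p=0.0 disables dropout entirely, so repeated calls agree and
# match the manual reference up to floating-point tolerance.
out = F.scaled_dot_product_attention(q, k, v, dropout_p=0.0)
ref = manual_attention(q, k, v)
```

Note that a negative `dropout_p` is outside the documented range, so behavior with `dropout_p=-1` should not be relied on as a baseline.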
-
Tisane currently provides two types of conceptual relationships: `causes` and `associates_with`. This doc covers when and how to use these verbs.
If a user provides associates_with, we walk them t…
emjun updated 2 years ago