-
### System Info
Running `infinity` via Docker (`michaelf34/infinity:latest`) and calling the model through the REST API
### Information
- [X] Docker
- [ ] The CLI directly via pip
### Tasks
- [X] An …
-
Thank you for this excellent implementation; it has been a great help to me. However, when using the `load_from_name` function I found that flash-attn is not supported, so I implemented that part of the code myself. I'm not sure whether the implementation is correct, even though it runs fine.
Here is the code snippet:
```python
###### ------- ps: add use_flash_attention keyword ------- ######
def load_fro…
```
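For context, a common way to add such a keyword without hard-depending on flash-attn is to probe the import and fall back. This is only a sketch of that pattern, not the project's actual API; `resolve_attention_backend` and the backend names here are hypothetical.

```python
# Hypothetical sketch: gate an optional flash-attn dependency behind a keyword.
# Neither this function nor the backend names come from the project's codebase.
def resolve_attention_backend(use_flash_attention: bool) -> str:
    """Pick an attention backend, degrading gracefully if flash-attn is absent."""
    if not use_flash_attention:
        return "vanilla"
    try:
        import flash_attn  # noqa: F401  # optional, GPU-only dependency
    except ImportError:
        # Fall back instead of crashing at model-load time.
        return "vanilla"
    return "flash"
```

Probing at load time keeps the keyword safe on machines without a compatible GPU, which is usually where flash-attn installs fail.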
-
### System Info
Hi guys, I have some complex models where I use only parts (sub-models) of transformers models. For example, below I used `AutoModelForCausalLM.from_pretrained()`, but normally it would be something…
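When only a sub-module of a loaded model is needed, one generic pattern is to resolve a dotted attribute path on the loaded object. A minimal stdlib sketch of that idea, using a dummy nested object in place of a real transformers model (`get_submodule_by_path` and the attribute names are made up for illustration):

```python
from functools import reduce
from types import SimpleNamespace

def get_submodule_by_path(model, path: str):
    """Resolve a dotted attribute path, e.g. 'model.decoder.embed_tokens'."""
    return reduce(getattr, path.split("."), model)

# Stand-in for a loaded causal LM: only the nesting matters here.
dummy = SimpleNamespace(
    model=SimpleNamespace(decoder=SimpleNamespace(embed_tokens="embedding-layer"))
)
print(get_submodule_by_path(dummy, "model.decoder.embed_tokens"))  # -> embedding-layer
```

For actual PyTorch models, `torch.nn.Module` already ships a built-in `get_submodule("...")` that does this for registered sub-modules, which avoids the hand-rolled helper.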
-
Dear Professor Peng Qian,
Recently I read the latest paper published by your team at IJCAI-21, "Smart Contract Vulnerability Detection: From Pure Neural Network to Interpretable Graph Feature and…
-
As you probably know, yesterday Google released Gemma2 with superior performance and robustness https://storage.googleapis.com/deepmind-media/gemma/gemma-2-report.pdf
One of the key changes was log…
-
**While running the example code in Readme.md**
```python
from local_gemma import LocalGemma2ForCausalLM
from transformers import AutoTokenizer
import os

os.environ['HUGGINGFACEHUB_API_TOKEN'] = ''
os.…
```
-
So, I'm trying to implement an attention module in the YOLOv9 head architecture after the ELAN block, but it seems I don't fully understand the underlying architecture of YOLOv9. I want to implement t…
-
```
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_…
```
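For context, this warning matters because unmasked pad positions still receive attention weight and contaminate the output. A tiny stdlib sketch of masked vs. unmasked softmax pooling (the scores and the "garbage" pad value are made up purely for illustration):

```python
import math

def softmax_pool(scores, values, mask):
    """Average `values` with softmax(`scores`), zeroing out masked positions."""
    exps = [math.exp(s) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return sum(w * v for w, v in zip(exps, values)) / total

scores = [2.0, 1.0, 1.0]        # last position is padding
values = [1.0, 1.0, -5.0]       # pad slot holds a garbage value
with_mask = softmax_pool(scores, values, [1, 1, 0])
without_mask = softmax_pool(scores, values, [1, 1, 1])
print(with_mask, without_mask)  # the pad token drags the unmasked result down
```

In transformers, the usual remedy is what the warning itself suggests: pass the tokenizer's `attention_mask` into `model.generate(...)` and set `pad_token_id` explicitly.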
-
Thank you for sharing the project. I am a beginner in this field, and I have run into issues while trying to save the entire PyTorch model and export it to ONNX.
While saving the …
-
Seems the model merge is giving the WARNING SHAPE MISMATCH error again. I looked into the merge_patcher file, as referenced in other error reports from back in March, and see that the typo was corrected, so it seems the…