-
Hi Flash-Attention Team, Are there any plans to support Attention Sink style (https://arxiv.org/pdf/2309.17453v1.pdf) attention maps for causal language models? TIA!
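For reference, the attention-sink pattern from the paper keeps a few initial "sink" tokens always attendable while restricting the rest of attention to a causal sliding window. A minimal sketch of the corresponding boolean mask (the function name and defaults are mine, not part of flash-attention's API):

```python
import numpy as np

def sink_window_mask(seq_len, num_sinks=4, window=8):
    """Boolean mask: entry (i, j) is True iff query i may attend to key j.
    Combines causality, a sliding window, and always-visible sink tokens,
    in the style of the Attention Sink / StreamingLLM paper."""
    i = np.arange(seq_len)[:, None]  # query positions
    j = np.arange(seq_len)[None, :]  # key positions
    causal = j <= i                  # no attending to the future
    in_window = (i - j) < window     # recent tokens only
    is_sink = j < num_sinks          # the first few tokens stay visible
    return causal & (in_window | is_sink)
```

The mask would then be applied to the attention scores (e.g. scores set to `-inf` where the mask is False) before the softmax.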
-
We are working on annotation documentation for MF-to-BP relations and would like to assess the extent to which relations, other than 'part of', have been used to link MFs to BPs in Noctua.
We would…
-
### Author Pages
https://aclanthology.org/people/c/chen-zhang/
### Type of Author Metadata Correction
- [X] The author page wrongly conflates different people with the same name.
- [ ] This author …
-
The following only covers reorganization, not outright trimming
**PART I**
- [x] Move all the shap and related discussion to model exploration
- [x] Move model list in LM chapter to last chapter,…
-
I run a LLaVA system, as presented in this repository, in a Docker Compose setup using the official CUDA Docker images, and I run into an error on some systems with my custom-trained models.
On a server using…
-
Dear ComfyUI team,
I hope this email finds you well. My name is Richard, and I am one of the developers of Hunyuan DIT, an innovative and effective model that utilizes the DIT architecture. Our projec…
-
I tried converting the Google Gemma 2B models to TFLite, but the conversion ends in failure.
### 1. System information
- Ubuntu 22.04
- TensorFlow installation (installed with keras-nlp):
- TensorFlow l…
-
Can HyperAttention be used with a key_padding_mask to prevent padding tokens from being attended to in bidirectional attention? I understand this doesn't matter in the causal case, but is important fo…
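For context, a `key_padding_mask` is conventionally applied by setting the attention scores of padded key positions to `-inf` before the softmax, so those keys receive zero weight. A minimal numpy sketch of that mechanic (function name and shapes are my assumptions, not HyperAttention's API):

```python
import numpy as np

def apply_key_padding_mask(scores, key_padding_mask):
    """scores: (batch, q_len, k_len) raw attention scores.
    key_padding_mask: (batch, k_len), True where the key is padding.
    Masked positions are set to -inf before softmax, so padded keys
    get exactly zero attention weight."""
    masked = np.where(key_padding_mask[:, None, :], -np.inf, scores)
    # numerically stable softmax over the key axis
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)
```

In the bidirectional case this is the only thing stopping real tokens from attending to padding, which is why the question matters there and not in the causal left-to-right setting.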
-
The input generation, inference, and embeddings/logits extraction functions (as appropriate) in `tfsemb_main.py` should be moved into separate scripts for `causal`, `mlm`, and `seq2seq` models.
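A hedged sketch of what the split might look like: a thin dispatcher left in `tfsemb_main.py` that routes to per-model-type modules (the module names below are hypothetical, not the repository's actual layout):

```python
# Hypothetical dispatcher retained in tfsemb_main.py after the split.
# Each model family gets its own module with matching entry points.
def get_pipeline(model_type):
    """Return the module name handling the given model family."""
    pipelines = {
        "causal": "tfsemb_causal",
        "mlm": "tfsemb_mlm",
        "seq2seq": "tfsemb_seq2seq",
    }
    try:
        return pipelines[model_type]
    except KeyError:
        raise ValueError(f"unknown model type: {model_type}")
```

The shared input-generation helpers could stay in a common module that each of the three scripts imports.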
-
Hey,
Great to see LISA implemented here.
For background, I am trying to fine-tune models with LoRA and other techniques on domain data, but the task I am doing is causal LM, i.e. next-word predict…
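For reference, LoRA leaves the pretrained weight W frozen and learns a low-rank update B·A added on top, scaled by alpha/r. A minimal numpy sketch of the forward pass (the function name is mine; the scaling convention follows the LoRA paper and is not tied to any specific library):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """LoRA forward pass: frozen weight W plus low-rank update B @ A.
    x: (d_in,), W: (d_out, d_in), A: (r, d_in), B: (d_out, r).
    The update is scaled by alpha / r, as in the LoRA paper.
    B is typically initialized to zeros, so training starts at W."""
    r = A.shape[0]
    return W @ x + (alpha / r) * (B @ (A @ x))
```

Because only A and B are trained, the same idea applies whether the base task is causal LM or anything else; the low-rank update is agnostic to the objective.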