-
Hello. I simply ran example.py and hit the error below in the "=====**SelfExtend using Torch**======" part:
```
Traceback (most recent call last):
  File "./LongLM/example.py", line 112, in …
```
-
Hello :)
Thank you for the excellent work and for sharing your code. I've learned a lot and have a few questions about the paper and settings:
- In Figures 2 and 3, what specifically do "prompt" …
-
Thanks for your code. I ran into the following exception while trying to extend the context length of Qwen1.5-14B-Chat. Do you know how I can fix it? Many thanks!
```python
Traceback (most…
```
-
Hi! I have a question that may seem simple, but I think I'm overlooking something.
Assume Phi-2's context window is 2K. When we apply a group size ($G_s$) of 4 and neighbor tokens ($w_n$) of 512, a…
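For reference, the paper gives the maximum extended context length as $(L - w_n) \cdot G_s + w_n$, where $L$ is the pretrained window. A quick sanity check in Python for the numbers in this question (the helper name is mine):

```python
def max_extended_context(pretrained_len, group_size, neighbor_window):
    """Maximum context under SelfExtend: neighbor tokens keep exact
    positions, and the remaining positions are shared group_size-to-1."""
    return (pretrained_len - neighbor_window) * group_size + neighbor_window

# Phi-2 example from the question: 2K window, G_s = 4, w_n = 512
print(max_extended_context(2048, 4, 512))  # → 6656
```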
-
Hi! I love your work and code implementation. I learned a lot.
I have a couple of questions about the code implementation.
https://github.com/datamllab/LongLM/blob/6e25a310a3aa9f49b0c74f9a277d40d897e97c2…
-
Hi, how did you evaluate on LongBench?
I tried to map your Llama to the extended version with https://github.com/datamllab/LongLM/blob/6b841932d5267e610a65eb228923e16746270dce/llama_example.py#L40 it ge…
-
Is there an example of how the extension can be done on the Gemma models? Also, once the models are extended, how can they be exported for use with Ollama? Ollama uses the GGUF format.
-
In the paper [LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning](https://arxiv.org/pdf/2401.01325.pdf), the authors describe a method to extend the context-window of any rope-based model…
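The bi-level position mapping at the heart of the method can be sketched as follows (a minimal illustration, not the repo's actual patch; the function name and the continuity offset are my own reconstruction from the paper):

```python
def self_extend_rel_pos(q_pos, k_pos, group_size, neighbor_window):
    """SelfExtend-style relative position for one (query, key) pair:
    exact within the neighbor window, floor-divided by group_size
    beyond it, with an offset keeping the two regimes continuous."""
    rel = q_pos - k_pos
    if rel < neighbor_window:
        return rel  # neighbor tokens: normal attention, exact positions
    # distant tokens: grouped attention; the shift makes the mapped
    # position at rel == neighbor_window equal to neighbor_window itself
    return rel // group_size + (neighbor_window - neighbor_window // group_size)

# Relative distance 1000 with G_s = 4, w_n = 512 maps to 1000 // 4 + 384 = 634,
# so distances up to ~4x the pretrained window stay inside the trained range.
print(self_extend_rel_pos(1000, 0, group_size=4, neighbor_window=512))
```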
-
With transformers 4.36.2 it doesn't work, but with 4.32.0 it does.
The only change I made was replacing "import llama_self_extend_patch as LlamaSE" in "llama_example.py" with "import llama_self_extend_patch_4_36 as LlamaSE".
YL-9 updated
3 months ago
-
model: qwen1.5 14b chat
auto_gptq: 0.8.0dev+cu118, 0.7.0dev+cu118
quantize code:
```
quantize_config = BaseQuantizeConfig(
    bits=4,          # quantize model to 4-bit
    group_size=128,  # it is rec…
```