microsoft/LongRoPE
LongRoPE is a method that extends the context window of pre-trained LLMs to 2048k tokens.
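For context, context-window extension of this kind rescales the rotation frequencies of rotary position embeddings (RoPE) so that longer positions map into the range the model was trained on; LongRoPE searches for non-uniform per-dimension rescale factors. Below is a minimal NumPy sketch of RoPE with an optional per-frequency rescale vector. The function name, shapes, and `rescale` parameter are illustrative assumptions, not this repository's API.

```python
import numpy as np

def rope_rotate(x, positions, rescale=None, base=10000.0):
    """Apply rotary position embeddings (RoPE) to x.

    x: (seq_len, dim) array with even dim; positions: (seq_len,) token indices.
    rescale: optional (dim // 2,) per-frequency factors. Dividing each
    frequency by its factor shrinks rotation angles, stretching the
    effective context (LongRoPE-style non-uniform interpolation is a
    searched choice of these factors; illustrative here).
    """
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) / half)     # standard RoPE frequencies
    if rescale is not None:
        inv_freq = inv_freq / np.asarray(rescale)    # slow down rotations
    angles = np.outer(positions, inv_freq)           # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1[i], x2[i]) pair by its position-dependent angle.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)
```

With a uniform rescale factor of 2, position 2 rotates by the same angles as position 1 without rescaling, i.e. positions are effectively compressed into half the range.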
MIT License · 100 stars · 10 forks
Issues (newest first)
#15 · A small code modification to support a newer version of transformers · sysuyy · opened · 5 days ago · 0 comments
#14 · Can you open-source your training code? · yangqqq-yq · closed · 4 days ago · 0 comments
#13 · 2M rescale factor · momandai · opened · 1 month ago · 0 comments
#12 · How can I fine-tune llama2-7B with a 128K sequence length using only 8 A100 GPUs? · momandai · closed · 2 months ago · 0 comments
#11 · Why doesn't Phi-3 adopt the design of "initial tokens"? · Mooler0410 · closed · 2 months ago · 0 comments
#10 · Why is target_ids a clone of input_ids? · momandai · closed · 3 months ago · 0 comments
#9 · Bump torch from 2.1.0 to 2.2.0 · dependabot[bot] · opened · 3 months ago · 0 comments
#8 · Evolutionary search parameters · Shirobokov-Andrew · opened · 3 months ago · 2 comments
#7 · Unexpected keyword argument 'attn_implementation' · momandai · opened · 3 months ago · 1 comment
#6 · Update README and examples · Starmys · closed · 3 months ago · 0 comments
#5 · Update examples · Starmys · closed · 4 months ago · 0 comments
#4 · Code Refactor and Code Clean · JiahangXu · closed · 4 months ago · 0 comments
#3 · Delete customized codeql · JiahangXu · closed · 5 months ago · 0 comments
#2 · Merge LongRoPE branch · JiahangXu · closed · 5 months ago · 1 comment
#1 · Action required: migrate or opt-out of migration to GitHub inside Microsoft · microsoft-github-policy-service[bot] · closed · 7 months ago · 5 comments