EleutherAI/pythia
The hub for EleutherAI's work on interpretability and learning dynamics
Apache License 2.0
2.14k stars, 156 forks
issues
Refactoring
#168
sunnyddelight
closed
5 hours ago
1
open-source the training data used between two adjacent checkpoints
#167
txy77
opened
18 hours ago
0
make README easier to follow
#166
Arvid-pku
opened
3 weeks ago
1
Issue while showering NLO events with NLO
#165
rash-eng
opened
3 weeks ago
0
How to use the Huggingface dataset /EleutherAI/pythia-memorized-evals in predictable-memorization?
#164
Happy2Git
opened
1 month ago
0
cache_dir cannot be the same as model name
#163
arunasank
opened
1 month ago
0
Pythia 12b flash config
#162
jvendrow
opened
1 month ago
0
Sparse
#161
sunnyddelight
closed
1 month ago
1
how to use Pythia
#160
gaohang
opened
1 month ago
0
Convert to GGUF
#159
yanxon
opened
2 months ago
0
Reshape error in batch viewer
#158
activatedgeek
closed
2 months ago
1
Update README.md
#157
borgr
closed
2 months ago
1
tokenizer.pad_token
#156
vincent317
opened
3 months ago
1
instruct-tuned pythia
#155
WilliamsToTo
opened
3 months ago
0
Correct link to huggingface
#154
l-ma
closed
3 months ago
1
Provide the shuffled index_mapping npy files for ease of reproducing training data
#153
ziqi-zhang
opened
3 months ago
0
Optimizer states in HF format
#152
seyuboglu
opened
4 months ago
1
Weird inconsistency in Tokenizer vocabulary
#151
javirandor
opened
4 months ago
1
Is there existing code to resume training from specific checkpoint?
#150
javirandor
closed
4 months ago
1
"gas" configuration doesn't do anything
#149
segyges
opened
5 months ago
0
Adding _warmup_mmap_file function missing from MMapIndexedDataset
#148
rdiehlmartinez
closed
4 months ago
1
Add training loss data
#147
pietrolesci
opened
5 months ago
1
Update README.md
#146
speed1313
opened
5 months ago
1
Would it be possible to share training loss curves on the original Pythia models?
#145
itsnamgyu
closed
5 months ago
4
[Pythia on Pile-Dedup] Training for ~1.5 epochs: how to identify the repeated sequences (i.e., the additional .5 epoch)?
#144
pietrolesci
opened
6 months ago
3
Fix ToC
#143
osanseviero
closed
6 months ago
1
Details about "EleutherAI/pythia-160m-seed*" models
#142
IanMagnusson
closed
7 months ago
3
Missing / undownloadable checkpoints on huggingface
#141
mirandrom
closed
5 months ago
3
.
#140
ParthaKrPaul
closed
7 months ago
2
Wrong files in eval?
#139
borgr
opened
7 months ago
0
Pythia or GPT-NeoX?
#138
borgr
closed
7 months ago
1
Deduplicated Pile dataset with Domain Attribution
#137
michaelduan8
closed
7 months ago
0
Replicating the Training Data Order
#136
prakharg24
closed
7 months ago
1
Inconsistent init methods of pythia-6.9b model
#135
mqyqlx
opened
7 months ago
2
Update README.md
#134
segyges
closed
7 months ago
0
Add checksum for data from huggingface
#133
segyges
closed
8 months ago
1
The value of weight decay
#132
yehuitang
closed
8 months ago
1
Update requirements.txt
#131
segyges
closed
8 months ago
0
Typos in readme.md
#130
segyges
closed
8 months ago
1
Model Initialization Question
#129
yanlai00
closed
8 months ago
1
Update readme to load preshuffled datasets
#128
uSaiPrashanth
closed
8 months ago
0
Has the data been shuffled?
#127
Lisennlp
opened
8 months ago
2
Reading data is slow!
#126
Lisennlp
opened
8 months ago
1
Automatically calculate shard size
#125
uSaiPrashanth
closed
8 months ago
0
Automatically determine shard size
#124
uSaiPrashanth
closed
8 months ago
1
Batch Viewer : Why Sequence Length 2049?
#123
prakharg24
closed
8 months ago
12
The performance about pythia and LLaMA model architecture
#122
peiyingxin
closed
8 months ago
1
Any results on the validation set?
#121
chujiezheng
opened
9 months ago
1
README Update
#120
StellaAthena
closed
8 months ago
1
Update README.md
#119
StellaAthena
closed
9 months ago
0