google-research-datasets/wiki-atomic-edits
A dataset of atomic Wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
106 stars
8 forks
issues
What about editions?
#7
callzhang
opened
2 years ago
0
Extracting differences between revisions from wikipedia dumps
#6
slemonide
closed
4 years ago
1
Fix count of edits by adding `million`
#5
gurunathparasaram
opened
5 years ago
3
Regarding pre-processing tools used
#4
ajaynagesh
closed
5 years ago
1
corpus contents
#3
BonnieLWebber
closed
5 years ago
1
deletions.tsv have insertion examples
#2
pcyin
closed
6 years ago
2
English portion of the dataset seems to be corrupted
#1
pcyin
closed
6 years ago
2