KWARC/llamapun
common language and mathematics processing algorithms, in Rust
https://kwarc.info/systems/llamapun/
GNU General Public License v3.0
v3 of the paragraph dataset #22
Closed. dginev closed 5 years ago
dginev commented 5 years ago
Selects only the first paragraph in an AMS environment
Also includes structural markup of semantically stable sections, such as "Introduction" and "Abstract"
Reduces the size of the arXiv paragraph dataset to about 1/3
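
The selection rules above can be illustrated with a minimal Rust sketch. It is not llamapun's actual API: the `Item` type, the `env_id` field, the `STABLE_HEADINGS` list and the `select_items` function are all hypothetical names introduced for illustration. It also assumes that paragraphs outside any AMS environment pass through unchanged, which the description does not state.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified document items (not llamapun's data model).
#[derive(Debug, Clone, PartialEq)]
enum Item {
    /// A section heading, e.g. "Introduction".
    Heading(String),
    /// A paragraph; `env_id` identifies the enclosing AMS environment
    /// instance (theorem, proof, ...), or `None` if it is outside any.
    Paragraph { env_id: Option<usize>, text: String },
}

/// Assumed list of "semantically stable" headings; only the two named
/// in the description are included here.
const STABLE_HEADINGS: [&str; 2] = ["abstract", "introduction"];

/// Keeps the first paragraph of each AMS environment instance and the
/// headings of stable sections, in document order.
fn select_items(items: &[Item]) -> Vec<Item> {
    let mut seen_envs: HashSet<usize> = HashSet::new();
    let mut selected = Vec::new();
    for item in items {
        match item {
            Item::Heading(title) => {
                let normalized = title.trim().to_lowercase();
                if STABLE_HEADINGS.iter().any(|h| *h == normalized) {
                    selected.push(item.clone());
                }
            }
            Item::Paragraph { env_id: Some(id), .. } => {
                // `insert` returns true only the first time an id is seen,
                // so later paragraphs of the same environment are skipped.
                if seen_envs.insert(*id) {
                    selected.push(item.clone());
                }
            }
            // Assumption: paragraphs outside environments are kept as-is.
            Item::Paragraph { env_id: None, .. } => selected.push(item.clone()),
        }
    }
    selected
}
```

Calling `select_items` on a document whose proof spans three paragraphs would keep only the first of them, which is one plausible way the dataset could shrink as described.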