IllDepence/unarXive
A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network
MIT License
259 stars
19 forks
issues
dataset
#24
flckv
opened
7 months ago
1
PDF version not specified
#23
yuezh000
opened
11 months ago
1
Replicate for recent data from Arxiv and Openalex
#22
shubhamagarwal92
closed
1 year ago
2
Handling of footnotes
#21
IllDepence
opened
1 year ago
0
Questions about the authors in this dataset
#20
Zivenzhu
closed
1 year ago
3
The error in paper structure
#19
Ma-Yongqiang
opened
1 year ago
2
Is there any efficient way to retrieve the OpenAlex label in the IMRaD set?
#18
SVLwoof
closed
1 year ago
2
Full dataset approximate size
#17
nicklausbrown
closed
1 year ago
1
Accessing actual figure image files
#16
IIZCODEII
closed
1 year ago
2
About citation matching
#15
Zivenzhu
closed
1 year ago
5
How can I get OpenAlex dump files?
#14
v-miazhang
opened
1 year ago
3
Fixed division by zero bug
#13
johankit
closed
1 year ago
1
DOI based matching should be done directly against OpenAlex DOIs (not using title)
#12
IllDepence
opened
1 year ago
0
For some papers, references are only matched up to part of the bib_entries list
#11
IllDepence
opened
1 year ago
0
cleanup converter comments
#10
dginev
closed
1 year ago
2
Bump lxml from 4.2.5 to 4.9.1
#9
dependabot[bot]
opened
2 years ago
0
How to separate the context sentences and the main citation sentence?
#8
fishiu
closed
2 years ago
1
Bump lxml from 4.2.5 to 4.6.5
#7
dependabot[bot]
closed
2 years ago
1
Is the data open source?
#6
fishiu
closed
2 years ago
2
Bump nltk from 3.4.5 to 3.6.5
#5
dependabot[bot]
opened
3 years ago
0
Bump lxml from 4.2.5 to 4.6.3
#4
dependabot[bot]
closed
2 years ago
1
Bump lxml from 4.2.5 to 4.6.2
#3
dependabot[bot]
closed
3 years ago
1
FORMULAS
#2
LuCeHe
closed
4 years ago
1
Dataset sample
#1
malteos
closed
4 years ago
1