Open xkopenreview opened 1 week ago
for approach 2, a url like https://escholarship.org/api/pageData/uc/cognitivesciencesociety/46/0 returns the page data for a single issue; the trailing 46/0 corresponds to that year's issue
content.issue.sections[int].articles[int] has id, title, authors and abstract, so the paper abstract can be looked up by matching on the title
id is in the form qt{actual id}, and the actual id can be used to link to the paper's page at https://escholarship.org/uc/item/${actual id}
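a rough sketch of the lookup described above, assuming the pageData response has already been fetched and parsed into a dict, and that `content` is a top-level key in it (not verified against the live API). `find_article` and `normalize` are hypothetical names, not existing functions in the codebase, and exact title matching after whitespace/case normalization is just one possible matching rule:

```python
from typing import Optional

ITEM_URL = "https://escholarship.org/uc/item/{}"


def normalize(title: str) -> str:
    # Lowercase and collapse runs of whitespace so minor formatting
    # differences between sources do not prevent a match.
    return " ".join(title.lower().split())


def find_article(page_data: dict, title: str) -> Optional[dict]:
    """Return the abstract and item URL for the article matching `title`.

    `page_data` is assumed to be the parsed JSON of one issue, with
    articles under content.issue.sections[i].articles[j].
    """
    wanted = normalize(title)
    sections = page_data.get("content", {}).get("issue", {}).get("sections", [])
    for section in sections:
        for article in section.get("articles", []):
            if normalize(article.get("title", "")) != wanted:
                continue
            raw_id = article["id"]
            # ids look like "qt{actual id}"; strip the "qt" prefix if present.
            item_id = raw_id[2:] if raw_id.startswith("qt") else raw_id
            return {
                "abstract": article.get("abstract"),
                "url": ITEM_URL.format(item_id),
            }
    return None
```

since the match is on title, the caller needs the title (and the venue, to pick the right issue url) in addition to the original link.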
currently abstract extraction requires only the url; this will need to be changed to include at least the title and venue
mindmodeling.org seems to be permanently offline, and the majority of abstract extraction failures come from this domain, so it's better to have a rule to handle paper html from this domain
there are two possible approaches:
the abstract from either link could have incorrect word breaks