asreview / synergy-dataset

SYNERGY - Open machine learning dataset on study selection in systematic reviews
Creative Commons Zero v1.0 Universal

Improve title search on OpenAlex #102

Open EmilyWes opened 20 hours ago

EmilyWes commented 20 hours ago

Currently, when we search OpenAlex by title (enrich.py), we remove the following characters: """()[]{}'@#:;"%&`’,.?!/\^®"""

Some initial testing showed that removing the apostrophe (') can result in a paper not being found. I assume OpenAlex has since improved its search, including which characters it crashes on, so we can also update the list of special characters to remove.
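A rough sketch of what I have in mind, to make the removal list easy to adjust while testing. Note that `clean_title` and its `keep` parameter are hypothetical names for illustration and not the actual code in enrich.py; only the character list is copied from above. It does not call OpenAlex, it only shows the title cleaning step.

```python
# Characters currently stripped before searching, copied from the list above.
CHARS_TO_REMOVE = "()[]{}'@#:;\"%&`’,.?!/\\^®"


def clean_title(title: str, keep: str = "") -> str:
    """Strip special characters from a title, except those listed in `keep`.

    Hypothetical helper for illustration; not the function used in enrich.py.
    """
    to_remove = "".join(c for c in CHARS_TO_REMOVE if c not in keep)
    # str.maketrans with a third argument maps each listed character to None.
    return title.translate(str.maketrans("", "", to_remove))


title = "Children's sleep: a review"
print(clean_title(title))            # Childrens sleep a review
print(clean_title(title, keep="'"))  # Children's sleep a review
```

With something like this we could test each character separately against OpenAlex and only keep the ones in the list that still cause problems. Whether the typographic apostrophe (’) should be kept as well is something the same testing would need to show.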