pranftw / openreview_scraper

Scrape papers from OpenReview using OpenReview API

Doesn't work for ICLR 2024? #3

Open NoviScl opened 2 months ago

Great repo!

I noticed that it doesn't seem to work for ICLR 2024 when I tried the following:

years = [
    '2024'
]

conferences = [
    'ICLR'
]
keywords = [
    'language model'
]

def modify_paper(paper):
  paper.forum = f"https://openreview.net/forum?id={paper.forum}"
  paper.content['pdf'] = f"https://openreview.net{paper.content['pdf']}"
  return paper

# what fields to extract
extractor = Extractor(fields=['forum'], subfields={'content':['title', 'keywords', 'abstract', 'pdf', 'match']})

# if you want to select papers manually among the scraped papers
# selector = Selector()

# select all scraped papers
selector = None

scraper = Scraper(
    conferences=conferences,
    years=years,
    keywords=keywords,
    extractor=extractor,
    fpath='examples.csv',
    fns=[modify_paper],
    selector=selector,
)

# adding filters to filter on
scraper.add_filter(title_filter)
scraper.add_filter(keywords_filter)
scraper.add_filter(abstract_filter)

scraper()

But it works fine when I change the year from 2024 to 2023. Any idea why?
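
One way to narrow this down might be to compare the venue IDs that the two runs would end up querying. The sketch below assumes the common OpenReview naming scheme `<group>/<year>/Conference` (for example `ICLR.cc/2023/Conference`); the `VENUE_GROUPS` table and the `venue_id` helper are hypothetical and are not taken from this repo. If the IDs differ only in the year, then a change in how OpenReview serves newer venues (possibly a different API version) would be a plausible next thing to check.

```python
# Hypothetical helper, not part of openreview_scraper:
# build the OpenReview venue ID for each conference/year pair.
# Assumes the "<group>/<year>/Conference" naming convention.
VENUE_GROUPS = {
    'ICLR': 'ICLR.cc',  # assumed group prefix for ICLR
}

def venue_id(conference, year):
    """Return the assumed venue ID, e.g. 'ICLR.cc/2024/Conference'."""
    return f"{VENUE_GROUPS[conference]}/{year}/Conference"

conferences = ['ICLR']
years = ['2023', '2024']  # the working year and the failing year

venue_ids = [venue_id(c, y) for c in conferences for y in years]
for vid in venue_ids:
    print(vid)
```

This only builds the ID strings and makes no network calls, so it cannot confirm the cause by itself; it just makes explicit what the two configurations have in common and where they differ.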