MaterialEyes / exsclaim

A toolkit for the automatic construction of self-labeled materials imaging datasets from scientific literature
GNU General Public License v3.0

Wiley scraper returns no articles #24

Open trevorspreadbury opened 2 years ago

trevorspreadbury commented 2 years ago

Describe the bug: The Wiley scraper fails to return any articles.

To reproduce the behavior:

  1. Create the following exsclaim query:
    {
        "name": "wiley-nano",
        "journal_family": "wiley",
        "maximum_scraped": 2,
        "sortby": "relevant",
        "query": {
            "search_field_1": {
                "term": "nano",
                "synonyms": ["nanoparticle"]
            }
        },
        "open": true,
        "save_format": ["boxes"],
        "logging": ["print", "exsclaim.log"]
    }
  2. Run the query using run.py.

Expected behavior: Journal article URLs are collected and searched for figures.

Outputs: Instead of search results, the request returns Cloudflare's browser-check page:

    Just a moment...
    Please turn JavaScript on and reload the page.
    Checking your browser before accessing onlinelibrary.wiley.com.
    Please enable Cookies and reload the page.
    This process is automatic. Your browser will redirect to your requested content shortly.
    Please allow up to 5 seconds…
    Redirecting…
    DDoS protection by Cloudflare
    Ray ID: 6f4c0f528d116326
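One way to make this failure visible, rather than silently returning no articles, would be to check the returned page for the challenge text before parsing it. The sketch below is only an illustration: `is_cloudflare_challenge` and `CHALLENGE_MARKERS` are hypothetical names, not part of exsclaim, and the marker strings are copied from the output above, so they may not match other versions of the challenge page.

```python
# Hypothetical helper, not part of exsclaim: flags HTML that looks like the
# Cloudflare browser-check page quoted in this issue.
CHALLENGE_MARKERS = (
    "Checking your browser before accessing",
    "DDoS protection by Cloudflare",
    "Please turn JavaScript on and reload the page",
)


def is_cloudflare_challenge(html: str) -> bool:
    """Return True if any known challenge phrase appears in the page text."""
    text = html.lower()
    return any(marker.lower() in text for marker in CHALLENGE_MARKERS)
```

A scraper could call this on each response body and log a warning or raise an error when it returns True, instead of treating the page as an empty result list.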

Environment (please complete the following information):

Additional context: Fixing this may require converting the Wiley scraper to a dynamic one that uses Selenium, as the RSC scraper does, or using other methods to avoid bot detection.
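A rough sketch of the Selenium approach follows, under several assumptions: `make_driver` and `fetch_rendered_html` are hypothetical names, not existing exsclaim functions; the challenge page is assumed to have the title "Just a moment..." (the first line of the output above); and a local Chrome installation is assumed. Headless browsers can still be flagged by Cloudflare, so this may not be sufficient on its own.

```python
import time

# Assumed title of the challenge page, taken from the first line of the
# output quoted in this issue.
CHALLENGE_TITLE_PREFIX = "Just a moment"


def make_driver():
    """Create a headless Chrome driver (assumes Selenium and Chrome are installed)."""
    # Imported lazily so fetch_rendered_html can be used with any driver-like object.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def fetch_rendered_html(driver, url, timeout=30.0, poll=1.0):
    """Load url and wait until the page title no longer looks like the challenge page."""
    driver.get(url)
    deadline = time.monotonic() + timeout
    while True:
        if not driver.title.startswith(CHALLENGE_TITLE_PREFIX):
            return driver.page_source
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still on the challenge page after {timeout} s: {url}")
        time.sleep(poll)
```

Usage would look like `driver = make_driver()` followed by `html = fetch_rendered_html(driver, search_url)` and `driver.quit()`, where `search_url` is whatever URL the existing Wiley scraper already builds.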