radiolarian / AO3Scraper

A Python scraper for getting fan fiction content and metadata from Archive of Our Own.

BS4 error #4

Closed: ricketybridge closed this issue 6 years ago

ricketybridge commented 6 years ago

When I run the command in your README (i.e. getting work IDs for 100 Sherlock fics), I get this error:

Traceback (most recent call last):
  File "ao3_work_ids.py", line 261, in <module>
    main()
  File "ao3_work_ids.py", line 257, in main
    process_for_ids(header_info)
  File "ao3_work_ids.py", line 240, in process_for_ids
    ids = get_ids(header_info)
  File "ao3_work_ids.py", line 108, in get_ids
    soup = BeautifulSoup(req.text, "lxml")
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bs4/__init__.py", line 165, in __init__
    % ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

Happens with both Python 2 and 3.

ricketybridge commented 6 years ago

Apparently this was not a bug in the script: the lxml parser library that BeautifulSoup asks for was not installed in my Python environment. Resolved by installing lxml via pip.
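
In case it helps anyone who hits the same traceback, here is a minimal sketch (Python 3 only) of checking for lxml before building the soup. `pick_parser` is a hypothetical helper of my own, not something in ao3_work_ids.py; it falls back to the "html.parser" builder that ships with Python, which can parse some pages slightly differently from lxml.

```python
import importlib.util


def pick_parser(preferred="lxml", fallback="html.parser"):
    """Return `preferred` if that module can be imported, else `fallback`.

    Hypothetical helper for illustration; not part of AO3Scraper.
    """
    # find_spec returns None when no top-level module of that name is installed.
    if importlib.util.find_spec(preferred) is not None:
        return preferred
    return fallback


parser = pick_parser()
print("Using parser:", parser)
```

The returned name could then be passed as the second argument in place of the hard-coded "lxml", e.g. `BeautifulSoup(req.text, parser)`. If you want the script to behave as written, the simpler fix is still `pip install lxml` for the same interpreter that runs the script.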