alirezamika / autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python
MIT License
6.22k stars, 653 forks

It is unable to scrape <li> #100

Open amztc34283 opened 5 days ago

amztc34283 commented 5 days ago
[Screenshot 2024-10-07 at 8 40 01 PM]
wanted_list = ["Design, develop, test, refactor and scale backend implementations of new and existing consumer product features"]

scraper = AutoScraper()
result = scraper.build(url, wanted_list)

I am able to scrape the element in the wanted_list, but similar elements are not scraped. Are there any tips or tricks that could fix this?

alirezamika commented 5 days ago

Please provide your full code, including the URL.

amztc34283 commented 4 days ago
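from autoscraper import AutoScraper

url = "https://careers.chime.com/en/jobs/4225356002/backend-engineer/"
wanted_list = ["Design, develop, test, refactor and scale backend implementations of new and existing consumer product features"]

scraper = AutoScraper()
result = scraper.build(url, wanted_list)  # learn scraping rules from the single sample item
print(result)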

Link: https://careers.chime.com/en/jobs/4225356002/backend-engineer/

alirezamika commented 4 days ago

What is your expected output?

amztc34283 commented 1 day ago

My expected output is the content of all the <li> elements under the same <ul>, which is: Design, develop ... Work with ... Collaborate with ... Proactively find ...

alirezamika commented 22 hours ago

You can try the contain_sibling_leaves parameter of get_result_similar.

result = scraper.get_result_similar(url, contain_sibling_leaves=True)
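
To make the expected output concrete, here is a minimal stdlib-only sketch of what the "sibling leaves" of the wanted item are in this case: every <li> under the same <ul>. It does not use AutoScraper and is not a description of how the library works internally. SAMPLE_HTML, SiblingListItems and sibling_items are hypothetical names, and the markup is a shortened stand-in for the real job page, which may be structured differently.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the job page; the real markup may differ.
SAMPLE_HTML = """
<ul>
  <li>Design, develop, test</li>
  <li>Work with product</li>
  <li>Collaborate with peers</li>
</ul>
<ul>
  <li>Unrelated item</li>
</ul>
"""


class SiblingListItems(HTMLParser):
    """Collect the text of every <li>, grouped by its enclosing <ul>.

    Assumes flat (non-nested) lists, which is enough for this illustration.
    """

    def __init__(self):
        super().__init__()
        self.groups = []      # one list of item texts per <ul>
        self._in_li = False   # True while inside an open <li>
        self._buffer = []     # text fragments of the current <li>

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.groups.append([])
        elif tag == "li" and self.groups:
            self._in_li = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_li:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._in_li:
            self.groups[-1].append("".join(self._buffer).strip())
            self._in_li = False


def sibling_items(html, wanted):
    """Return all <li> texts from the <ul> that contains `wanted`."""
    parser = SiblingListItems()
    parser.feed(html)
    for group in parser.groups:
        if wanted in group:
            return group
    return []


items = sibling_items(SAMPLE_HTML, "Design, develop, test")
print(items)
```

Only the items that share a <ul> with the wanted text are returned; the item in the second list is left out. That is the result the contain_sibling_leaves=True call above is meant to produce on the real page.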