alirezamika / autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python
MIT License
6.16k stars, 648 forks

Extracting webpages with a collection of items (structurally) #39

Closed: ws1088 closed 3 years ago

ws1088 commented 3 years ago

Hi, how do I extract a list of lists of text from a webpage with:

Name: Amy, Age: 13
Name: Bobby, Age: 33
Name: Chris, Age: 54

Ideally I would like the results to be:

[['Amy', '13'],
 ['Bobby', '33'],
 ['Chris', '54']
]

alirezamika commented 3 years ago

Hey! Can you share your webpage?

ZiggerZZ commented 3 years ago

Hi!

I'm interested in the same thing. An example: scrape item names and prices from https://www.ubereats.com/london/food-delivery/mcdonalds-balham-high-road/DJEVCmFOTpy3vR-CB7RCDA .

P.S. Actually, I'd only need to scrape items that are Sold out, but I guess it can be a second challenge.

alirezamika commented 3 years ago

Hi, use wanted_dict with aliases, then get the results grouped by rule and fine-tune the rules. Then you can use the group_by_alias results as you want.
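
To illustrate the last step, here is a minimal sketch of turning per-alias results into the list of lists asked for at the top of the thread. It assumes the scraper was built with something like scraper.build(url, wanted_dict={'name': ['Amy'], 'age': ['13']}) and queried with scraper.get_result_similar(url, group_by_alias=True). The dict below is hand-written sample data in that assumed shape, not real scraper output, and rows_from_grouped is a hypothetical helper, not part of autoscraper.

```python
# Sample data in the shape assumed for group_by_alias=True results:
# one list of matched texts per alias. Hand-written, not scraped.
grouped = {
    "name": ["Amy", "Bobby", "Chris"],
    "age": ["13", "33", "54"],
}


def rows_from_grouped(grouped, fields):
    """Turn per-alias columns into one row per item, in `fields` order.

    Hypothetical helper: it pairs values by position, so it is only
    valid when every alias matched the same number of items.
    """
    columns = [grouped[field] for field in fields]
    lengths = {len(column) for column in columns}
    if len(lengths) > 1:
        raise ValueError(f"aliases matched different numbers of items: {lengths}")
    return [list(row) for row in zip(*columns)]


rows = rows_from_grouped(grouped, ["name", "age"])
print(rows)
```

The length check is deliberate: pairing by position silently misaligns rows if one alias matched fewer items than another, so the sketch fails loudly instead.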

rooterkyberian commented 3 years ago

It would be good to have an example for this in the docs, since it seems like a common use case.

Is it possible for autoscraper to "scrape"

<div>Name: Amy, Age: 13</div>
<div>Name: Bobby,</div>
<div>Name: Chris, Age: 54</div>

as well? I.e., group data entries when there is a different number of instances for a particular field.
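
To make the problem concrete: if each field is collected as its own column, the three names and two ages above can no longer be paired by position. One way around it, sketched below, is to parse each div as a single record so a missing field becomes None. This uses only the standard library re module, not autoscraper, and parse_records and the regular expressions are assumptions tailored to exactly this snippet.

```python
import re

html = """
<div>Name: Amy, Age: 13</div>
<div>Name: Bobby,</div>
<div>Name: Chris, Age: 54</div>
"""

# One match per <div> body; fine for this flat snippet, not for nested HTML.
DIV = re.compile(r"<div>(.*?)</div>", re.S)
# "Key: value" pairs, where the value runs up to the next comma.
FIELD = re.compile(r"(\w+):\s*([^,<]+)")


def parse_records(html, fields=("Name", "Age")):
    """Return one row per <div>, with None where a field is absent."""
    records = []
    for body in DIV.findall(html):
        found = {key: value.strip() for key, value in FIELD.findall(body)}
        records.append([found.get(field) for field in fields])
    return records


records = parse_records(html)
print(records)
```

Because grouping happens per record rather than per column, Bobby keeps his own row even though his age is missing.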