Closed mario-mazo closed 2 years ago
Hello,
I'm thinking about using Crawly for a project, but I'm not sure what the best way to scrape dynamic URLs is.
I need to scrape www.something.com/site/AAA all the way through www.something.com/site/ZZZ. The trailing AAA-ZZZ part is a unique identifier.
Should I pass every identifier to start_urls, or should I fetch the pages inside parse_item?
Thanks
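There are two ways:
1. Build every URL up front and pass the list to start_urls.
2. Generate the next requests as you go, inside parse_item.
Which method to use depends on how many URL permutations you are looking at. If the number is absurdly high (hundreds of thousands of URLs), go with method 2.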
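For a sense of scale, here is a rough sketch in plain Elixir of how the URLs could be generated either way. It assumes the identifier is exactly three uppercase letters (26^3 = 17,576 URLs, well short of hundreds of thousands) and that the site is served over https; neither is stated in the question. DynamicUrls, start_urls/0 and next_id/1 are made-up names for illustration, not part of Crawly.

```elixir
defmodule DynamicUrls do
  # Hypothetical helper module, not part of Crawly.
  # The https:// scheme is an assumption; the host and path are from the question.
  @base "https://www.something.com/site/"

  # Method 1: build the full list up front, "AAA" through "ZZZ".
  # Assumes identifiers are exactly three uppercase ASCII letters.
  def start_urls do
    for a <- ?A..?Z, b <- ?A..?Z, c <- ?A..?Z do
      @base <> <<a, b, c>>
    end
  end

  # Method 2: compute only the identifier that follows the current one,
  # so the next request can be built on the fly. Returns nil after "ZZZ".
  def next_id(<<a, b, c>>) when c < ?Z, do: <<a, b, c + 1>>
  def next_id(<<a, b, ?Z>>) when b < ?Z, do: <<a, b + 1, ?A>>
  def next_id(<<a, ?Z, ?Z>>) when a < ?Z, do: <<a + 1, ?A, ?A>>
  def next_id("ZZZ"), do: nil

  # Convenience wrapper that turns an identifier into its page URL.
  def url_for(id), do: @base <> id
end
```

As far as I understand the spider callbacks, the list from DynamicUrls.start_urls() would be returned under the start_urls key from the spider's init, while next_id/1 would be called inside parse_item to build the one follow-up request for the page just scraped. Treat that wiring as an assumption and check it against the Crawly docs for your version.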