alirezamika / autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python
MIT License
6.16k stars 648 forks

Website Structure #58

Closed: sushidelivery closed this issue 2 years ago

sushidelivery commented 3 years ago

Hello! Thank you so much for sharing your work!

I wanted to ask: if I train my model on a website and that website later changes its structure and styling, will the model still work? Can I still get the same data, or will I need to re-train it?

sushidelivery commented 3 years ago

One more question: can I somehow get all the information that is on the page, or do I need to put everything I want into 'wanted_list'?

alirezamika commented 3 years ago

Hi there,

If the changes are small, such as minor changes in tag names, the scraper can still get the result if you adjust the attr_fuzz_ratio parameter. Otherwise, you should probably re-train it.
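
To make the idea of a fuzz ratio concrete, here is a minimal conceptual sketch using only the standard library. It is not autoscraper's implementation: `attrs_match`, the attribute dictionaries, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration. It only shows how lowering a threshold below 1.0 lets slightly changed attribute values still count as a match.

```python
from difflib import SequenceMatcher


def attrs_match(trained, current, fuzz_ratio=1.0):
    """Hypothetical helper: compare attribute values learned at training
    time against the attributes found on the current page.

    A fuzz_ratio of 1.0 demands exact equality; lower values tolerate
    small differences. This mimics the idea of a fuzzy threshold only,
    not autoscraper's actual matching code.
    """
    for name, old_value in trained.items():
        new_value = current.get(name, "")
        similarity = SequenceMatcher(None, old_value, new_value).ratio()
        if similarity < fuzz_ratio:
            return False
    return True


# Attributes remembered from the page the scraper was trained on (made up).
trained = {"class": "product-title"}

# The site made a small change to the class name.
slightly_changed = {"class": "product-title-v2"}

# The site was redesigned and the class name is unrelated.
redesigned = {"class": "hdr"}

print(attrs_match(trained, slightly_changed, fuzz_ratio=1.0))  # exact match required
print(attrs_match(trained, slightly_changed, fuzz_ratio=0.8))  # small change tolerated
print(attrs_match(trained, redesigned, fuzz_ratio=0.8))        # too different
```

In autoscraper itself, the threshold would be passed as the `attr_fuzz_ratio` argument when fetching results (for example to `get_result_similar`), with 1.0 as the strict default as far as I know; check the library's docstrings before relying on that. When the similarity falls well below any reasonable threshold, as in the redesigned case, re-training is the practical option.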

If the items you want to get can be specified by a regex, you can use a regex instead of copying all of them.
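
As a rough illustration of why a regex can replace a long list of copied values, the sketch below uses only the standard `re` module. The page texts and the price pattern are invented for the example, and the filtering loop is a stand-in for whatever matching autoscraper performs internally.

```python
import re

# Text snippets as they might appear in different elements of a page
# (made-up sample data).
page_texts = ["Home", "$12.99", "Contact", "$5.00", "About us", "$120.50"]

# Instead of listing every price by hand, describe them with one pattern:
# a dollar sign, one or more digits, a dot, and exactly two digits.
price_pattern = re.compile(r"\$\d+\.\d{2}")

# Keep only the snippets that match the pattern in full.
wanted = [text for text in page_texts if price_pattern.fullmatch(text)]

print(wanted)  # ['$12.99', '$5.00', '$120.50']
```

With autoscraper, the assumption is that the compiled pattern goes directly into the list, for example `wanted_list=[price_pattern]`, in place of the literal strings; this follows the reply above but is not shown in the thread.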