Closed by Narasimha1997 4 years ago
Sounds like a good idea. I think the rules could also come from different URLs. Suppose you want to build a price scraper: you could feed it samples from different websites, like Amazon, eBay, etc.
Yes! That would be good. This would be like a scraping marketplace: learn rules -> increment the learning (add more rules from different sources) -> publish the rules so others can consume them. Just like TensorFlow Hub and Model Zoo for deep learning, haha
Awesome!
As of now, the rules are formed in one pass from the targets specified in `wanted_list`, and the stack list is generated for those targets. Sometimes I need to update the existing stack list with new rules learnt from a different set of targets on the same URL. As seen in the `build` method, a new stack list is created every time `build` is called. Please provide an `update` method that updates the stack list by appending the rules learnt from the new set of targets. This would be very useful because it would let developers add new targets incrementally while retaining the older rules.
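
To make the request concrete, here is a rough sketch of the difference between the two behaviours. It does not use the library's real code: `RuleLearner`, `_learn_rules`, and the shape of a "rule" are all hypothetical stand-ins, and only the names `build`, `update`, `wanted_list`, and `stack_list` come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class RuleLearner:
    """Toy stand-in for the scraper; NOT the library's actual class."""

    stack_list: list = field(default_factory=list)

    def _learn_rules(self, html, wanted_list):
        # Hypothetical rule learner: a "rule" is just the target text and
        # the offset where it first appears in the page.
        return [
            {"target": target, "offset": html.find(target)}
            for target in wanted_list
            if target in html
        ]

    def build(self, html, wanted_list):
        # Behaviour described in this issue: every call replaces the stack list.
        self.stack_list = self._learn_rules(html, wanted_list)
        return self.stack_list

    def update(self, html, wanted_list):
        # Requested behaviour: keep the older rules and append only new ones.
        for rule in self._learn_rules(html, wanted_list):
            if rule not in self.stack_list:
                self.stack_list.append(rule)
        return self.stack_list


page = "<h1>Widget</h1><span class='price'>$10</span>"

learner = RuleLearner()
learner.build(page, ["Widget"])   # learns the first set of targets
learner.update(page, ["$10"])     # adds rules for a second set, keeps the first
print([rule["target"] for rule in learner.stack_list])
```

Skipping rules that are already present is one possible design choice, so that calling `update` twice with the same targets does not grow the stack list; a plain append would also satisfy the request as written.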