rufusrock / scraping_tools


Non-specific conditions #4

Closed bramtayl closed 1 year ago

bramtayl commented 1 year ago

There are a couple of places in the code where you seem to classify results with somewhat unspecific conditions. For example, a result is an ad if it has "sponsored" in it, or it's a rating if it has "." in it. It might be a good idea to avoid this if possible. After all, a product could have the word "sponsored" in its name, and something that is not a rating could have a "." in it. As an alternative, maybe use more specific CSS selectors. I use SelectorGadget to make CSS selectors. A selector that extracts only ratings is ".a-size-small span:nth-child(1) .a-size-base"
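To illustrate the concern, here is a minimal sketch of how substring conditions can misclassify text, compared with a stricter pattern. The function names, the sample strings, and the assumed rating format ("4.5 out of 5 stars") are hypothetical and are not taken from the repository's code.

```python
import re

# Assumed rating format, for illustration only: "4.5 out of 5 stars".
RATING_PATTERN = re.compile(r"^\s*([0-5](?:\.\d)?) out of 5 stars\s*$")


def is_ad_loose(text):
    """Unspecific check: any text mentioning 'sponsored' counts as an ad."""
    return "sponsored" in text.lower()


def is_rating_loose(text):
    """Unspecific check: any text containing '.' counts as a rating."""
    return "." in text


def parse_rating_strict(text):
    """Return the rating as a float, or None if the text is not a rating."""
    match = RATING_PATTERN.match(text)
    return float(match.group(1)) if match else None


# Hypothetical strings that might appear in a search result.
samples = [
    "4.5 out of 5 stars",
    "$19.99",
    "Sponsored Content Planner, 2024 ed.",
]

for text in samples:
    print(repr(text), is_ad_loose(text), is_rating_loose(text), parse_rating_strict(text))
```

In this sketch the loose checks flag the price as a rating and the product title as an ad, while the strict pattern accepts only the first string. Selecting the specific element with a CSS selector first, and only then parsing its text, narrows things further.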

rufusrock commented 1 year ago

Working on this!

rufusrock commented 1 year ago

Fixed the ad detection so it looks for specific ad elements. The "." check for ratings is just part of how I'm parsing the rating string.