anaskhan96 / soup

Web Scraper in Go, similar to BeautifulSoup
MIT License

Fixed issue with finding elements from multiple attr values #41

Closed mcrav closed 4 years ago

mcrav commented 5 years ago

In BeautifulSoup you can find elements like so: find_all('td', attrs={'class': ['class1', 'class2']}). This finds elements that have class1 and class2, regardless of what other classes they also have. This fix implements this behaviour.

anaskhan96 commented 4 years ago

While I understand the feature you want to implement, I don't think this is the right way to go about it. What you're aiming for is that a find succeeds for either class1 or class2 when class1 class2 is passed as the string. However, this would create problems if someone wanted an exact match of class1 class2. The right approach would probably be to keep an exact string match for an attribute's value, but abstract the find function so that it accepts both a string and a list of strings. That way, when the received param is a string (e.g. class1), it does an exact match - the way it functions now - but when it receives a list of strings (e.g. ['class1', 'class2']), it checks whether attributeContainsValue returns true for any of them.
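
To make the proposal above concrete, here is a minimal sketch in Go of the matching rule it describes. It is not soup's actual code: matchAttr is a hypothetical helper, and the signature given here for attributeContainsValue is an assumption (only the name appears in this thread). A plain string is compared exactly, while a list of strings matches if any entry is one of the whitespace-separated tokens of the attribute value.

```go
package main

import (
	"fmt"
	"strings"
)

// attributeContainsValue reports whether want is one of the
// whitespace-separated tokens in attr. The signature is an assumption
// made for this sketch, not taken from the soup source.
func attributeContainsValue(attr, want string) bool {
	for _, tok := range strings.Fields(attr) {
		if tok == want {
			return true
		}
	}
	return false
}

// matchAttr is a hypothetical helper. A string is matched exactly against
// the whole attribute value; a []string matches if any of its entries is
// contained in the attribute value.
func matchAttr(attr string, want interface{}) bool {
	switch w := want.(type) {
	case string:
		return attr == w
	case []string:
		for _, v := range w {
			if attributeContainsValue(attr, v) {
				return true
			}
		}
		return false
	default:
		// Unsupported parameter types never match.
		return false
	}
}

func main() {
	fmt.Println(matchAttr("class1 class2", "class1 class2"))              // true: exact match
	fmt.Println(matchAttr("class1 class2", "class1"))                     // false: a string must match exactly
	fmt.Println(matchAttr("class1 class3", []string{"class1", "class2"})) // true: class1 is present
	fmt.Println(matchAttr("class3", []string{"class1", "class2"}))        // false: neither is present
}
```

The type switch keeps the existing exact-match behaviour for callers that pass a single string, so only callers that opt in with a list get the looser "any of" semantics.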

anaskhan96 commented 4 years ago

A different implementation of this exists today, brought in by #17 through Find and FindStrict. Closing this PR.