Closed ghandic closed 6 years ago
Good find. I'll look into this.
Now it is not working with multiple class items, as it was in the previous build. In source html tags looks like this:
<div class="rasp_place s-place"/>
And I've searched them with this code:
block.Find("div", "class", "rasp_place s-place")
In current build my old code doesn't work anymore, only this works:
block.Find("div", "class", "rasp_place")
Also I think that you should revert to the previous logic as now it became not obvious:
<li class="btn_rasp inactive">
In current build all items with class btn_rasp
will be selected, but I don't need items with class inactive
. And I have no option to not include them, only to filter via attributes map.
Do a find on those with ‘inactive’?
The only way to find them in the current version is like this:
attrs := showtimeBlock.Attrs()["class"]
if strings.Contains(attrs, "inactive") {
continue
}
@Salmondx You have a point. The best way right now is to revert the commit and look into the issue from another angle.
However, I feel like we should first discuss the significance of finding by a single class, and then proceed from there.
This is meant to emulate BeautifulSoup so I guess thats the importance 😄
@ghandic Exactly, but they have a pretty strange api: https://stackoverflow.com/questions/34356912/is-there-any-strict-findall-function-in-beautifulsoup/34357876
That may true but as I say it’s trying to emulate it in go so the API should be similar
@anaskhan96 do you have any ideas? I could make a PR if we discuss the API
I looked across other scrapers' implementation and concluded that the Find
function should remain as it is. This package was started to emulate the interface of BeautifulSoup (similar function names, conjoined function calls) as mentioned in the README, not how its functions work internally, so I'll stick with that.
However, this issue shall remain open for further discussion and input from other users.
May be we should make a Select
method with the desired functionality? Like in the link above.
It's a good idea, although we would have to discuss ways to parse the different kinds of string that Select
would receive as an argument.
I think that we should use a string as before:
soup.Select("div", "class", "class-1 class-2 class-3")
With strict matching
I think we should go about adding FindStrict
and FindAllStrict
which do the job, while retaining the interface similar to Find
and FindAll
.
soup.FindStrict("div", "class", "class-1 class-2 class-3")
Let's reserve Select
for a possible future implementation of BeautifulSoup's select
.
Ok, sounds good!
Currently
Find("a", "class", "message")
would only work if it was<a class="message"></a>
but would not work on<a class="message input-message"></a>
even though they are both of class message.Could this be added?