liweijian closed this issue 7 years ago
Hi,
There are at least two options, depending on what you want.
attribute or R.attribute gives you the raw text of the class attribute:
"<p class='Hello Hey'>World!</p>" |> parse $ "p" |> R.attribute "class";;
- : string = "Hello Hey"
The difference between the two is that attribute returns an option, so it will be Some "Hello Hey" above, and None if the attribute is absent, while R.attribute will throw an exception if the attribute is missing (R stands for "require").
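To make the option-returning variant concrete, here is a minimal sketch that pattern-matches on the result of attribute. The helper name describe_class is made up for illustration and is not part of Lambda Soup; the sketch assumes the lambdasoup package is installed and that the input contains a <p> element.

```ocaml
open Soup

(* Hypothetical helper: report the class attribute of the first <p>,
   using the option-returning [attribute] so a missing attribute is
   handled without an exception. *)
let describe_class html =
  match attribute "class" (parse html $ "p") with
  | Some value -> "class = " ^ value
  | None -> "no class attribute"

let () =
  print_endline (describe_class "<p class='Hello Hey'>World!</p>");
  print_endline (describe_class "<p>World!</p>")
```

With R.attribute, the second call would raise instead of reaching a None branch, so the option form is the safer choice when the attribute may be absent.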
classes gives you the list of classes found in the class attribute:
"<p class='Hello Hey'>World!</p>" |> parse $ "p" |> classes;;
- : string list = ["Hello"; "Hey"]
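Since classes returns an ordinary string list, the usual List functions apply to it. As a small sketch, a membership check could look like the following; has_class is a hypothetical name chosen for this example, not a Lambda Soup function.

```ocaml
open Soup

(* Hypothetical helper: does the first <p> in [html] carry class [name]? *)
let has_class name html =
  parse html $ "p" |> classes |> List.mem name

let () =
  let html = "<p class='Hello Hey'>World!</p>" in
  Printf.printf "Hey: %b, Bye: %b\n" (has_class "Hey" html) (has_class "Bye" html)
```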
Thank you for your quick reply. What I actually want is to get the text of a p element by class in a large HTML document.
Finally I got the answer:
"<p class='Hello Hey'>World!</p>" |> parse $ ".Hello.Hey" |> R.leaf_text;;
Ah, yes, I see what you mean now :)
The only thing I would add is that if your <p> element can have child elements, you may want to do
(* ... *) $ ".Hello.Hey" |> texts |> String.concat ""
http://aantron.github.io/lambda-soup/#VALtexts
I'm actually not certain leaf_text is a good idea to even have in the API, but it's there...
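To illustrate the point about child elements, here is a rough sketch with a <p> that contains a nested <b>. The helper name all_text is hypothetical. As far as I understand it, leaf_text has no single leaf to follow in this case and returns None, whereas texts collects every text node under the element.

```ocaml
open Soup

(* Hypothetical helper: concatenate all text nodes under the first
   element matching [selector]. *)
let all_text html selector =
  parse html $ selector |> texts |> String.concat ""

let () =
  let html = "<p class='Hello Hey'>Hello, <b>World</b>!</p>" in
  print_endline (all_text html ".Hello.Hey")
```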
@aantron
Sorry to bother you again. I was wondering how I can use Lambda Soup to do something like getElementById()?
<div id='one'> 11</div><div id='two'> 22 </div>
For example, how do I get the text of the element with a specific id?
No worries, this is not bothering :)
You can do it like this:
let soup = parse "<div id='one'> 11</div><div id='two'> 22 </div>" in
soup $ "#two" |> R.leaf_text
(I've split the code up into two lines, compared to before). This gives
- : string = " 22 "
In general, you may want to refer to the list of CSS selectors, whether here, or in your favorite CSS tutorial :)
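If the id might not exist in the document, a sketch along these lines avoids the exception by using $?, which returns an option instead of raising. The name text_by_id is made up for this example (loosely mirroring getElementById), and trimming the surrounding whitespace is an assumption about what is wanted.

```ocaml
open Soup

(* Hypothetical helper: text of the element with the given id, trimmed,
   or None when no element has that id. *)
let text_by_id soup id =
  match soup $? ("#" ^ id) with
  | Some element -> Some (element |> texts |> String.concat "" |> String.trim)
  | None -> None

let () =
  let soup = parse "<div id='one'> 11</div><div id='two'> 22 </div>" in
  match text_by_id soup "two" with
  | Some text -> print_endline text
  | None -> print_endline "not found"
```

Building the selector by string concatenation is only reasonable for simple ids; an id containing characters that are special in CSS selectors would need escaping.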
According to the Readme, we can get the text of an element by its class. I was wondering how to get the text for a list of classes? For example