code4craft / webmagic

A scalable web crawler framework for Java.
http://webmagic.io/
Apache License 2.0
11.37k stars 4.18k forks

Refactored to remove multiple calls to the getSourceTexts() API #1137

Closed harikrishna553 closed 9 months ago

harikrishna553 commented 9 months ago

a. AbstractSelectable#getFirstSourceText() calls getSourceTexts() multiple times, which is not required for this use case.
b. Removed the unnecessary else block in AbstractSelectable#get().
c. AbstractSelectable#match() calls getSourceTexts() multiple times, which is not required for this use case.
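For illustration, here is a minimal, self-contained sketch of the kind of change described in this PR. It is not the actual webmagic source: the class `SelectableSketch` and its `calls` counter are hypothetical stand-ins, and only the method names getSourceTexts(), getFirstSourceText(), get() and match() are taken from the description. The idea is to read the result of getSourceTexts() into a local variable once and reuse it, and to return early instead of using an else branch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for AbstractSelectable; not the real webmagic class.
class SelectableSketch {
    private final List<String> sourceTexts;
    // Counts invocations of getSourceTexts() so the effect of the refactor is visible.
    int calls = 0;

    SelectableSketch(List<String> sourceTexts) {
        this.sourceTexts = sourceTexts;
    }

    // In a real implementation this call may be expensive (assumption).
    List<String> getSourceTexts() {
        calls++;
        return sourceTexts;
    }

    // (a) Call getSourceTexts() once and reuse the local reference.
    String getFirstSourceText() {
        List<String> texts = getSourceTexts();
        if (texts != null && !texts.isEmpty()) {
            return texts.get(0);
        }
        return null;
    }

    // (b) Early return makes a trailing else block unnecessary.
    String get() {
        List<String> texts = getSourceTexts();
        if (texts != null && !texts.isEmpty()) {
            return texts.get(0);
        }
        return null;
    }

    // (c) Same single-call pattern for the emptiness check.
    boolean match() {
        List<String> texts = getSourceTexts();
        return texts != null && !texts.isEmpty();
    }

    public static void main(String[] args) {
        SelectableSketch s = new SelectableSketch(Arrays.asList("first", "second"));
        System.out.println(s.getFirstSourceText());
        System.out.println(s.match());
        System.out.println("getSourceTexts() calls: " + s.calls);
    }
}
```

With this pattern each of the three methods invokes getSourceTexts() exactly once per call, which matters if a subclass computes the list lazily or rebuilds it on every invocation.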