Closed: vignesh1507 closed this 1 month ago
@vignesh1507 Thank you so much for your constructive points. Some of them have already been resolved in the new version we're releasing soon (0.3.6). Because of our move to async, we haven't done much work on the synchronous version, and we may stop maintaining it altogether. Most of your comments apply to that part. Anyway, we appreciate your input and will consider your suggestions in future releases. Thank you for your interest in our library.
Redundant kwargs in fetch_pages: The kwargs being passed in executor.map seem redundant, as they are being unpacked in the same format for every call. You can simplify this by passing **kwargs directly to the fetch_page_wrapper.
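A minimal sketch of that simplification, assuming a simplified shape for the code under discussion. The fetch_page body, the urls parameter, and the returned dict are hypothetical stand-ins, not the library's actual implementation. The idea is to bind the shared kwargs once with functools.partial so that executor.map only has to iterate over the URLs:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def fetch_page(url, **kwargs):
    # Hypothetical stand-in for the real per-URL fetch; it only echoes its inputs.
    return {"url": url, "options": dict(kwargs)}


def fetch_pages(urls, **kwargs):
    # Bind the shared keyword arguments once instead of building a
    # repeated argument list for every URL.
    fetch_page_wrapper = partial(fetch_page, **kwargs)
    with ThreadPoolExecutor() as executor:
        # executor.map yields results in the same order as the input URLs.
        return list(executor.map(fetch_page_wrapper, urls))
```

Calling fetch_pages(["https://example.com/a", "https://example.com/b"], timeout=5) passes timeout=5 to every fetch_page call without repeating it per URL.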
Potential for None Values in process_html: When calling process_html, if html is None (for instance, if the crawl fails), you may run into issues. Ensure that html is valid before passing it to process_html.
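One way to add that guard, sketched under assumptions: process_html here is a hypothetical stand-in with a made-up signature and return value, and safe_process is an illustrative helper name that does not exist in the library. The check treats both None and an empty string as a failed crawl:

```python
def process_html(url, html):
    # Hypothetical stand-in for the real processing step.
    return {"url": url, "length": len(html)}


def safe_process(url, html):
    # Skip processing when the crawl produced no HTML (None or empty string).
    if not html:
        return None
    return process_html(url, html)
```

Returning None keeps the failure visible to the caller; raising an exception or logging a warning would work equally well, depending on how fetch failures are handled elsewhere.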
Missing import json: You use json.dumps in your code but haven't imported the json module. Make sure to add this import at the top:
import json