Closed Rafiot closed 1 year ago
Just a note: to do something similar to what Scrapy allows when crawling a page to a specific depth, here is its default link parser: https://github.com/scrapy/scrapy/blob/master/scrapy/linkextractors/lxmlhtml.py
This is extremely similar to what we do in har2tree: https://github.com/Lookyloo/har2tree/blob/main/har2tree/nodes.py#L425
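For reference, the general idea behind this kind of link extraction can be sketched with the standard library alone. This is only an illustration, not the Scrapy or har2tree code: the names `LinkCollector` and `extract_links` are hypothetical, and the choice of tags (`a`, `area`), the fragment stripping, and the http/https filter are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect raw href values from <a> and <area> tags (assumed tag set)."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "area"):
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def extract_links(html: str, base_url: str) -> list[str]:
    """Return absolute, de-duplicated http(s) links found in `html`."""
    collector = LinkCollector()
    collector.feed(html)
    seen: set[str] = set()
    links: list[str] = []
    for href in collector.hrefs:
        # Resolve relative URLs against the page URL and drop the #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href.strip()))
        # Skip non-web schemes such as mailto: or javascript:.
        if absolute.startswith(("http://", "https://")) and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links
```

Crawling to a given depth would then amount to capturing each returned URL and repeating the extraction on the result, decrementing the depth each time.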
Done: https://github.com/Lookyloo/PlaywrightCapture/commit/d1dcd16ce10a956e4cae6777ca5a564bec132868