Create new functionality for WebDataScrapper

aureliowozhiak / DLaaS

Data Lake as a Service

Create new functionality for WebDataScrapper #38

Closed Priscaruso closed 1 year ago

Priscaruso commented 1 year ago

This issue adds new functionality to the WebDataScrapper extractor, which will do the following:

Get the HTML links of all pages

Priscaruso commented 1 year ago

I modified the originally intended functionality because most sites are single-page applications (SPAs) now. Instead of getting the links to all pages, it gets all the internal links from the current page. By passing these links as arguments to the WebPageDataScrappers class, we can navigate through all of them.
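To illustrate the idea of collecting internal links, here is a minimal sketch using only the Python standard library. It is not the code from this repository: the helper name `get_internal_links` and the `_AnchorCollector` class are hypothetical, and it assumes the page's HTML has already been obtained as a string (for an SPA that would mean the HTML after rendering, which this sketch does not handle).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _AnchorCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def get_internal_links(html, base_url):
    """Return sorted, de-duplicated absolute URLs on the same host as base_url."""
    collector = _AnchorCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    links = set()
    for href in collector.hrefs:
        # Resolve relative hrefs such as "/about" against the page URL.
        absolute = urljoin(base_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == base_host:
            # Drop the fragment so "#section" variants collapse to one URL.
            links.add(parsed._replace(fragment="").geturl())
    return sorted(links)


html = (
    '<a href="/about">About</a>'
    '<a href="https://example.com/blog#top">Blog</a>'
    '<a href="https://other.org/x">External</a>'
    '<a href="mailto:someone@example.com">Mail</a>'
)
internal = get_internal_links(html, "https://example.com/")
```

Each URL in `internal` could then be handed to the scraper class in turn. "Internal" here means "same host as the page", which is an assumption; subdomains would be treated as external.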

aureliowozhiak commented 1 year ago

Is it all done? Can we close this issue, @Priscaruso?

Priscaruso commented 1 year ago

@aureliowozhiak I was thinking about trying a recursive method, but I don't know if it will work, since we are writing generic methods. We can close it for now.

aureliowozhiak commented 1 year ago

Okay, so we can create a new issue for this recursive method. @Priscaruso
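For reference, the recursive approach mentioned in this thread might look roughly like the sketch below. It is only an illustration under stated assumptions: `crawl` and `get_links` are hypothetical names, and `get_links` stands in for whatever generic method returns the internal links of a page. A visited set and a depth limit keep the recursion from looping forever on pages that link back to each other.

```python
def crawl(start_url, get_links, max_depth=2, visited=None):
    """Recursively visit start_url and the pages reachable from it.

    get_links is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns an iterable of URLs found on that page.
    Returns the set of URLs that were visited.
    """
    if visited is None:
        visited = set()
    # Stop when the depth budget is used up or the page was already seen.
    if max_depth < 0 or start_url in visited:
        return visited
    visited.add(start_url)
    for link in get_links(start_url):
        crawl(link, get_links, max_depth - 1, visited)
    return visited


# A tiny in-memory "site" so the sketch runs without network access.
site = {
    "/": ["/a", "/b"],
    "/a": ["/", "/c"],
    "/b": [],
    "/c": ["/d"],
    "/d": [],
}
pages = crawl("/", lambda url: site.get(url, []), max_depth=2)
```

Because this is a depth-first walk, a page first reached along a long path is not revisited along a shorter one, so the depth limit is approximate; a breadth-first queue would be the alternative if exact hop counts matter.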