Right now we just assume everything on the web is scrapable, which breaks some sites' rules (robots.txt in particular). We're a small open source team, so it's not a big deal yet. This ticket is to create a new service that BookmarkService.java can use to determine whether a URL is scrapable, on top of the checks from the user.
Requirements
[ ] needs #242
[ ] Create a new Java class WebCheckService.java (or a better name).
[ ] Scaffold the following functions (a fuller sketch follows the notes below):
public boolean isScrapable(String url) { }
public String getRobotsTxt() { }
public List<RobotAgent> parseRobots(String robotsTxt) { }
[ ] Add scaffolding for tests
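Since the ticket asks for test scaffolding too, a minimal JUnit 5 sketch might look like the following. WebCheckServiceTest, the case names, and the fixture string are all placeholders, and the first two cases still need the robots.txt fetch stubbed so they stay off the network.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.List;
import org.junit.jupiter.api.Test;

// Placeholder scaffold for WebCheckService tests; every name and
// fixture below is a suggestion, assuming JUnit 5.
class WebCheckServiceTest {

    private final WebCheckService service = new WebCheckService();

    @Test
    void allowsPathNotMentionedInRobotsTxt() {
        // TODO: stub getRobotsTxt so this does not hit the network.
        assertTrue(service.isScrapable("https://example.com/public/page"));
    }

    @Test
    void rejectsPathDisallowedForWildcardAgent() {
        // TODO: stub getRobotsTxt to return "User-agent: *\nDisallow: /private/".
        assertFalse(service.isScrapable("https://example.com/private/page"));
    }

    @Test
    void parseRobotsYieldsOneRobotAgentPerUserAgentLine() {
        String robotsTxt = "User-agent: *\nDisallow: /private/";
        List<RobotAgent> agents = service.parseRobots(robotsTxt);
        assertEquals(1, agents.size());
    }
}
```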
I have tried to lay out the scaffold functions as a train of thought for how the service might work; open to more ideas and artistic liberties. BookmarkService calls isScrapable(url) when a user claims a Bookmark is scrapable in the request, i.e. we confirm this -> {
getRobotsTxt();
parseRobots();
// read through the list of RobotAgent.
// ensure that the path we are scraping is public.
return true or false.
}
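For reference, here is one way that flow could hang together. This is a minimal sketch rather than a final design: it assumes a small RobotAgent holder type (which does not exist yet), fetches robots.txt with java.net.http.HttpClient, and only honors Disallow rules for the wildcard User-agent. The method signatures differ slightly from the scaffold above and are just as open to debate.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;

// Sketch only: RobotAgent pairs one "User-agent" value with the
// "Disallow" rules listed under it.
record RobotAgent(String userAgent, List<String> disallowedPaths) { }

class WebCheckService {

    private final HttpClient http = HttpClient.newHttpClient();

    public boolean isScrapable(String url) {
        try {
            URI uri = URI.create(url);
            String robotsTxt = getRobotsTxt(uri.getScheme() + "://" + uri.getHost());
            List<RobotAgent> agents = parseRobots(robotsTxt);
            String path = (uri.getPath() == null || uri.getPath().isEmpty()) ? "/" : uri.getPath();
            // Read through the list of RobotAgent and ensure the path we
            // are scraping is public, checking only the wildcard agent here.
            for (RobotAgent agent : agents) {
                if (!agent.userAgent().equals("*")) continue;
                for (String disallowed : agent.disallowedPaths()) {
                    if (!disallowed.isEmpty() && path.startsWith(disallowed)) {
                        return false;
                    }
                }
            }
            return true;
        } catch (Exception e) {
            // Conservative default: if we cannot check, do not scrape.
            return false;
        }
    }

    public String getRobotsTxt(String origin) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(origin + "/robots.txt")).build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        // Treat a missing robots.txt (e.g. 404) as "no restrictions".
        return response.statusCode() == 200 ? response.body() : "";
    }

    public List<RobotAgent> parseRobots(String robotsTxt) {
        // Naive line-based parse; real robots.txt has more edge cases
        // (grouped user-agents, Allow rules, comments, wildcards).
        List<RobotAgent> agents = new ArrayList<>();
        String currentAgent = null;
        List<String> disallowed = new ArrayList<>();
        for (String line : robotsTxt.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.toLowerCase().startsWith("user-agent:")) {
                if (currentAgent != null) {
                    agents.add(new RobotAgent(currentAgent, disallowed));
                    disallowed = new ArrayList<>();
                }
                currentAgent = trimmed.substring("user-agent:".length()).trim();
            } else if (trimmed.toLowerCase().startsWith("disallow:")) {
                disallowed.add(trimmed.substring("disallow:".length()).trim());
            }
        }
        if (currentAgent != null) {
            agents.add(new RobotAgent(currentAgent, disallowed));
        }
        return agents;
    }
}
```

One design choice worth discussing on this ticket: failures here (bad URL, network error) return false, erring on the side of not scraping; we could instead decide an unreachable robots.txt means "allow".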