amalic opened this issue 6 years ago (status: Open)
Why do you think we need that?
And wouldn't the last two lines prevent nanopub servers from fetching nanopubs from each other?
The rationale is to prevent irrelevant web crawlers from generating tons of traffic.
Nanopubs-server would not be affected, since it does not process robots.txt files. This is aimed at web crawlers, at least the well-behaved ones that respect the content of robots.txt.
The above file is just an example (I think it was facebook.com/robots.txt).
OK, I see, but I think we should define a robots.txt that we ourselves also respect. And it should be allowed to write scripts that retrieve nanopubs, for example. I wouldn't want to declare this to be illegitimate.
Have you already experienced this problem of traffic by irrelevant webcrawlers with your server, or is this more of a preventive measure for the future?
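To make the trade-off concrete, here is a minimal sketch of how a well-behaved client would interpret a robots.txt that blocks one crawler while leaving everything else allowed, so scripts that retrieve nanopubs stay legitimate. The rules, the user-agent names (`ExampleBulkCrawler`, `ExampleNanopubScript`) and the URL are hypothetical and only for illustration; this is not the file proposed in this issue.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
# An empty "Disallow:" means nothing is disallowed for that group.
ROBOTS_TXT = """\
User-agent: ExampleBulkCrawler
Disallow: /

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    url = "https://nanopub-server.example.org/some-nanopub"  # placeholder URL
    for agent in ("ExampleBulkCrawler/2.0", "ExampleNanopubScript/0.1"):
        print(agent, "allowed:", may_fetch(parser, agent, url))
```

Note that these rules only bind clients that choose to check them; a client that never reads robots.txt is unaffected either way.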
Let's do this.
Nanopubs currently identifies as Apache-HttpClient/4.3.4 (java 1.5). Shall we simply use "Nanopubs/1.0"?
e.g.: