dadoonet / fscrawler

Elasticsearch File System Crawler (FS Crawler)
https://fscrawler.readthedocs.io/
Apache License 2.0

rest service hostname containing underscore yields error #474

Open shadiakiki1986 opened 6 years ago

shadiakiki1986 commented 6 years ago

This happened while testing against a Docker service named fscrawler_rest. A simple curl call to the REST endpoint,

    $ docker-compose exec elasticsearch1 curl "http://fscrawler_rest:8080"

returns the following error:

<html><head><title>Grizzly 2.3.28</title><style><!--div.header {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#003300;font-size:22px;-moz-border-radius-topleft: 10px;border-top-left-radius: 10px;-moz-border-radius-topright: 10px;border-top-right-radius: 10px;padding-left: 5px}div.body {font-family:Tahoma,Arial,sans-serif;color:black;background-color:#FFFFCC;font-size:16px;padding-top:10px;padding-bottom:10px;padding-left:10px}div.footer {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#666633;font-size:14px;-moz-border-radius-bottomleft: 10px;border-bottom-left-radius: 10px;-moz-border-radius-bottomright: 10px;border-bottom-right-radius: 10px;padding-left: 5px}BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;}B {font-family:Tahoma,Arial,sans-serif;color:black;}A {color : black;}HR {color : #999966;}--></style> </head><body><div class="header">Internal Server Error</div><div class="body"><b>java.net.URISyntaxException: Illegal character in hostname at index 16: http://fscrawler_rest:8080/</b><pre>     1: org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.getBaseUri(GrizzlyHttpContainer.java:457)
     2: org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:365)
     3: org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:224)
     4: org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:593)
     5: org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:573)
     6: java.lang.Thread.run(Thread.java:745)
</pre><b>Root Cause: java.net.URISyntaxException: Illegal character in hostname at index 16: http://fscrawler_rest:8080/</b><pre>     1: java.net.URI$Parser.fail(URI.java:2848)
     2: java.net.URI$Parser.parseHostname(URI.java:3387)
     3: java.net.URI$Parser.parseServer(URI.java:3236)
     4: java.net.URI$Parser.parseAuthority(URI.java:3155)
     5: java.net.URI$Parser.parseHierarchical(URI.java:3097)
     6: java.net.URI$Parser.parse(URI.java:3053)
     7: java.net.URI.<init>(URI.java:673)
     8: org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.getBaseUri(GrizzlyHttpContainer.java:455)
     9: org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:365)
    10: org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:224)
        ... 3 more</pre>Please see the log for more detail.</div><div class="footer">Grizzly 2.3.28</div></body></html>

Renaming the service from fscrawler_rest to fscrawlerrest fixes it.

dadoonet commented 6 years ago

I did a simple test like:

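    import java.net.URI;
    import java.net.URISyntaxException;
    import org.junit.Test;
    @Test
    public void testUri() throws URISyntaxException {
        // Single-string constructor: parses without error.
        new URI("http://fscrawler_test:80/");
        // Component constructor: throws URISyntaxException on the underscore.
        new URI("http", null, "fscrawler_test", 80, "/", null, null);
    }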

The first is OK, but the latter fails with: java.net.URISyntaxException: Illegal character in hostname at index 16: http://fscrawler_test:80/

Which is kind of weird, because the latter generates the same string as the former... o_O

I'm digging more.
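
A possible explanation, as a minimal sketch: per the java.net.URI Javadoc, the single-string constructor does not fail when the authority cannot be parsed as a server-based one; it keeps it as a registry-based authority instead, so the URI is accepted but has no host component. The class and method names below (UriUnderscoreDemo, hostOf, isServerBased) are made up for illustration only.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriUnderscoreDemo {

    // Host component as seen by the single-string form; null means the
    // authority was kept as an opaque, registry-based name.
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    // True if the authority can be parsed as [user-info@]host[:port].
    static boolean isServerBased(String uri) {
        try {
            URI.create(uri).parseServerAuthority();
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String withUnderscore = "http://fscrawler_test:80/";
        String withoutUnderscore = "http://fscrawlerrest:80/";

        // The underscore form parses, but only as a registry-based authority.
        System.out.println(URI.create(withUnderscore).getAuthority()); // fscrawler_test:80
        System.out.println(hostOf(withUnderscore));                    // null
        System.out.println(isServerBased(withUnderscore));             // false

        // Without the underscore, host and port are recognized as usual.
        System.out.println(hostOf(withoutUnderscore));                 // fscrawlerrest
        System.out.println(isServerBased(withoutUnderscore));          // true
    }
}
```

If that reading is right, the two constructors only look equivalent: the component form insists on a server-based authority, while the string form silently accepts a URI whose getHost() is null.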

dadoonet commented 6 years ago

I think this is something the JDK does not support, according to some feedback I got on Twitter: https://twitter.com/dadoonet/status/952214871298461696

Maybe I should add a check when validating the settings, before launching anything?
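
Such a check could look roughly like the sketch below. It is only an illustration: RestHostCheck and isUsableHost are hypothetical names, not existing FSCrawler code, and it assumes that building the URI with the component constructor (the one that fails in the test above) is a good enough proxy for what Jersey does at request time.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RestHostCheck {

    // Hypothetical helper, not part of FSCrawler: returns true if the JDK
    // accepts the host as a server-based authority for an http URI.
    static boolean isUsableHost(String host, int port) {
        if (host == null || host.isEmpty()) {
            return false;
        }
        try {
            // Same constructor shape as the failing call in the test above.
            new URI("http", null, host, port, "/", null, null);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String host : new String[] {"fscrawler_rest", "fscrawlerrest"}) {
            System.out.println(host + " -> " + isUsableHost(host, 8080));
        }
    }
}
```

Note that this would only validate a hostname configured in the settings; in the report above the failing name came from the client's request URL, so a settings check may not catch every case.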

dadoonet commented 6 years ago

I think that Vert.x might support this better. The error here comes from Jersey, and AFAIK there is no other method to start the server.

So I might fix this once #529 is implemented.