loklak / loklak_server

Distributed Open Source twitter and social media message search server that anonymously collects, shares, dumps and indexes data http://api.loklak.org
GNU Lesser General Public License v2.1

StringIndexOutOfBoundsException in eventbrite parser #677

Closed Orbiter closed 8 years ago

Orbiter commented 8 years ago

Pull request https://github.com/loklak/loklak_server/pull/676 contained an example URL that exposes an error in the eventbrite parser.

Requesting http://localhost:9000/api/console.json?q=SELECT%20*%20FROM%20eventbrite%20WHERE%20url=%27https://www.eventbrite.com/e/?q=global-health-security-focus-africa-tickets-25740798421%27%27;

causes the following exception:

java.lang.StringIndexOutOfBoundsException: String index out of range: 19
    at java.lang.String.substring(String.java:1951)
    at org.loklak.api.search.EventBriteCrawlerService.crawlEventBrite(EventBriteCrawlerService.java:118)
    at org.loklak.api.search.ConsoleService.lambda$11(ConsoleService.java:330)
    at org.loklak.api.search.ConsoleService$$Lambda$15/746247411.apply(Unknown Source)
    at org.loklak.api.search.ConsoleService.console(ConsoleService.java:346)
    at org.loklak.api.search.ConsoleService.serviceImpl(ConsoleService.java:363)
    at org.loklak.server.AbstractAPIHandler.process(AbstractAPIHandler.java:162)
    at org.loklak.server.AbstractAPIHandler.doGet(AbstractAPIHandler.java:109)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:845)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1174)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
    at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:418)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1106)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.handler.IPAccessHandler.handle(IPAccessHandler.java:219)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:524)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:319)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:253)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.lang.Thread.run(Thread.java:745)

Orbiter commented 8 years ago

However, for http://localhost:9000/api/console.json?q=SELECT%20*%20FROM%20eventbrite%20WHERE%20url=%27https://www.eventbrite.fr/e/billets-europeade-2016-concert-de-musique-vocale-25592599153?aff=es2%27; the parser works. Please analyse the difference.
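As a starting point for that comparison, here is a minimal sketch that URL-decodes the two `q` parameters quoted above so the `url` literals can be read side by side. The class name `QueryDiff` and its helpers are hypothetical and not part of loklak_server. It only makes the textual differences visible (the doubled trailing quote and the `/e/?q=` path in the failing query); it does not establish which of them triggers the exception.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper for inspecting the two queries from this issue.
final class QueryDiff {
    // Value of the q parameter from the failing request, copied from the issue.
    static final String FAILING =
        "SELECT%20*%20FROM%20eventbrite%20WHERE%20url=%27https://www.eventbrite.com/e/"
        + "?q=global-health-security-focus-africa-tickets-25740798421%27%27;";

    // Value of the q parameter from the working request, copied from the issue.
    static final String WORKING =
        "SELECT%20*%20FROM%20eventbrite%20WHERE%20url=%27https://www.eventbrite.fr/e/"
        + "billets-europeade-2016-concert-de-musique-vocale-25592599153?aff=es2%27;";

    private QueryDiff() {}

    // Percent-decodes a query string as UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported by the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Returns everything after the first "url=" with the trailing ';' removed,
    // or "" if the query contains no "url=".
    static String urlLiteral(String decodedQuery) {
        int start = decodedQuery.indexOf("url=");
        if (start < 0) return "";
        String literal = decodedQuery.substring(start + "url=".length());
        return literal.endsWith(";") ? literal.substring(0, literal.length() - 1) : literal;
    }

    public static void main(String[] args) {
        System.out.println("failing: " + urlLiteral(decode(FAILING)));
        System.out.println("working: " + urlLiteral(decode(WORKING)));
    }
}
```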

jigyasa-grover commented 8 years ago

@Orbiter Fixed the NPEs and the other StringIndexOutOfBoundsExceptions in the eventbrite parser in PR #680.
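For readers unfamiliar with this class of bug, the sketch below shows one common way to guard `String.substring` and `String.indexOf` against missing markers and out-of-range indices when scraping a page. It is an illustration only: the class `SafeSubstring` and its methods are hypothetical and are not the code from PR #680.

```java
// Hypothetical defensive helpers; not the actual change made in PR #680.
final class SafeSubstring {
    private SafeSubstring() {}

    // Like s.substring(begin, end), but clamps both indices into [0, s.length()]
    // and returns "" for null input or an empty/inverted range instead of throwing.
    static String safeSubstring(String s, int begin, int end) {
        if (s == null) return "";
        int from = Math.max(0, Math.min(begin, s.length()));
        int to = Math.max(from, Math.min(end, s.length()));
        return s.substring(from, to);
    }

    // Returns the text between the first occurrence of `open` and the next
    // occurrence of `close`, or "" if the input is null or either marker is missing.
    static String between(String s, String open, String close) {
        if (s == null) return "";
        int start = s.indexOf(open);
        if (start < 0) return "";
        start += open.length();
        int stop = s.indexOf(close, start);
        if (stop < 0) return "";
        return s.substring(start, stop);
    }

    public static void main(String[] args) {
        // Marker present: the enclosed text is returned.
        System.out.println(between("<title>Event</title>", "<title>", "</title>"));
        // Marker absent (for example, an empty page): "" instead of an exception.
        System.out.println(between("", "<title>", "</title>").isEmpty());
        // End index beyond the string length is clamped rather than thrown.
        System.out.println(safeSubstring("short", 0, 19));
    }
}
```

Returning an empty string keeps the response shape stable when a field cannot be extracted, which is consistent with the empty fields in the output shown later in this thread, though whether the actual fix works this way is an assumption.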

The URL http://localhost:9000/api/console.json?q=SELECT%20*%20FROM%20eventbrite%20WHERE%20url=%27https://www.eventbrite.com/e/?q=global-health-security-focus-africa-tickets-25740798421%27%27; now yields the response below, simply because the event has passed.

{
  "data": [{"Event Brite Event Details": [
    {
      "creator": {
        "id": "1",
        "email": ""
      },
      "background_url": "",
      "social_links": [
        {
          "name": "Facebook",
          "id": "1"
        },
        {
          "name": "Twitter",
          "id": "2"
        }
      ],
      "end_time": "",
      "description": "",
      "privacy": "public",
      "type": "",
      "ticket_url": "https://www.eventbrite.com/e/?q=global-health-security-focus-africa-tickets-25740798421'#tickets",
      "event_url": "https://www.eventbrite.com/e/?q=global-health-security-focus-africa-tickets-25740798421'",
      "start_time": "",
      "location_name": "",
      "name": "",
      "logo": "",
      "topic": "",
      "id": "",
      "organizer_name": "",
      "state": "completed"
    },
    {
      "organizer_contact_info": "https://www.eventbrite.com/e/?q=global-health-security-focus-africa-tickets-25740798421'#lightbox_contact",
      "organizer_link": "https://www.eventbrite.com/e/?q=global-health-security-focus-africa-tickets-25740798421'#listing-organizer",
      "organizer_profile_link": "",
      "organizer_name": ""
    },
    [],
    [],
    [],
    [],
    [],
    [],
    []
  ]}],
  "metadata": {"count": 1},
  "session": {"identity": {
    "type": "host",
    "name": "127.0.0.1",
    "anonymous": true
  }}
}

jigyasa-grover commented 8 years ago

@Orbiter Closing this.