pattersonkl / google-refine

Automatically exported from code.google.com/p/google-refine

Sort as date fails with no error message on misformed data #257

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
I'll need to try and figure out what the bad data was, but I'm logging this now 
just so it doesn't get lost.  I sorted on a column and clicked the "Date" 
option in the sort dialog.  The browser page went into its "spin forever" mode 
because the server request died with the exception below.

07:30:13.173 [                  command] Exception caught (53ms)
java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Date
    at com.google.refine.sorting.DateCriterion$1.compareKeys(DateCriterion.java:64)
    at com.google.refine.sorting.BaseSorter$ComparatorWrapper.compare(BaseSorter.java:101)
    at com.google.refine.sorting.BaseSorter.compare(BaseSorter.java:169)
    at com.google.refine.sorting.SortingRowVisitor$1.compare(SortingRowVisitor.java:85)
    at com.google.refine.sorting.SortingRowVisitor$1.compare(SortingRowVisitor.java:75)
    at java.util.Arrays.mergeSort(Arrays.java:1293)
    at java.util.Arrays.mergeSort(Arrays.java:1282)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.mergeSort(Arrays.java:1281)
    at java.util.Arrays.sort(Arrays.java:1210)
    at java.util.Collections.sort(Collections.java:159)
    at com.google.refine.sorting.SortingRowVisitor.end(SortingRowVisitor.java:75)
    at com.google.refine.browsing.util.ConjunctiveFilteredRows.accept(ConjunctiveFilteredRows.java:68)
    at com.google.refine.commands.row.GetRowsCommand.internalRespond(GetRowsCommand.java:130)
    at com.google.refine.commands.row.GetRowsCommand.doPost(GetRowsCommand.java:68)
    at com.google.refine.RefineServlet.service(RefineServlet.java:170)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
    at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
    at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:938)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:755)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

Original issue reported on code.google.com by tfmorris on 26 Nov 2010 at 5:57

GoogleCodeExporter commented 8 years ago
I've fixed one possible cause, although I was unable to reproduce the problem 
first.  My hypothesis is that the makeKey method was falling through and 
returning the original String instead of a Date.
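
To illustrate the hypothesis, here is a minimal sketch of a key maker that never falls through with the raw String. It is not the actual Refine source: the class DateKeySketch and the bodies of makeKey, compareKeys and sortAsDates are hypothetical, and it assumes Java 8 and ISO-8601 timestamps purely for the example.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

// Hypothetical sketch; these names are illustrative, not the Refine source.
final class DateKeySketch {

    // Returns a Date key, or null when the value cannot be read as a date.
    // The point is that the method never returns the original String,
    // which a later (Date) cast in the comparator would reject.
    static Date makeKey(Object value) {
        if (value instanceof Date) {
            return (Date) value;
        }
        if (value instanceof String) {
            try {
                // Assumption: ISO-8601 instants such as 2010-11-26T05:57:00Z.
                return Date.from(Instant.parse(((String) value).trim()));
            } catch (DateTimeParseException e) {
                return null; // malformed text is treated as "no date"
            }
        }
        return null; // any other type is also treated as "no date"
    }

    // Null-safe comparison: rows without a usable date sort last.
    static int compareKeys(Date a, Date b) {
        if (a == null && b == null) return 0;
        if (a == null) return 1;
        if (b == null) return -1;
        return a.compareTo(b);
    }

    // Sorts a copy of the cells by their date keys.
    static List<Object> sortAsDates(List<Object> cells) {
        List<Object> sorted = new ArrayList<>(cells);
        sorted.sort((x, y) -> compareKeys(makeKey(x), makeKey(y)));
        return sorted;
    }

    public static void main(String[] args) {
        List<Object> cells = Arrays.<Object>asList(
                "2011-08-26T04:24:00Z", "not a date", "2010-11-26T05:57:00Z");
        System.out.println(sortAsDates(cells));
    }
}
```

With keys built this way, a malformed cell ends up at the bottom of the sort instead of triggering a ClassCastException in the comparator.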

Original comment by tfmorris on 27 Nov 2010 at 1:57

GoogleCodeExporter commented 8 years ago
I have a similar case with an import of an initial JSON file.  The column did 
show green numbers, except for a few cells that held "null" values.  For those 
"null" string values in the Freebase "type" namespace, I used Edit in the text 
facet to blank them out.  When I ran Sort and chose the "numbers" radio 
button, things froze, and I have attached a log showing the exception in this 
particular case.  The JSON file was the actual schema dump of /common types in 
Freebase, with the number value being the 
/freebase/type_profile/instance_count, just as in the query [1].  I then chose 
to transform cells to numbers in that column, and the Sort by numbers worked 
fine after that.

So I guess it would be really nice to have a warning to the user along the 
lines of: "Hey, you still have some string values in this column; you might 
want to choose transform cells to numbers and run this sort again. You can 
always undo if we got this assumption wrong."
It was a general lockup that could be avoided, I think, with more error 
handling and warnings to the user.
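
As a rough sketch of what such a check could look like before a numeric sort runs, assuming cells are plain Java objects: the class SortPrecheck and both of its methods are hypothetical names made up for this example, not anything in Refine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; SortPrecheck is an illustrative name, not Refine code.
final class SortPrecheck {

    // Returns the indexes of cells that are non-blank but not Numbers,
    // for example leftover "null" strings or numbers still stored as text.
    static List<Integer> nonNumericRows(List<Object> cells) {
        List<Integer> offenders = new ArrayList<>();
        for (int i = 0; i < cells.size(); i++) {
            Object v = cells.get(i);
            boolean blank = v == null
                    || (v instanceof String && ((String) v).trim().isEmpty());
            if (!blank && !(v instanceof Number)) {
                offenders.add(i);
            }
        }
        return offenders;
    }

    // Builds a warning for the user, or an empty string when the column
    // is safe to sort as numbers.
    static String warningFor(List<Object> cells) {
        List<Integer> offenders = nonNumericRows(cells);
        if (offenders.isEmpty()) {
            return "";
        }
        return offenders.size() + " cell(s) in this column are still strings; "
                + "consider transforming cells to numbers and sorting again.";
    }

    public static void main(String[] args) {
        List<Object> cells = Arrays.<Object>asList(12L, "null", "", null, 3.5, "42");
        System.out.println(warningFor(cells));
    }
}
```

Blank cells are deliberately not counted as offenders here, since blanked-out values were part of the scenario described above; whether they should be is a design choice.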

1.

{
  "id":      "/",
  "type":    [],
  "name":    null,
  "creator": null,
  "/type/namespace/keys": [{
    "value":  null,
    "namespace": {
      "id":            null,
      "name":          null,
      "type":          [],
      "timestamp":     null,
      "/freebase/domain_profile/hidden": null,
      "/type/domain/types": [{
        "id":            null,
        "/freebase/type_profile/instance_count": null,
        "limit":         1000
      }]
    }
  }]
}

Original comment by thadguidry on 26 Aug 2011 at 4:24

Attachments: