ldbc / ldbc_snb_docs

Specification of the LDBC Social Network Benchmark suite
https://ldbcouncil.org/ldbc_snb_docs/
Apache License 2.0
53 stars 14 forks source link

Query 1 return types do not match driver expected return types #34

Closed tomersagi closed 10 years ago

tomersagi commented 10 years ago

hi, According to: [https://github.com/ldbc/ldbc_driver/blob/master/src/main/java/com/ldbc/driver/workloads/ldbc/snb/interactive/LdbcQuery1Result.java] The birthday and creation date should be long, but according to the documentation they should be Date and DateTime respectively. If this is indeed to be long, what long are you expecting? Thanks

alexaverbuch commented 10 years ago

You're right, the driver needs to be updated.

It is on our (LDBC) roadmap to decide on a Date/DateTime format, but in the interim we are using http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#currentTimeMillis()

I'll leave this issue open, as it really needs to be made a priority, but for now please treat it as above.

Is this doable for now? Really sorry about the inconvenience, we have a lot to do at the moment but I'll try to get this prioritized.

tomersagi commented 10 years ago

I'm assuming you mean [http://docs.oracle.com/javase/7/docs/api/java/util/Date.html#getTime()] ?

alexaverbuch commented 10 years ago

You're right. System.currentTimeMillis() == Date.getTime(), but only for RIGHT NOW :)

tomersagi commented 10 years ago

Hmph, I 'm getting validation errors on the birthday field due to hour differences. I'm using UTC to get the timestamp from the date, what are you using?

alexaverbuch commented 10 years ago

what do you mean by "how are you using"?

alexaverbuch commented 10 years ago

FYI, https://github.com/ldbc-dev/ldbc_snb_workload_interactive_neo4j/blob/master/data-import/src/main/java/com/ldbc/snb/interactive/neo4j/load/LdbcSocialNetworkCsvFileInserters.java#L18L21

tomersagi commented 10 years ago

Great, that’s exactly what I was looking for. I was using JODA which is TimeZone aware.

tomersagi commented 10 years ago

Hmph. Still getting validation errors. I think we still have a time zone issue. This is what I'm doing:

  1. Parsing the Date / DateTime using the constants and functions from the example you gave.
  2. Loading the data into the graph as a Date object.
  3. Running the query and parsing the result Date (Birthdate / CreationDate) using .getTime() to a long value.
  4. Groaning at this error: Expected Result: [LdbcQuery1Result{friendId=5497558221127, friendLastName='Burns', distanceFromPerson=2, friendBirthday=434674800000, friendCreationDate=1292245449000, ... Actual Result : [LdbcQuery1Result{friendId=5497558221127, friendLastName='Burns', distanceFromPerson=2, friendBirthday=434671200000, friendCreationDate=1292241849000

Looks like an hour's difference, maybe daylight savings time?

Ideas? Thanks Tomer

alexaverbuch commented 10 years ago

Hi Tomer, You seem to have found a bug. I think the reason that this issue has not popped up internally to LDBC is we are in the same timezone... frustrating.

I just looked in the persons .csv outputted by the data generator and saw entries like this: 2010-11-21T23:04:05Z This is incorrect, as the Z should not be a literal 'Z', it should be the timezone. Moreover, I don't know what the T is?

I suspect what is happening is the data generator is doing this when serializing:

        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        Date birthday = // assign date here
        sdf.format(birthday);

===> 2014-10-07T16:59:03Z

When I think it should be doing this:

        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-ddHH:mm:ssZ");
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        Date birthday = // assign date here
        sdf.format(birthday);

===> 2014-10-0714:59:03+0000

More about it here: http://stackoverflow.com/questions/19112357/java-simpledateformatyyyy-mm-ddthhmmssz-gives-timezone-as-ist

@ArnauPrat thoughts?

And, side note, we should be publishing things like TimeZone and Charset in our spec.

Thanks so much for all the bug reports Tomer.

ArnauPrat commented 10 years ago

Hi @alexaverbuch You are right, datagen is doing what you said. Dou you think your proposal will solve the problem?

alexaverbuch commented 10 years ago

When I do this:

        Date originalBirthday = new Date();

        SimpleDateFormat gmtDateFormatter = new SimpleDateFormat("yyyy-MM-ddHH:mm:ssZ");
        gmtDateFormatter.setTimeZone(TimeZone.getTimeZone("GMT"));
        System.out.println("GMT TimeZone = "+gmtDateFormatter.getTimeZone());

        SimpleDateFormat gmt10DateFormatter = new SimpleDateFormat("yyyy-MM-ddHH:mm:ssZ");
        gmt10DateFormatter.setTimeZone(TimeZone.getTimeZone("GMT+10"));
        System.out.println("GMT+10 TimeZone = " + gmt10DateFormatter.getTimeZone());

        String gmtFormattedDate = gmtDateFormatter.format(originalBirthday);

        Date gmtBirthday = gmtDateFormatter.parse(gmtFormattedDate);
        Date gmt10Birthday = gmt10DateFormatter.parse(gmtFormattedDate);

        String gmtFormattedDate2 = gmtDateFormatter.format(gmtBirthday);
        String gmt10FormattedDate2 = gmt10DateFormatter.format(gmt10Birthday);

        System.out.println("gmtFormattedDate = "+gmtFormattedDate);
        System.out.println("gmtFormattedDate2 = " + gmtFormattedDate2);
        System.out.println("gmt10FormattedDate2 = "+gmt10FormattedDate2);

The output is:

    GMT TimeZone = sun.util.calendar.ZoneInfo[id="GMT",offset=0,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
    GMT+10 TimeZone = sun.util.calendar.ZoneInfo[id="GMT+10:00",offset=36000000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
    gmtFormattedDate = 2014-10-0716:18:30+0000
    gmtFormattedDate2 = 2014-10-0716:18:30+0000
    gmt10FormattedDate2 = 2014-10-0802:18:30+1000
alexaverbuch commented 10 years ago

So I think yes. This has much in common with the Charset issue I've been dealing with recently (e.g., making sure UTF-8 is used everywhere). We need to:

  1. Agree on a TimeZone
  2. Agree on a DateTime format (I think it's fine now, minus the 'Z' issue)
  3. Make sure times are always exported in that format
  4. Document that TimeZone

Otherwise users will encounter issues due to time conversation during the parsing phase, as @tomersagi has reported.

ArnauPrat commented 10 years ago

@alexaverbuch : OK, we speak and agree in the steps to follow

ArnauPrat commented 10 years ago

@alexaverbuch I would just use GMT for the time format (to choose one), and document this in the spec (I'll do so). Do you agree?

tomersagi commented 10 years ago

Bah, still having issues with this. Date comparison is ok now, but datetime is still off. Expected Result: [... friendBirthday=388454400000, friendCreationDate=1268990052422 Actual Result : [...friendBirthday=388454400000, friendCreationDate=1268990462000

Here is the code I'm using to load the data to the graph:

private final static String DATE_TIME_FORMAT_STRING = "yyyy-MM-dd'T'HH:mm:ss.sss+0000";
private final static SimpleDateFormat DATE_TIME_FORMAT = new SimpleDateFormat(DATE_TIME_FORMAT_STRING);
private final static String DATE_FORMAT_STRING = "yyyy-MM-dd";
private final static SimpleDateFormat DATE_FORMAT = new SimpleDateFormat(DATE_FORMAT_STRING);
...
DATE_TIME_FORMAT.setTimeZone(TimeZone.getTimeZone("GMT"));
DATE_FORMAT.setTimeZone(TimeZone.getTimeZone("GMT"));
...
if (entry.length() > 10) {
                dateTime = DATE_TIME_FORMAT.parse(entry);
            } else {
                dateTime = DATE_FORMAT.parse(entry);
            }

And here is the code used for retrieval:

...((Date) v.getProperty("birthday")).getTime(), 
((Date) v.getProperty("creationDate")).getTime(),...

My guess is that the Date.getTime() is using GMT+2 instead of GMT. Any ideas how to fix it? Thanks,

alexaverbuch commented 10 years ago

@tomersagi, Please change DATE_TIME_FORMAT_STRING = "yyyy-MM-dd'T'HH:mm:ss.sss+0000" to DATE_TIME_FORMAT_STRING = "yyyy-MM-dd'T'HH:mm:ss.SSSZ" and then try again

tomersagi commented 10 years ago

Fixed, thanks!