jeroen / mongolite

Fast and Simple MongoDB Client for R
https://jeroen.github.io/mongolite/
287 stars 65 forks source link

Query dates before 1970 returns ".000Z" #171

Open LucasMelo3 opened 5 years ago

LucasMelo3 commented 5 years ago

The date is correctly inserted into mongoDB, where I can verify using the MongoDB Compass. Its: 1965-02-19 23:00:00.000 (date Format)

When I query the document which contains the date, the date comes as ".000Z". After looking in the internet and doing some tests, I found out the reason is because the dates are before 1970. Over 1970 it works perfectly.

There is any solution for this problem? I really need to query dates before 1970 which are in my database. Thank you very much for attention!

image

jeroen commented 5 years ago

I don't understand, can you include an example with code?

LucasMelo3 commented 5 years ago

@jeroen Thank you for your fast reply !

I found out that the bug came when using mongo options(date_as_char = TRUE), so dates under 1970 would return as ".000Z". Please find below the working example in R:

library(mongolite)
  MyCollection <- mongo(collection = "MyCollection", db = "MyDB",
                   url = "myServerURL")  

MyCollection$insert('{
"Start":{"$date": "1965-02-19T23:00:00Z"},              
"End":{"$date": "2010-07-12T22:00:00Z"},   
"Group":"SS3"
}')

MyCollection$find('{"Group":"SS3"}')

mongo_options(date_as_char = TRUE)

MyCollection$find('{"Group":"SS3"}')

Note that in the first query the dates comes as expected, but when the choice is to get them in string format, and the second query is done, the bug occurs.

Thank you for your attention!

stennie commented 5 years ago

@LucasMelo3 Can you include more details on your environment in R as reported by sessionInfo() with mongolite loaded?

I just tried your example with Mongolite 2.0 and R 3.5.2 on MacOS (x86_64) and couldn't reproduce the error:

> library(mongolite)

> MyCollection <- mongo(collection = "MyCollection", db = "MyDB")

> MyCollection$find('{"Group":"SS3"}')
                Start                 End Group
1 1965-02-20 09:00:00 2010-07-13 08:00:00   SS3

> mongo_options(date_as_char = TRUE)
$log_level
[1] "INFO"

$bigint_as_char
[1] FALSE

$date_as_char
[1] TRUE

> MyCollection$find('{"Group":"SS3"}')
                     Start                      End Group
1 1965-02-19T23:00:00.000Z 2010-07-12T22:00:00.000Z   SS3

Regards, Stennie

LucasMelo3 commented 5 years ago

@stennie Thanks for your interest in helping!

My session info:

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252 
[2] LC_CTYPE=Portuguese_Brazil.1252   
[3] LC_MONETARY=Portuguese_Brazil.1252
[4] LC_NUMERIC=C                      
[5] LC_TIME=Portuguese_Brazil.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

other attached packages:
[1] mongolite_2.0

loaded via a namespace (and not attached):
[1] compiler_3.5.0 tools_3.5.0    yaml_2.1.19   
[4] jsonlite_1.5   openssl_1.0.1

And I tried it again and still same error.

stennie commented 5 years ago

@LucasMelo3 Unfortunately this appears to be a limitation of the standard C runtime library on Windows. The Windows implementation of the gmtime() function (as used by date_as_char in Mongolite) currently does not support dates earlier than 1970:

If sourceTime represents a date before midnight, January 1, 1970, gmtime returns NULL. There is no error return.

Your error is reproducible on Windows 10 with R 3.5.2 and Mongolite 2.0.

I'm not sure what the expected code fix should be for Windows.

As a workaround, you could format dates as ISO strings using the server-side $dateToString aggregation expression:

MyCollection$aggregate('[
    { "$match": {
        "Group":"SS3"
    }},
    { "$project": {
        "Start": { "$dateToString": { "format": "%Y-%m-%dT%H:%M:%S.%LZ", "date": "$Start" }},
        "End":   { "$dateToString": { "format": "%Y-%m-%dT%H:%M:%S.%LZ", "date": "$End"   }}        
    }}
]')

Note: if you are using MongoDB 4.0 or newer, the format specification is optional and defaults to "%Y-%m-%dT%H:%M:%S.%LZ (which I used explicitly in my example above).

Regards, Stennie

jeroen commented 5 years ago

Thanks @stennie for debugging! Does the C driver provide alternatie method to print dates to strings that does not rely on gmtime?

LucasMelo3 commented 5 years ago

@stennie thanks for providing a workaround, but unfortunately your fix is not feasible since I have to rewrite tons of complex nested queries from my work system that is already online.

But if I upgrade to MongoDB 4.0, can keep the query as they are?

MyCollection $ find ('{"Group": "SS3"}')

without the mongo_options (date_as_char = TRUE), and will it work?

EDIT: I tried with mongoDB 4.0.5 and still not workig, the default date format comes as: POSIXct[1:1], format: "1965-02-19 21:00:00"

stennie commented 5 years ago

Does the C driver provide alternatie method to print dates to strings that does not rely on gmtime?

@jeroen It looks like the C driver currently doesn't have any relevant public functions. libbson has some private functions in bson-iso8601.c which are used when converting BSON dates from/to Extended JSON. For going from BSON to Extended JSON, _bson_iso8601_date_format() relies on gmtime_s and strftime(), so seems likely to have the same issue on Windows.

One of my colleagues suggested to fallback to SYSTEMTIME using the Windows API instead of C API: https://docs.microsoft.com/en-us/windows/desktop/Intl/retrieving-time-and-date-information (GetTimeFormatEx, GetDateFormatEx).

I've also raised a feature suggestion in the MongoDB issue tracker as this seems like a useful helper function: CDRIVER-2945: Add public function for converting from BSON dates to ISO8601 strings.

Regards, Stennie

stennie commented 5 years ago

thanks for providing a workaround, but unfortunately your fix is not feasible since I have to rewrite tons of complex nested queries from my work system that is already online.

@LucasMelo3 If changes to your application queries are not feasible, the only short term workaround I can think of would be to deploy your R application code on an O/S that doesn't have this issue (for example, a 64-bit Linux server).

Another alternative (but one requiring code changes) would be to avoid using the date_as_char option and instead convert dates to ISO strings where required using an R package.

I upgrade to MongoDB 4.0, can keep the query as they are?

No. The problem is specific to Windows' C runtime library (where your R client code is running), not the MongoDB server version. My comment on default date formats in MongoDB 4.0 was referring to the$dateToString aggregation expression used in the workaround.

Regards, Stennie

LucasMelo3 commented 5 years ago

Thanks @stennie!

Changing the O / S would be nice, but as the application image builds on each update replica my computer's O / S development environment, and I develop on windows, I would need to create a VM with linux and start developing there what can provoke more issues...

I will try to avoid using the date_as_char option and debug whatever comes.