miraisolutions / spark-bigquery

Google BigQuery data source for Apache Spark
MIT License
17 stars 6 forks source link

Error with records that have a date of '01-01-0001' #12

Open david-walker-gfs opened 4 years ago

david-walker-gfs commented 4 years ago

ERROR:

com.miraisolutions.spark.bigquery.exception.IOException: BigQuery job JobId{project=gcp-gfs-datalake-process-tst, job=02f24e90-932f-4fec-8dcb-f86008e6a26a, location=US} failed with message: Error while reading data, error message: Invalid date value '-719164' for field 'vondt' of type 'DATE'

-719164 equates to a Julian date of 01-01-0001. Lots of SAP HANA tables use this date as a default date.

According to the BQ docs, 01-01-0001 is a valid date so I wouldn't have expected this to be an issue. Range for Date datatype: 0001-01-01 to 9999-12-31.

Ref: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#date-type

spoltier commented 4 years ago

Performing the following query in the google cloud BigQuery web interface

SELECT DATE_FROM_UNIX_DATE(-719164)

results in a similar error: DATE value is out of allowed range: from 0001-01-01 to 9999-12-31

SELECT DATE_FROM_UNIX_DATE(-719162)

Results in 0001-01-01.

It looks like an off-by-n (in this case 2) issue when converting dates. Possibly we need move to a newer bigquery version.

@david-walker-gfs Any idea of some factor in your use case that could result in this type of issue ?

Edit to add: This is likely to be related to Excel or Excel compatibility; the difference is -719164 when using Excel compatible tools, but other environments (R, javascript console in Firefox, jvm (via the Scala REPL)) return -719162.

david-walker-gfs commented 4 years ago

This library is being used by a product called Attunity (Qlik) Compose. I have a support case open with them about this. there's a goo possibility that they are not doing something correctly. I do know the date is definitely '0001-01-01' on the source. I'll follow up with them and see if they can provide any further info.

Thanks for the information!