Open Abacn opened 1 year ago
Note: as a workaround for columns of Date (or Time) type, you can register your own logical type:
import datetime

from apache_beam.typehints.schemas import LogicalType, MillisInstant
from apache_beam.utils.timestamp import Timestamp


@LogicalType.register_logical_type
class DateType(LogicalType[datetime.date, MillisInstant, str]):
    def __init__(self, unused=""):
        pass

    @classmethod
    def representation_type(cls):
        # type: () -> type
        return Timestamp

    @classmethod
    def urn(cls):
        return "beam:logical_type:javasdk:v1"

    @classmethod
    def language_type(cls):
        return datetime.date

    def to_representation_type(self, value):
        # type: (datetime.date) -> Timestamp
        # Encode the date as midnight UTC of that day.
        return Timestamp.from_utc_datetime(
            datetime.datetime.combine(
                value, datetime.datetime.min.time(), tzinfo=datetime.timezone.utc))

    def to_language_type(self, value):
        # type: (Timestamp) -> datetime.date
        # Decode by taking the UTC calendar date of the timestamp.
        return value.to_utc_datetime().date()

    @classmethod
    def argument_type(cls):
        return str

    def argument(self):
        return ""

    @classmethod
    def _from_typing(cls, typ):
        return cls()
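The conversion at the core of the workaround above can be sanity-checked without Beam. The sketch below is only an illustration: it assumes a plain integer of epoch milliseconds as a stand-in for Beam's Timestamp, and the helper names date_to_millis and millis_to_date are hypothetical, not part of any Beam API.

```python
import datetime

UTC = datetime.timezone.utc
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=UTC)


def date_to_millis(value: datetime.date) -> int:
    """Encode a date as epoch milliseconds at midnight UTC (hypothetical helper)."""
    midnight = datetime.datetime.combine(value, datetime.time.min, tzinfo=UTC)
    # Dividing two timedeltas with // yields an int.
    return (midnight - EPOCH) // datetime.timedelta(milliseconds=1)


def millis_to_date(millis: int) -> datetime.date:
    """Decode epoch milliseconds back to the UTC calendar date (hypothetical helper)."""
    return (EPOCH + datetime.timedelta(milliseconds=millis)).date()


sample = datetime.date(2023, 5, 17)
encoded = date_to_millis(sample)
decoded = millis_to_date(encoded)
print(encoded, decoded)
```

Because both directions pin the value to UTC, the round trip returns the original date regardless of the machine's local timezone.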
I recall why the Date type support was incomplete. The Java JdbcIO implemented Date and Time with non-portable logical types backed by joda Instant, while the Beam portable Date and Time logical types are backed by the more modern java.time.LocalDate and LocalTime. We cannot simply change the Java JdbcIO because that would be a breaking change. This involves two difficult compatibility issues.
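To illustrate why an instant-backed date and a local date are not interchangeable, here is a small Python sketch. It is an analogy only, not the Java code in JdbcIO, and the two fixed-offset zones are arbitrary choices for the example.

```python
import datetime

# One fixed instant: 2023-05-17 23:30 UTC.
instant = datetime.datetime(2023, 5, 17, 23, 30, tzinfo=datetime.timezone.utc)

# Two fixed-offset zones, chosen only for illustration.
utc_plus_2 = datetime.timezone(datetime.timedelta(hours=2))
utc_minus_8 = datetime.timezone(datetime.timedelta(hours=-8))

# The same instant falls on different calendar dates depending on the zone.
date_east = instant.astimezone(utc_plus_2).date()
date_west = instant.astimezone(utc_minus_8).date()

# A local date carries no instant or zone, so it has no such ambiguity.
local_date = datetime.date(2023, 5, 17)

print(date_east, date_west, local_date)
```

A type backed by an instant needs a timezone convention before it can be read as a calendar date, whereas a local date does not, which is part of why swapping one backing for the other changes behavior.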
One fix for the transition would be to add a flag to the IOs that tells them to produce java.time results and use portable logical types. This would enable the Python implementation (and also eliminate the need to call LogicalType.register_logical_type(MillisInstant) when the logical type mapping needs to be overwritten).
The link given for "migrating from joda time to java.time" points to a PR that is behind Google's corp firewall.
It would help if the issues were exposed to the open-source community, since this is an open-source issue.
Thanks, just updated the comment.
What needs to happen?
Currently there are still a couple of non-portable logical types defined in https://github.com/apache/beam/blob/926774dd02be5eacbe899ee5eceab23afb30abca/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/LogicalTypes.java
This prevents cross-language JdbcIO from reading or writing rows with these types. The most commonly used are the Date and Time types. We should migrate them to portable logical types and also support them on the Python side.
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components