Closed kevinjqliu closed 1 week ago
Hi @kevinjqliu thanks for bouncing on this! And nice catch on the commit_table
functions... this looks good to me. It looks like python3.12 CI is failing due to a DeprecationWarning on utcfromtimestamp
. Could we rebase this branch and run the CI again?
Related fix: https://github.com/apache/iceberg-python/pull/1134/files
super weird that this is only happening to 3.12
self = TimestampNTZType(), ts = 1672601100000000
def fromInternal(self, ts: int) -> datetime.datetime:
if ts is not None:
# using int to avoid precision loss in float
> return datetime.datetime.utcfromtimestamp(ts // 1000000).replace(
microsecond=ts % 1000000
)
E DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated and scheduled for removal in a future version. Use timezone-aware objects to represent datetimes in UTC: datetime.datetime.fromtimestamp(timestamp, datetime.UTC).
which corresponds to pyspark's use of datetime.datetime.utcfromtimestamp
, which was recently changed https://github.com/apache/spark/commit/8e02a6493ef5dc5949e161179a7c081c5ca58ff2#diff-bb02965370d515ec7add7967bc949c18c88ffc7346cceaebefb84907dfdce903L437
We remove the warning in #1134, but the tests passed then
datetime.datetime.utcfromtimestamp
is deprecated starting from 3.12
https://docs.python.org/3.12/library/datetime.html#datetime.datetime.utcfromtimestamp
@kevinjqliu Yes a newer version of Python will emit a deprecation warning, I'm okay with putting back the exclusion to fix that in another PR
opened #1349 and added back the warning filter, we can remove once we upgrade PySpark to 4.0
Closes #1336, related to #1327 This PR changes the implementation of the
Table.name
function to useself._identifier
instead ofself.identifier
to avoid having unnecessary deprecation warnings. This PR also replaces all references ofTable.identifier
withTable.name()
to prepare for the deprecation ofTable.identifier
and avoid emitting warnings in tests. Using VSCode, I confirmed all use ofTable.identifier
is replaced in the repo.