UNIX time conversion depends on the time zone in which timestamps are interpreted. For example, PySpark's F.unix_timestamp can produce different results for users in different time zones, because Spark's session time zone defaults to the system's local time zone. To get reproducible results, add spark.conf.set("spark.sql.session.timeZone", "UTC") at the beginning of the data processing script. The same behavior might occur in Pandas data preprocessing, but I didn't check it.
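To make the effect concrete, here is a minimal sketch in plain Python (standard library only, not PySpark) of why a timestamp string without zone information maps to different UNIX times depending on the zone assumed. The helper name to_unix_seconds, the sample timestamp, and the UTC+9 offset are illustrative assumptions, not part of any preprocessing script discussed here.

```python
from datetime import datetime, timedelta, timezone


def to_unix_seconds(wall_clock: str, tz: timezone) -> int:
    """Interpret a zone-less 'YYYY-MM-DD HH:MM:SS' string in `tz`
    and return the corresponding UNIX time in seconds.

    Hypothetical helper, for illustration only.
    """
    naive = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M:%S")
    # Attach the assumed zone without shifting the wall-clock fields.
    return int(naive.replace(tzinfo=tz).timestamp())


STAMP = "2021-01-01 00:00:00"               # illustrative timestamp
UTC = timezone.utc
UTC_PLUS_9 = timezone(timedelta(hours=9))   # illustrative fixed offset

utc_seconds = to_unix_seconds(STAMP, UTC)
plus9_seconds = to_unix_seconds(STAMP, UTC_PLUS_9)

# The same string yields UNIX times that differ by exactly the zone offset.
print(utc_seconds, plus9_seconds, utc_seconds - plus9_seconds)
```

Two users running the same conversion with different assumed zones get values that differ by the offset between those zones. Pinning the zone explicitly, as the spark.conf.set call above does for a Spark session, removes that dependence.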