PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs
Apache License 2.0
12.46k stars 769 forks source link

`IntoPy` does not properly respect `chrono_tz::Tz` timezones #3266

Open grantslatton opened 1 year ago

grantslatton commented 1 year ago

The IntoPy implementation for chrono::DateTime is implemented generically for any T: TimeZone by converting everything to fixed offsets. This means it loses the Tz information in a timezone like America/Los Angeles and just converts it to UTC-8 which is a lossy conversion (due to daylight savings, etc).

PyO3 should probably provide two different implements, one for DateTime<FixedOffset> and another for DateTime<chrono_tz::Tz> that creates a python ZoneInfo object.

This is technically a bug since this should be a lossless conversion but I have filed it under feature request since it doesn't match the bug template very well.

adamreichold commented 1 year ago

@Psykopear @pickfire Do you have the bandwidth to look into this?

grantslatton commented 1 year ago

FWIW I wound up implementing this in my own code like:

pub fn get_canonical_tz_py_obj_kwargs(tz: Tz) -> Py<PyDict> {
    lazy_static! {
        static ref TIMEZONES: HashMap<Tz, Py<PyDict>> = {
            let code = r#"
import zoneinfo
all_zones = zoneinfo.available_timezones()
zone_dict = {}
for zone_name in all_zones:
    zone = zoneinfo.ZoneInfo(zone_name)
    zone_dict[zone_name] = zone
            "#.trim();
            Python::with_gil(|py| {
                let globals = PyDict::new(py);
                py.run(code, Some(globals), None).unwrap();
                let zone_dict = globals.get_item("zone_dict").unwrap().extract::<HashMap<String, Py<PyAny>>>().unwrap();
                zone_dict
                    .into_iter()
                    .map(|(mut name, zone)| {
                        // Weird idiosyncratic "factory reset" type zone that python has but rust does not
                        if name == "Factory" {
                            name = "UTC".to_string();
                        }
                        let kwargs = [("tzinfo", zone)].into_py_dict(py);
                        (Tz::from_str(&name).unwrap(), kwargs.into_py(py))
                    })
                    .collect()
            })
        };
    }

    TIMEZONES.get(&tz).unwrap().clone()
}

and then to convert the DateTime<Tz> to PyAny:

let kwargs = get_canonical_tz_py_obj_kwargs(dt.timezone());
let datetime = dt.naive_local();
let datetime = datetime.into_py(py);
let result = datetime.call_method(py, "replace", (), Some(kwargs.as_ref(py))).unwrap();

You can probably think of something more pythonic or efficient here, but if not, feel free to steal from this

elied commented 1 week ago

Hello! I have recently started using Rust and PyO3 and have come across this issue. I'm simply using a custom conversion function in my code to deal with it for now, but I'd love to see it addressed in the official repo.

Does someone have a general outline of what the solution might look like here? If not, I've shared my thoughts below.

I'm still learning Rust so I might be off-base here, but based on my understanding the core issue here was that PyO3 didn't want to assume which timezone implementation was being used in Python (makes sense).

FromPyObj doesn't have that issue because it can use the standard PyTzInfo abstract class to get the time zone info regardless of which implementation was used, but ToPyObj needs to know which concrete implementation to use.

However, the chrono_tz feature requires you to have Python >=3.9 and assumes you want to use ZoneInfo as your time zone implementation.

So it seems like the simplest (though maybe not cleanest?) path forward here would be to update the conversions for DateTime::ToPyObject in chrono.rs based on whether the chrono_tz feature is enabled. If it's disabled, we keep using the current approach with offsets (because what else can you do?). But if it's enabled, we can assume that it's safe to use ZoneInfo and convert accordingly.

What do others think of this plan? I'd think that anyone using the chrono_tz feature would welcome a change like this, and it should be unchanged for everyone else.

davidhewitt commented 1 week ago

Help would be welcome here. I think you're in the right direction, though it might be simpler than you fear.

I think this implementation

https://github.com/PyO3/pyo3/blob/06d2affe0897c893498a262242b77900dc366cd6/src/conversions/chrono.rs#L439

needs to be changed so that instead of converting the timezone to fixed offset, it should instead require the timezone type implement IntoPyObject

impl<'a, 'py, Tz: TimeZone> IntoPyObject<'py> for &'a DateTime<Tz>
  where &'a Tz: IntoPyObject<'py> { 

... and then the Tz type can choose what it becomes, whether that's utc, zoneinfo or some other thing.

elied commented 1 week ago

Ah, I wasn't sure about an approach like that since it seemed like more of a breaking change since we'd be ditching the default implementation with offsets entirely (unless I'm misunderstanding).

I agree that would be a cleaner implementation though! If you are okay with that direction, I can work on a PR in the next couple of days.