ynqa / pandavro

Apache Avro <-> pandas DataFrame
MIT License
134 stars 32 forks

fix read_avro kwargs + cleanup tests #41

Closed marctorsoc closed 1 year ago

marctorsoc commented 2 years ago

This PR solves two issues:

  1. In a previous PR, the columns kwarg of from_records was promoted to read_avro to save RAM. As a side effect, all rows were read and nrows was ignored. The same happened with exclude, the kwarg for excluding columns. This PR fixes both.
  2. Tests were flaky and did not pass for me locally. The cause was an unspecified timezone: Avro dumps timestamps as UTC, but on reading, the timezone is inferred from whatever is available. I suspect the tests only passed because they run on machines set to UTC; with this change I hope they pass in any timezone.
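To make the two points above concrete, here is a minimal standard-library sketch. It is not the code in this PR or in pandavro: `select_records` is a hypothetical helper, and the records are plain dicts standing in for whatever the Avro reader yields. The first part shows row limiting and column filtering applied together while iterating, so neither option overrides the other. The second part shows why a timezone-aware datetime survives a round trip through epoch milliseconds unchanged, assuming timestamps are stored as milliseconds since the Unix epoch in UTC.

```python
from datetime import datetime, timezone
from itertools import islice


def select_records(records, columns=None, exclude=None, nrows=None):
    """Hypothetical helper: lazily limit rows and filter columns.

    `records` is any iterable of dicts. Only `nrows` records are consumed,
    and only the requested keys are kept, so memory stays bounded.
    """
    exclude = set(exclude or ())
    # islice with stop=None imposes no limit; otherwise it stops after nrows.
    for record in islice(records, nrows):
        keys = columns if columns is not None else record.keys()
        yield {key: record[key] for key in keys if key not in exclude}


# A large lazy source: only the first three records are ever produced.
rows = ({"id": i, "name": f"row{i}", "score": i * 1.5} for i in range(1_000_000))
subset = list(select_records(rows, columns=["id", "score"], nrows=3))

# An explicit UTC timezone makes the epoch conversion independent of the
# machine's local timezone, so the round trip is deterministic.
aware = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
epoch_ms = int(aware.timestamp() * 1000)
restored = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
```

With a naive datetime (no `tzinfo`), `timestamp()` interprets the value in the machine's local timezone, which is the kind of environment dependence that would make a test pass on a UTC machine and fail elsewhere.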

Let me know your thoughts

marctorsoc commented 2 years ago

@ynqa and @ruben-trdj for exposure

deleted commented 1 year ago

FWIW, I can confirm from local testing (Ubuntu 20.04) that:

ynqa commented 1 year ago

At one point I considered using https://github.com/spulec/freezegun for mocking, but this is enough for now. Thanks.