voxel51 / fiftyone

The open-source tool for building high-quality datasets and computer vision models
https://fiftyone.ai
Apache License 2.0

Fix DateField for GMT+ users #4371

Closed swheaton closed 2 months ago

swheaton commented 2 months ago

What changes are proposed in this pull request?

Closes #4365

When the DB is not made timezone-aware, the datetimes it returns are not UTC by default but timezone-agnostic (naive). Calling astimezone(pytz.utc) on a naive datetime assumes it is in the local timezone in order to do the conversion. When the local timezone is GMT-2, the stored date (2023, 9, 8, 0, 0) becomes (2023, 9, 8, 2, 0), which falls on the same date, so you don't see a difference. When the local timezone is GMT+2, as for the reporting user, (2023, 9, 8, 0, 0) becomes (2023, 9, 7, 22, 0), which falls on a different date (2023/09/07) than expected!
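The shift described above can be reproduced without touching the system clock. This is a minimal sketch, assuming we emulate what astimezone() does implicitly for a naive datetime by attaching a fixed-offset "local" timezone first. The helper name `to_utc_assuming_local` is hypothetical, and the standard library's `timezone.utc` stands in for `pytz.utc`:

```python
from datetime import datetime, timedelta, timezone


def to_utc_assuming_local(naive, local_offset_hours):
    """Emulate naive.astimezone(utc) on a machine whose local zone is
    GMT+local_offset_hours (hypothetical helper, for illustration only)."""
    local = timezone(timedelta(hours=local_offset_hours))
    # astimezone() on a naive datetime first interprets it as local time...
    aware_local = naive.replace(tzinfo=local)
    # ...and only then converts it to the target zone.
    return aware_local.astimezone(timezone.utc)


stored = datetime(2023, 9, 8, 0, 0)  # naive value read back from the DB

west = to_utc_assuming_local(stored, -2)  # GMT-2: 2023-09-08 02:00 UTC
east = to_utc_assuming_local(stored, +2)  # GMT+2: 2023-09-07 22:00 UTC

print(west.date())  # 2023-09-08, same calendar date as stored
print(east.date())  # 2023-09-07, one day early
```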

  1. Use db.with_options() to apply timezone awareness of "UTC" if that is specified in fo.config. A datetime with timezone UTC is not the same as a datetime with no timezone.
    • I would love to change the default behavior as well, but that would be a potential behavior change, whereas I believe the above is strictly a bug fix.
  2. Remove the conversion to UTC in DateField, which does the wrong thing when the datetime is not timezone-aware. I don't know why we wanted to convert to UTC in the first place. See inline PR comments.
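
To illustrate the two points above: a naive datetime and a UTC-aware datetime with the same fields are distinct values, and taking the date directly avoids any dependence on the local timezone. This is a sketch using only the standard library (`timezone.utc` in place of `pytz.utc`), not FiftyOne's actual DateField code; the function name `datetime_to_date` is hypothetical.

```python
from datetime import datetime, timezone

naive = datetime(2023, 9, 8, 0, 0)
aware = datetime(2023, 9, 8, 0, 0, tzinfo=timezone.utc)

# Same fields, but not the same value: naive and aware never compare equal
print(naive == aware)  # False

# Ordering comparisons between them are rejected outright
try:
    naive < aware
    comparable = True
except TypeError:
    comparable = False
print(comparable)  # False


def datetime_to_date(value):
    """Take the calendar date as stored, with no timezone conversion
    (hypothetical stand-in for the simplified conversion)."""
    return value.date()


print(datetime_to_date(naive))  # 2023-09-08
print(datetime_to_date(aware))  # 2023-09-08
```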

How is this patch tested? If it is not, please explain why.

I am not sure how to test this because you have to fake the local timezone. There's probably a way to do that.
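
One possible way to fake the local timezone on Unix-like systems is to set the TZ environment variable and call time.tzset() for the duration of a test. This is a minimal sketch, assuming a POSIX platform (time.tzset() is not available on Windows) and an installed tz database for names like "Europe/Madrid". The names `local_timezone` and `naive_to_utc_date` are hypothetical:

```python
import contextlib
import os
import time
from datetime import datetime, timezone


@contextlib.contextmanager
def local_timezone(tz_name):
    """Temporarily change the process-wide local timezone (Unix only)."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_name
    time.tzset()  # make the C library re-read TZ
    try:
        yield
    finally:
        # Restore the original setting even if the test body raises
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


def naive_to_utc_date(naive):
    """The problematic pattern: astimezone() treats a naive datetime
    as local time before converting to UTC."""
    return naive.astimezone(timezone.utc).date()


if __name__ == "__main__":
    stored = datetime(2023, 9, 8, 0, 0)
    with local_timezone("Europe/Madrid"):  # GMT+2 in September
        print(naive_to_utc_date(stored))
```

Because the change is process-wide, a test like this should not run concurrently with other timezone-sensitive code in the same process.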

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

Fixed :class:`fiftyone.core.fields.DateField` for users in GMT+ timezones when `fo.config.timezone` is unset.

What areas of FiftyOne does this PR affect?


coderabbitai[bot] commented 2 months ago

Walkthrough

The recent updates focus on refining the handling of date and timezone data within the FiftyOne framework. Changes involve simplifying datetime to date conversion and streamlining timezone checks in the database logic.

Changes

| File Path | Change Summary |
|---|---|
| .../core/fields.py | Simplified datetime to date conversion by removing UTC conversion in to_python method. |
| .../core/odm/database.py | Modified timezone handling by removing specific checks for "utc" and ensuring logic handles unset zones. |
| .../unittests/dataset_tests.py | Added import for time module and changed system time zone temporarily to "Europe/Madrid" for testing. |

Assessment against linked issues

| Objective | Addressed | Explanation |
|---|---|---|
| Correct DateField value from MongoDB (#4365) | | The changes do not directly address the issue with DateField values being incorrectly loaded, which might be related to timezone handling not specifically revised for MongoDB data retrieval. |

Poem

In the land of code where the data hops around,
A rabbit tweaked the time, so no bug is found.
Simplify and prune, make the codebase light,
Each line refined, each function right.
🌟🐇 Happy coding, let the dates align! 🌟

swheaton commented 2 months ago

> LGTM. I have no explanation/defense for the comment in DateField.to_python(). Apparently I was pretty sure of myself at the time but it doesn't make sense to me now, either 🤷
>
> The key thing is that dates always need to be stored in the DB as UTC

Hmm, I see. Well, I think what is intended then is replace()? You want to just assume the date is in UTC in the DB, ignoring whatever timezone options were applied to the DB. I think this jibes with the original comment and the (now-failing) unit test. From the datetime docs:

> If you merely want to attach a time zone object tz to a datetime dt without adjustment of date and time data, use dt.replace(tzinfo=tz).
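
The behavior of replace() can be sketched as follows, using the standard library's `timezone.utc` in place of `pytz.utc`. It only attaches the timezone, so the stored fields, and therefore the date, are preserved regardless of the machine's local timezone:

```python
from datetime import datetime, timedelta, timezone

stored = datetime(2023, 9, 8, 0, 0)  # naive datetime as read from the DB

# Attach UTC without adjusting the date and time fields
as_utc = stored.replace(tzinfo=timezone.utc)
print(as_utc)         # 2023-09-08 00:00:00+00:00
print(as_utc.date())  # 2023-09-08

# Once the value is aware, converting it for display is well-defined
gmt_plus_2 = timezone(timedelta(hours=2))  # fixed offset, for illustration
print(as_utc.astimezone(gmt_plus_2))  # 2023-09-08 02:00:00+02:00
```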

swheaton commented 2 months ago

@brimoor I think I fixed it for real this time and added a test case that exposes the bug by mocking the "local" timezone.