Support complex numbers

changhc commented 3 months ago

Change Summary

Implement a validator and a serialiser for complex numbers.

As discussed in https://github.com/pydantic/pydantic/issues/8555, since there is no official representation for complex numbers in JSON, I propose to express complex numbers as dictionaries with two keys, real and imag, with floating point values. Please find examples in the newly added test suites in this PR.
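For illustration, the proposed dictionary form could look roughly like the sketch below. It uses only the standard library; the helper names `complex_to_dict` and `complex_from_dict` are hypothetical and are not part of this PR, and rejecting unexpected keys is an assumption rather than behaviour confirmed here.

```python
import json


def complex_to_dict(value: complex) -> dict:
    """Encode a complex number as a dict with float 'real' and 'imag' values."""
    return {"real": value.real, "imag": value.imag}


def complex_from_dict(data: dict) -> complex:
    """Decode the dict form; missing or extra keys are rejected (an assumption)."""
    if set(data) != {"real", "imag"}:
        raise ValueError("expected exactly the keys 'real' and 'imag'")
    return complex(float(data["real"]), float(data["imag"]))


# Round trip through JSON text.
payload = json.dumps(complex_to_dict(1 + 2j))
print(payload)  # {"real": 1.0, "imag": 2.0}
print(complex_from_dict(json.loads(payload)))  # (1+2j)
```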

Note that the function str_as_complex in shared.rs is not yet implemented, as I need to discuss with the team whether we want to support it. For example, we might want to accept strings like 1+2j as valid complex numbers, as long as they follow the format described in Python's documentation. This implementation, however, will be a bit tricky. Using a regular expression is the simplest solution, but it will be somewhat slow, since building a regex is rather costly. Parsing strings with the num_complex crate would make the implementation very neat, but num_complex is more tolerant about string format, so we would have to reject the strings that Python does not allow ourselves, which could be problematic. The safest solution is probably to parse strings exactly the way CPython does. I think that can be done in a separate PR for easier review.
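To make the "format described in Python's documentation" concrete, here is a small probe of what the built-in `complex` constructor accepts and rejects. The helper name `python_accepts` is made up for this sketch; only the built-in `complex` is real.

```python
def python_accepts(text: str) -> bool:
    """Return True if Python's built-in complex() can parse the string."""
    try:
        complex(text)
    except ValueError:
        return False
    return True


# Accepted: surrounding whitespace, optional parentheses, exponents,
# an upper-case J, and special values such as inf.
accepted = ["1+2j", " 1+2j ", "(1+2j)", "1e3-2.5J", "infj"]

# Rejected: whitespace around the inner sign, an "i" suffix, a misplaced "j".
rejected = ["1 + 2j", "1+2i", "j2", ""]

for text in accepted + rejected:
    print(f"{text!r}: {python_accepts(text)}")
```

Any hand-written parser or regex would need to agree with the constructor on cases like these.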

Related issue number

https://github.com/pydantic/pydantic/issues/8555

Checklist

[x] Unit tests for the changes exist
[x] Documentation reflects the changes where applicable
[x] Pydantic tests pass with this pydantic-core (except for expected changes)
[ ] My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 82.12560% with 37 lines in your changes missing coverage. Please review.

Project coverage is 89.55%. Comparing base (ab503cb) to head (8919589). Report is 148 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| src/serializers/infer.rs | 0.00% | 15 Missing :warning: |
| src/serializers/type_serializers/complex.rs | 83.87% | 10 Missing :warning: |
| src/input/return_enums.rs | 38.46% | 8 Missing :warning: |
| src/input/input_string.rs | 80.00% | 1 Missing :warning: |
| src/serializers/ob_type.rs | 50.00% | 1 Missing :warning: |
| src/serializers/shared.rs | 0.00% | 1 Missing :warning: |
| src/validators/complex.rs | 97.67% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1331      +/-   ##
==========================================
- Coverage   90.21%   89.55%   -0.67%
==========================================
  Files         106      111       +5
  Lines       16339    17537    +1198
  Branches       36       41       +5
==========================================
+ Hits        14740    15705     +965
- Misses       1592     1812     +220
- Partials        7       20      +13
```

| [Files](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic) | Coverage Δ | |
|---|---|---|
| [python/pydantic\_core/core\_schema.py](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=python%2Fpydantic_core%2Fcore_schema.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-cHl0aG9uL3B5ZGFudGljX2NvcmUvY29yZV9zY2hlbWEucHk=) | `94.76% <100.00%> (-0.01%)` | :arrow_down: |
| [src/errors/types.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Ferrors%2Ftypes.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL2Vycm9ycy90eXBlcy5ycw==) | `99.44% <100.00%> (+<0.01%)` | :arrow_up: |
| [src/input/input\_abstract.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Finput%2Finput_abstract.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL2lucHV0L2lucHV0X2Fic3RyYWN0LnJz) | `42.85% <ø> (-27.39%)` | :arrow_down: |
| [src/input/input\_json.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Finput%2Finput_json.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL2lucHV0L2lucHV0X2pzb24ucnM=) | `90.50% <100.00%> (+1.57%)` | :arrow_up: |
| [src/input/input\_python.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Finput%2Finput_python.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL2lucHV0L2lucHV0X3B5dGhvbi5ycw==) | `97.40% <100.00%> (+0.19%)` | :arrow_up: |
| [src/validators/mod.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Fvalidators%2Fmod.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL3ZhbGlkYXRvcnMvbW9kLnJz) | `96.06% <100.00%> (+0.03%)` | :arrow_up: |
| [src/input/input\_string.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Finput%2Finput_string.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL2lucHV0L2lucHV0X3N0cmluZy5ycw==) | `47.80% <80.00%> (+9.56%)` | :arrow_up: |
| [src/serializers/ob\_type.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Fserializers%2Fob_type.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL3NlcmlhbGl6ZXJzL29iX3R5cGUucnM=) | `82.97% <50.00%> (+0.86%)` | :arrow_up: |
| [src/serializers/shared.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Fserializers%2Fshared.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL3NlcmlhbGl6ZXJzL3NoYXJlZC5ycw==) | `78.07% <0.00%> (-1.14%)` | :arrow_down: |
| [src/validators/complex.rs](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree&filepath=src%2Fvalidators%2Fcomplex.rs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic#diff-c3JjL3ZhbGlkYXRvcnMvY29tcGxleC5ycw==) | `97.67% <97.67%> (ø)` | |
| ... and [3 more](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic) | |

... and [34 files with indirect coverage changes](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic)

------

[Continue to review full report in Codecov by Sentry](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic).

> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?dropdown=coverage&src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic). Last update [6e96b85...8919589](https://app.codecov.io/gh/pydantic/pydantic-core/pull/1331?dropdown=coverage&src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pydantic).

codspeed-hq[bot] commented 3 months ago

CodSpeed Performance Report

Merging #1331 will not alter performance

Comparing changhc:implement-complex (8919589) with main (863640b)

Summary

✅ 155 untouched benchmarks

changhc commented 3 months ago

Thanks!

I've looked into the Rust regex crate. Its functionality is rather limited; in particular, it does not support lookahead, which means we would need to handle some corner cases ourselves or write some not-so-straightforward logic. Another crate, fancy_regex, has better regex support, but I'm hesitant to introduce another dependency on a non-native library unless it's really necessary or beneficial enough. With regex, my biggest concern is whether we can capture exactly what Python accepts. I'm worried about accidentally accepting strings that Python's complex class rejects, or rejecting strings that it accepts, which might cause unexpected problems for users.

If the team is okay with regex (and maybe you would rather not have this cumbersome dictionary representation at all), I can definitely look into that.

davidhewitt commented 3 months ago

What if we avoided using either a regex or num_complex entirely, and we used the Python logic to build complex instances? This is basically what we do already for Decimal, and it seems like a reasonable design to repeat again IMO.
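As a rough Python-level analogy of this idea (how pydantic-core actually calls into Python from Rust is not shown in this thread), delegating to the constructor means the accepted string format is exactly Python's by construction. The function names below are hypothetical.

```python
from decimal import Decimal, InvalidOperation


def validate_decimal_str(text: str) -> Decimal:
    """Let Python's Decimal decide what counts as a valid decimal string."""
    try:
        return Decimal(text)
    except InvalidOperation as exc:
        raise ValueError(f"invalid decimal string: {text!r}") from exc


def validate_complex_str(text: str) -> complex:
    """Same pattern for complex: no regex, the constructor is the parser."""
    try:
        return complex(text)
    except ValueError as exc:
        raise ValueError(f"invalid complex string: {text!r}") from exc


print(validate_decimal_str("1.50"))  # 1.50
print(validate_complex_str("1+2j"))  # (1+2j)
```

The trade-off is a call into the interpreter for every string, in exchange for never drifting from what `complex` itself accepts.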

changhc commented 3 months ago

That sounds good to me. Then I'll work on that. On the other hand, what do you think about the dict representation I proposed and implemented in this PR? Should we still keep it or should we only accept strings like 1+2j?

sydney-runkle commented 3 months ago

> That sounds good to me. Then I'll work on that. On the other hand, what do you think about the dict representation I proposed and implemented in this PR? Should we still keep it or should we only accept strings like 1+2j?

We'll chat with the team in our open source sync tomorrow and get back to you :).

sydney-runkle commented 3 months ago

> That sounds good to me. Then I'll work on that. On the other hand, what do you think about the dict representation I proposed and implemented in this PR? Should we still keep it or should we only accept strings like 1+2j?

Let's go with the string representation for now, and if the feature lands with some popularity, we can add support for the dictionary style. For now, if folks really need that dict type serialization for complex numbers, they can use a custom serializer.
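A standard-library analogue of that opt-in, for illustration only: `json.dumps` takes a `default` hook that plays the role of a custom serializer. The hook names are hypothetical, and using `str()` for the string form is an assumption; the exact string pydantic emits is not stated in this thread.

```python
import json


def complex_as_str(obj):
    """Default style in this sketch: a string, here Python's own str() form."""
    if isinstance(obj, complex):
        return str(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def complex_as_dict(obj):
    """Opt-in custom style: the dict form with 'real' and 'imag' keys."""
    if isinstance(obj, complex):
        return {"real": obj.real, "imag": obj.imag}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


data = {"impedance": 3 + 4j}
print(json.dumps(data, default=complex_as_str))   # {"impedance": "(3+4j)"}
print(json.dumps(data, default=complex_as_dict))  # {"impedance": {"real": 3.0, "imag": 4.0}}
```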

changhc commented 3 months ago

I've updated the implementation. One test case is failing on PyPy, and I think that's because of some bugs in PyPy. How do you usually handle discrepancies between PyPy and CPython? Can we ignore this case for now and simply add a note in the documentation?

Another question regarding test-pydantic-integration. How do I handle this mutual dependency between pydantic and pydantic-core so that this test can pass?

davidhewitt commented 3 months ago

> Can we ignore this case for now and simply add a note in the documentation?

Yes, we can xfail this case when running on PyPy. If you're willing to also report the case on the PyPy GitHub, that would help make them aware of what needs fixing :)
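A minimal sketch of marking a test as an expected failure on PyPy only. "xfail" here refers to pytest's marker; to keep the sketch runnable without third-party packages it swaps in the standard library's `unittest.expectedFailure` instead. The decorator name `xfail_on_pypy` is hypothetical, and the test body is a placeholder, since the actual failing case is not identified in this thread.

```python
import platform
import unittest

# True only when the interpreter running the tests is PyPy.
IS_PYPY = platform.python_implementation() == "PyPy"


def xfail_on_pypy(test_func):
    """Expect failure on PyPy; leave the test untouched on other interpreters."""
    return unittest.expectedFailure(test_func) if IS_PYPY else test_func


class ComplexStringTests(unittest.TestCase):
    @xfail_on_pypy
    def test_placeholder_case(self):
        # Placeholder assertion standing in for the real PyPy-sensitive case.
        self.assertEqual(complex("1+2j"), complex(1, 2))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ComplexStringTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```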

> Another question regarding test-pydantic-integration. How do I handle this mutual dependency between pydantic and pydantic-core so that this test can pass?

In this case it won't pass until we update pydantic main for these changes, so once we merge this PR we should make a minor release and integrate them into the main repository.

changhc commented 2 months ago

Hi @davidhewitt, how should we proceed with this? Do we wait for input from the team or should we manage to make a decision here ourselves?

sydney-runkle commented 2 months ago

Hey @changhc,

Sorry we've dropped the ball on feedback here. Will chat with the team early next week so that we can move this forward and include it in our v2.9 release 🚀 !

changhc commented 2 months ago

Thanks @sydney-runkle! I'm making some changes to address David's comments. They will be ready soon!

changhc commented 1 month ago

Hi @davidhewitt, I've implemented the strict mode for complex numbers. As we discussed, when strict mode is on,

- only complex objects are accepted for Python input.
- only complex strings are accepted for JSON input.

Because of the strict mode, I tweaked the validation error messages a little to make them less confusing to users. Specifically, for Python input:

- when strict mode is on, tell users that only Python complex objects are accepted.
- when strict mode is off and the input value is an invalid complex string, give the generic message explaining all acceptable input values instead of telling users to correct the string, since at this point we cannot be sure whether they actually meant to pass a complex string or just gave a value of an incorrect type.

For other input types, the error is rather simple as complex strings are the only acceptable input values when strict mode is on.
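The rules described above can be sketched in plain Python as follows. This is not the pydantic-core implementation: the function names and message texts are made up for illustration, and the set of values accepted in lax mode (int, float, complex strings) is an assumption, since the thread does not list it.

```python
# Hypothetical message texts; the real wording is not quoted in this thread.
STRICT_PY_MSG = "Input should be an instance of complex"
GENERIC_MSG = "Input should be a complex object, a number, or a valid complex string"
STR_MSG = "Input should be a valid complex string"


def validate_python(value, *, strict: bool = False) -> complex:
    """Python input: strict mode accepts complex objects only."""
    if isinstance(value, complex):
        return value
    if strict:
        raise ValueError(STRICT_PY_MSG)
    if isinstance(value, (int, float)):  # assumed lax-mode behaviour
        return complex(value)
    if isinstance(value, str):
        try:
            return complex(value)
        except ValueError:
            # Generic message: we cannot tell whether a complex string was intended.
            raise ValueError(GENERIC_MSG) from None
    raise ValueError(GENERIC_MSG)


def validate_json_strict(value) -> complex:
    """Strict JSON input: complex strings are the only accepted values."""
    if isinstance(value, str):
        try:
            return complex(value)
        except ValueError:
            pass
    raise ValueError(STR_MSG)


print(validate_python("1+2j"))       # (1+2j)
print(validate_json_strict("1+2j"))  # (1+2j)
```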

I also added some test cases for dictionaries involving complex keys just to make sure things work as expected.

sydney-runkle commented 1 month ago

@davidhewitt, are we ready to move this across the line?

davidhewitt commented 1 month ago

Yep, LGTM! Thanks @changhc

changhc commented 1 month ago

Thanks! I'll update https://github.com/pydantic/pydantic/pull/9654 once this PR is merged and the next minor release includes these changes.

sydney-runkle commented 1 month ago

Great! I'll get this merged tomorrow 🚀

sydney-runkle commented 1 month ago

Great work @changhc, going to go ahead and merge this - the failing integration tests should soon be fixed with your branch.

Do we need to make any more updates to https://github.com/pydantic/pydantic/pull/9654 other than supporting a new version of pydantic-core with this change?

pydantic / pydantic-core