litestar-org / polyfactory

Simple and powerful factories for mock data generation
https://polyfactory.litestar.dev/
MIT License

Enhancement: Sequence field #593

Open Pentusha opened 1 month ago

Pentusha commented 1 month ago

Summary

Hello, I would like to follow up on my comment from an older, closed issue. I miss the ability to generate guaranteed-unique values. I used to use the FactoryBoy library for factories, and it has a Sequence field for this purpose: you specify a function that receives the next integer value on each call. I would like to have a similar field in this library and to discuss possible implementations.

I would like to hear feedback on the proposed solutions and can take on the implementation.
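
For readers unfamiliar with the idea, the behaviour described above can be sketched in plain Python. This only illustrates the semantics (a callable that receives the next integer on each call); `make_sequence` is a hypothetical helper, not part of FactoryBoy or polyfactory, and the starting value of 0 is an assumption.

```python
from itertools import count
from typing import Callable, TypeVar

T = TypeVar("T")


def make_sequence(func: Callable[[int], T], start: int = 0) -> Callable[[], T]:
    """Return a zero-argument callable that feeds successive integers to func."""
    counter = count(start)  # yields start, start + 1, start + 2, ...

    def next_value() -> T:
        # Each call consumes one integer, so no two calls share the same n.
        return func(next(counter))

    return next_value


next_email = make_sequence(lambda n: f"user{n}@domain.tld")
emails = [next_email() for _ in range(3)]
print(emails)
```

Because every call consumes a fresh integer, the generated values are unique as long as the function maps distinct integers to distinct outputs.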

Basic Example

The possibilities I see, given this model:

@dataclass
class Person:
    email: str

  1. Similar to __random__:

    class PersonFactory(DataclassFactory[Person]):
        @classmethod
        def email(cls) -> str:
            return cls.__sequence__(lambda n: f'user{n}@domain.tld')

  2. Similar to __random__ and Use:

    class PersonFactory(DataclassFactory[Person]):
        email = Use(DataclassFactory.__sequence__, lambda n: f'user{n}@domain.tld')

  3. Similar to FactoryBoy:

    class PersonFactory(DataclassFactory[Person]):
        email = Sequence(lambda n: f'user{n}@domain.tld')

Drawbacks and Impact

It will be possible to generate guaranteed unique or sequential values.

Unresolved questions

I would like to hear feedback on the proposed solutions and can take on the implementation. Perhaps the maintainers can suggest why this might be a bad idea or suggest the best way this can be built into the current codebase. It might be worth discussing how the sequence should be reset.


adhtruong commented 4 weeks ago

My initial thought is that this can be supported without changing the core factories by inheriting from Use and keeping a count there. Something like:

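from typing import Any

from polyfactory import Use


class Sequence(Use):
    def __init__(self) -> None:
        self.count = 0  # per-instance counter
        super().__init__(self.next)

    def next(self) -> Any:
        # Use invokes this callable each time a value is requested.
        self.count += 1
        return self.count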

This would create a separate count for each instance of Sequence. There is no explicit way to reset it here, but that could be added, as could a callable to transform the output.

> It might be worth discussing how the sequence should be reset.

Would globally unique or unique per batch be better aligned with your use case? If there is a concept of scopes, it may be a more involved change, since the Factory itself would have to hold state.
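
The difference between the two scopes can be sketched as follows. This is a minimal illustration under stated assumptions: the `Use` class below is a simplified stand-in so the snippet runs without polyfactory installed, and the `reset()` method is hypothetical, not an existing library API.

```python
from typing import Any, Callable


class Use:
    """Simplified stand-in for polyfactory's Use (assumption, not the real class)."""

    def __init__(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> None:
        self.fn = fn
        self.args = args
        self.kwargs = kwargs

    def to_value(self) -> Any:
        return self.fn(*self.args, **self.kwargs)


class Sequence(Use):
    """Counter-backed field with a hypothetical explicit reset."""

    def __init__(self, func: Callable[[int], Any] = lambda n: n) -> None:
        self.count = 0
        self.func = func
        super().__init__(self.next)

    def next(self) -> Any:
        self.count += 1
        return self.func(self.count)

    def reset(self) -> None:
        # Restart numbering; the next value is built from 1 again.
        self.count = 0


seq = Sequence(lambda n: f"user{n}@domain.tld")

# Globally unique: numbering simply continues from one batch to the next.
first_batch = [seq.to_value() for _ in range(2)]
second_batch = [seq.to_value() for _ in range(2)]

# Unique per batch: the caller resets before starting a new batch.
seq.reset()
third_batch = [seq.to_value() for _ in range(2)]
```

With the state kept on the field instance, per-batch uniqueness depends on the caller remembering to reset. Having the factory manage scopes would remove that burden, at the cost of the more involved change mentioned above.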

Pentusha commented 4 weeks ago

This one works like a charm for me. Thank you. It would be very nice to see it in the library alongside the current fields.

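from collections.abc import Callable

from polyfactory import Use


# The class-level type parameter syntax requires Python 3.12+.
class Sequence[T](Use):
    def __init__(self, func: Callable[[int], T]) -> None:
        super().__init__(self.next)
        self.count = 0
        self.func = func

    def next(self) -> T:
        # Pass the next integer to the user-supplied function.
        self.count += 1
        return self.func(self.count)

email = Sequence[str](lambda n: f'company{n:04}@test.tld')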

> It might be worth discussing how the sequence should be reset.

> Would globally unique or unique per batch be better aligned with your use case? If there is a concept of scopes, it may be a more involved change, since the Factory itself would have to hold state.

I don't really have that need. Just thinking about how it should work. Maybe it would be useful for someone if the counter was reset for each test, but as I said: I don't need it.
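
For anyone who does want a per-test reset, one way it could look is sketched below using only the standard library. `Counter` is a hypothetical stand-in for a sequence field with a `reset()` method; nothing here is an existing polyfactory API.

```python
import unittest
from itertools import count


class Counter:
    """Hypothetical resettable counter standing in for a Sequence field."""

    def __init__(self) -> None:
        self._numbers = count(1)

    def next(self) -> int:
        return next(self._numbers)

    def reset(self) -> None:
        # Start numbering from 1 again.
        self._numbers = count(1)


user_ids = Counter()  # shared across tests, like a field on a factory class


class UserTests(unittest.TestCase):
    def setUp(self) -> None:
        # Runs before every test, so each test starts from 1.
        user_ids.reset()

    def test_first(self) -> None:
        self.assertEqual([user_ids.next(), user_ids.next()], [1, 2])

    def test_second(self) -> None:
        # Independent of how many values earlier tests consumed.
        self.assertEqual(user_ids.next(), 1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Resetting in `setUp` keeps tests independent of execution order, which matters if assertions compare against specific generated values.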

andy-takker commented 1 week ago

Thank you for the example!