fabian-hiller / valibot

The modular and type safe schema library for validating structural data 🤖
https://valibot.dev
MIT License
6.33k stars 204 forks source link

[WIP] Add `uniqueItems` action #870

Open nix6839 opened 1 month ago

nix6839 commented 1 month ago

I have some questions.

  1. When I have an array like [5, 6, 5, 5], would it be better to call _addIssue once and add all the duplicate issues related to 5 into the path array at once, or to call _addIssue multiple times to add them individually?"

  2. It seems that uniqueItems in JSON compares objects through deep equality. Should I follow JSON's behavior, or would simple comparison (===) be sufficient?

Resolves #866

fabian-hiller commented 1 month ago

Without much thought I would use dataset.value.length !== new Set(dataset.value).size to check for duplicates and return a single issue. But it is true that this is not an easy decision. We basically have to decide whether we want one issue for the whole array or one issue for each duplicate item. We also have to decide whether to check for shallow equality or deep equality. For example, shallow equality will not work with object because the library parses the input and therefore copies the data into a a new object when the schema is executed. What do you think is best?

nix6839 commented 1 month ago

We also have to decide whether to check for shallow equality or deep equality. For example, shallow equality will not work with object because the library parses the input and therefore copies the data into a a new object when the schema is executed.

I think it would be best to use deep equality to ensure it operates the same as JSON schema. However, I also believe there will be users who prefer shallow equality. To accommodate those users, how about using deep equality by default but allowing a customizable function in the form of (a, b) => boolean? I am a bit concerned about the potential performance issues with deep equality. What are your thoughts on this?

We basically have to decide whether we want one issue for the whole array or one issue for each duplicate item.

As for the issue handling, I think we should first agree on and implement the functionality above, and then look into how we can handle the issue format afterwards.

fabian-hiller commented 1 month ago

How about we provide a uniqueItems action for a small bundle size and better performance and a deepUniqueItems action that uses a nested loop that deeply compares objects, arrays, maps, sets and dates?

nix6839 commented 1 month ago

I think that's a good idea. I'll try to implement uniqueItems first tomorrow.

fabian-hiller commented 1 month ago

Would you return an issue for every duplicated item or just one for the entire array? This decision can affect the implementation.

nix6839 commented 1 month ago

Since checkItems (which uses a similar naming pattern with -Items instead of -Item) operates by returning an issue for every duplicated item, how about making uniqueItems behave the same way for consistency?

nix6839 commented 1 month ago

I've committed it for now, but feel free to share any other opinions. If it's finalized, I'll start working on updating the documentation.

fabian-hiller commented 1 month ago

Thank you very much! I will try to give you feedback at the end of next week as I have to focus on other things in the next few days.

fabian-hiller commented 3 weeks ago

I just want to let you know that I am focusing on our v1 release first before reviewing this PR.