fabian-hiller / valibot

The modular and type safe schema library for validating structural data 🤖
https://valibot.dev
MIT License
5.91k stars 180 forks

feature request: i18n #36

Closed thundermiracle closed 7 months ago

thundermiracle commented 1 year ago

Thanks for your great library. I think multilingual support is important for this library. For example, our company chose dayjs over date-fns because its i18n makes it far easier to apply Japanese. Would you please consider supporting i18n?

Add an option i18n to parse function:

parse(input: unknown, info?: ParseInfo, i18n?: Record<string, string>)

And get message from i18n in each function:

throw new ValiError([
  {
    validation: 'ends_with',
    origin: 'value',
    message: error || i18n?.["ends_with"] || 'Invalid end',
    input,
    ...info,
  },
]);

This way users can define i18n messages only for the functions they actually use, which keeps the bundle size small. I'm not sure this implementation is OK for this library.
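The idea above can be sketched in isolation. This is a minimal sketch, not Valibot's actual API: the Messages type, the Issue shape, and this endsWith implementation are all hypothetical and only illustrate the lookup order (explicit error, then the i18n record, then the default).

```typescript
// Hypothetical sketch: none of these names are Valibot's real API.
type Messages = Record<string, string>;

interface Issue {
  validation: string;
  message: string;
  input: unknown;
}

// A validation function that looks up its message in an optional record.
function endsWith(requirement: string, error?: string) {
  return (input: string, i18n?: Messages): Issue[] => {
    if (input.endsWith(requirement)) return [];
    return [
      {
        validation: 'ends_with',
        // Priority: explicit error, then the i18n record, then the default.
        message: error || i18n?.['ends_with'] || 'Invalid end',
        input,
      },
    ];
  };
}

// Only the messages for the functions in use need to be defined.
const ja: Messages = { ends_with: '末尾が正しくありません' };

const check = endsWith('.com');
const issuesJa = check('user@example.org', ja);
const issuesDefault = check('user@example.org');
```

Because the record is passed per call, nothing is stored globally, and a bundler only sees the messages the application defines itself.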

fabian-hiller commented 1 year ago

Great idea and thanks for the API example. For my part, there is nothing against implementing it that way. However, I will wait a bit for feedback from other users before going into implementation.

fabian-hiller commented 1 year ago

Until the feature is implemented, error messages can be displayed in other languages as a workaround as described here: https://valibot.dev/guides/schemas/#error-messages

thundermiracle commented 1 year ago

@fabian-hiller Thanks for your response. I can help implement this simple feature if it's needed. Just notify me.

fabian-hiller commented 1 year ago

Will do. Thank you for the offer.

zkulbeda commented 1 year ago

Maybe add the i18n into info argument object? It's like context.

parse(Schema, input, {
  i18n: {...},
  abortEarly: true,
})

// in schema's parse func
throw new ValiError([
  {
    validation: 'ends_with',
    origin: 'value',
    message: error || info.i18n?.["ends_with"] || 'Invalid end',
    input,
    ...info,
  },
]);

And rename i18n to messages

fabian-hiller commented 1 year ago

@zkulbeda yes, that was my first thought too. Thank you for your contribution!

samuelstroschein commented 1 year ago

@fabian-hiller We settled on a Record that uses BCP-47 language tags, similar to what @thundermiracle suggested, see https://github.com/inlang/inlang/blob/08ab2f846a1c47455fcb038b67b1f1b7725c5fdf/source-code/core/language-tag/src/schema.ts#L18-L22.

Using a record is the simplest API possible and should be sufficient. I am unsure whether you need to force correct LanguageTag usage like we do, given that validation schemas are defined and used by the same consumer. If valibot consumers are supposed to share validation schemas, then enforcement of BCP-47 language tags as keys of the record would make sense.
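A record keyed by language tags could be consumed roughly as follows. This is a minimal sketch under assumptions: the Translatable type name and the pickMessage helper are hypothetical, and the lookup simply drops trailing subtags (for example de-AT falls back to de), which is a simplification of real BCP-47 matching.

```typescript
// Hypothetical sketch: a plain record keyed by language tags.
type Translatable = Record<string, string>;

// Try the full tag first, then progressively shorter prefixes, then the fallback.
function pickMessage(messages: Translatable, tag: string, fallback: string): string {
  const parts = tag.split('-');
  while (parts.length > 0) {
    const candidate = messages[parts.join('-')];
    if (candidate !== undefined) return candidate;
    parts.pop();
  }
  return fallback;
}

const passwordError: Translatable = {
  en: 'Your password is too short',
  de: 'Dein Passwort ist zu kurz',
};

const austrian = pickMessage(passwordError, 'de-AT', passwordError['en'] ?? 'Invalid');
const french = pickMessage(passwordError, 'fr', 'Invalid');
```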

fabian-hiller commented 1 year ago

Thanks for your feedback @samuelstroschein! I'll wait another week or two and then do a first implementation.

ivanhofer commented 1 year ago

I really like the approach zod uses. It is really flexible and powerful. You can define a global errorMap (which is a function and not a record) and also pass it to each .parse call if you want to override it.

fabian-hiller commented 1 year ago

@ivanhofer thank you for your feedback! Can you add a code example here to the comments?

thundermiracle commented 1 year ago

Zod's error messages are flexible: they can combine parameters with the message text for more detail. For example, the error message for includes is not just Invalid value but Invalid input: must include "${issue.validation.includes}".

See: https://github.com/colinhacks/zod/blob/master/src/locales/en.ts#L4-L148

As this is a global function, you can override it by calling setErrorMap.
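To illustrate the difference between a message record and an error map function, here is a minimal, self-contained sketch. It does not use Zod's real types: Issue, ErrorMap, setErrorMap and formatIssue below are hypothetical stand-ins that only mimic the idea of a globally replaceable function that receives issue parameters.

```typescript
// Hypothetical sketch: these types mimic the idea, not Zod's or Valibot's API.
type Issue =
  | { code: 'includes'; value: string }
  | { code: 'min_length'; minimum: number };

type ErrorMap = (issue: Issue) => string;

// A function can interpolate issue parameters, which a plain string record cannot.
const en: ErrorMap = (issue) => {
  switch (issue.code) {
    case 'includes':
      return `Invalid input: must include "${issue.value}"`;
    case 'min_length':
      return `Must contain at least ${issue.minimum} character(s)`;
  }
};

const de: ErrorMap = (issue) => {
  switch (issue.code) {
    case 'includes':
      return `Ungültige Eingabe: muss "${issue.value}" enthalten`;
    case 'min_length':
      return `Muss mindestens ${issue.minimum} Zeichen enthalten`;
  }
};

// Module-level default that can be replaced globally.
let currentErrorMap: ErrorMap = en;

function setErrorMap(map: ErrorMap): void {
  currentErrorMap = map;
}

// A per-call map overrides the global one.
function formatIssue(issue: Issue, map: ErrorMap = currentErrorMap): string {
  return map(issue);
}
```

The global variable is what makes this pattern awkward for a library built from independent pure functions.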

To make it clearer:

  1. Valibot is a validation library based on pure functions, so we can't import global error message settings as Zod does
  2. Zod's error messages are more flexible now, but that's not the problem this issue tries to fix, IMO

To achieve the Zod errorMap way, we should:

  1. Redesign the error message system so that it returns not only simple strings but parameters as well. This may or may not require errorMap to be a function
  2. Then come back and replace the error message record (object?) with this new errorMap

fabian-hiller commented 1 year ago

Thank you for the detailed info. In the implementation it is important for me that the influence on the bundle size is small, because the ValiError class is used in almost every function. In the best case the bundle size should only change when the i18n feature is used.

gmaxlev commented 1 year ago

I really appreciate the approach of this library, where everything is broken down into small functions. So, why not let error be a function or a string instead of creating a large message registry within the library? This would allow developers to handle localization as they see fit and avoid increasing the size of Valibot, keeping it focused on validation tasks. I assume that if a developer needs internationalization, they can use a separate library for that purpose alongside Valibot.

What I mean:

import { messages } from "/messages"

const LoginSchema = objectAsync({
  password: string((info) => { return messages['password'][info.lang] }),
});

parse(LoginSchema, input, {
  lang: req.lang
})

Instead of req it can be localStorage, React Context or whatever we use.

Also, to make the code more declarative, a developer can write a separate function that returns a function that dynamically returns a message.

import { messages } from "/messages"

function i18n(key: string) {
  return (context) => {
    return messages[key][context.lang]
  }
}

const LoginSchema = objectAsync({
  password: string(i18n('password')),
});

Under the hood, we can check if the error is a normal string or a function

function parseError(defaultMessage: string, error: string | Function, info) {
  if (!error) {
    return defaultMessage
  }
  return typeof error === 'string' ? error : error(info)
}

// ...
throw new ValiError([         
  {
    message: parseError('Invalid end', error, info),
    input,
    ...info,
  },
]);
// ...

In my opinion, this approach allows developers to use internationalization at their discretion and without the restrictions imposed by the library.

fabian-hiller commented 1 year ago

Thank you for this idea. I like the approach. I am curious about the opinion of others.

samuelstroschein commented 1 year ago

@gmaxlev's approach is similar to the one we settled on, except that we do i18n statically by defining a Record as a return value that contains the language tags/locales.

From https://github.com/inlang/inlang/blob/b1a20f69ad07a022447527c5cb67088768803599/source-code/plugins/json/src/plugin.ts#L69-L76:

export const plugin: Plugin<PluginOptions> = {
    meta: {
        displayName: { en: "JSON" },
        description: { en: "JSON plugin for inlang" },
    },
    // ...
};

Using the static API for valibot with the examples that @gmaxlev gave could look like:

const LoginSchema = objectAsync({
  password: string({
    en: "Your password must ....",
    de: "Dein Passwort muss ..."
  }),
});

// using an i18n function that returns a Record is ofc also possible

function i18n(id: string): Record<Language, string> { 
  return // ...
}

const LoginSchema = objectAsync({
  password: string(i18n("password-error")),
});

Pros

Cons

Hmmm, unsure which approach is definitely better. Static definitions of translations seem easier to use for users, more maintainable for valibot, and overall simpler. In any case, if you expose something language related, it seems advisable to expose (enforce) BCP47 language tags to ensure interop across the web platform. Inlang is building a small language tag lib with types and a list of all tags that could be used.

ivanhofer commented 1 year ago

With the current architecture of the independent validation functions the main question is:

How do you know what language to render?

On the client this would be easy as there is probably only a single language and you could just set a variable in a global scope and then access it like this:

const loadEnglishErrorMessages = () => {
    globalThis.valibot.ERROR_MSG_INVALID_FINITE_NUMBER = 'Invalid finite number'
}

throw new ValiError([
    {
        validation: 'finite',
        origin: 'value',
        message: error || globalThis.valibot.ERROR_MSG_INVALID_FINITE_NUMBER,
        input,
        ...info,
    },
]);

But this won't work in a shared server context as you would constantly override the messages.

This means, you need to pass the language information or the translated error message already to the validation function (the error?: string parameter), which leads to a worse DX.


Another option would be to have an addErrorMessages function that receives an array of ValiErrors and then adds a translated error message if the error was not passed directly to the individual validation function.

With this approach you don't need to pass the language information around (which is a real pain) and instead only need the language information a single time when you want to output the error messages.

import { addErrorMessages } from 'valibot/i18n/de' // built-in for most common languages, or a custom function

try {
   parse(Schema, input)
} catch (errors) {
    const errorWithMessages = addErrorMessages(errors)
}

But this would mean that you will need to load all error messages of all validation functions even if you only use 3 of them.
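A post-processing step of this kind might look like the sketch below. Everything here is an assumption for illustration: the Issue shape is simplified, the deMessages record stands in for a per-language module such as the proposed 'valibot/i18n/de', and the generic fallback text is made up.

```typescript
// Hypothetical shapes: Valibot's real issue type differs.
interface Issue {
  validation: string;
  message?: string; // set only when a message was passed to the validation function
  input: unknown;
}

type TranslatedIssue = Issue & { message: string };

// Stand-in for a per-language module; it has to list every validation it supports.
const deMessages: Record<string, string> = {
  finite: 'Ungültige endliche Zahl',
  email: 'Ungültige E-Mail-Adresse',
};

// Fill in a translated message only where no explicit message exists.
function addErrorMessages(issues: Issue[]): TranslatedIssue[] {
  return issues.map((issue) => ({
    ...issue,
    message: issue.message ?? deMessages[issue.validation] ?? 'Ungültige Eingabe',
  }));
}

const translated = addErrorMessages([
  { validation: 'finite', input: Infinity },
  { validation: 'email', message: 'Custom message', input: 'nope' },
  { validation: 'unknown_rule', input: 1 },
]);
```

The language is needed only once, at output time, but the message record above is exactly the part that cannot be tree-shaken per validation function.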

The errorMap approach of zod would be similar. If you want, you can customize the addErrorMessages function and include only the messages you really use, but probably nobody will do that as it is a manual and error-prone approach.


The third option would be to generate an individual bundle for each language.

import { finite } from 'valibot/de'

Then you could keep the small bundle size, but are limited to the built-in translations (which could be easily extended by opening a PR to the valibot repository).


Summary:

Things get complicated quickly once you add i18n ^^ @fabian-hiller let me know if the examples above are clear or if you have some questions.


Just my two cents to the approaches mentioned in some comments above:

@gmaxlev This is a good option without changing the internals that much, but there are 2 downsides with this approach:

lucaschultz commented 1 year ago

Also, to make the code more declarative, a developer can write a separate function that returns a function that dynamically returns a message.

import { messages } from "/messages"

function i18n(key: string) {
  return (context) => {
    return messages[key][context.lang]
  }
}

const LoginSchema = objectAsync({
  password: string(i18n('password')),
});

I quite like the idea. I think this should compose well with our use of i18next. I would suggest adding examples with different i18n libraries in the docs though. It was quite hard to figure out how to use the Zod error maps properly.

gmaxlev commented 1 year ago

In reality, I believe that the current modular architecture of the library imposes certain constraints on how i18n can be implemented.

Due to the modular design of our API, a bundler can use the import statements to remove the code you don't need.

I suppose that if Valibot were to provide a set of standard error messages, it would also make sense to structure them within a modular paradigm. However, it's not entirely clear how, within the current paradigm, we can load only the translations that are truly needed and enable the bundler to avoid including unnecessary ones.

Currently, Valibot consists of small functions that we import. I assume that if Valibot were to offer a set of built-in default messages in different languages, it would become quite inconvenient to constantly import required translations and manage them.

To strike a balance between convenience and modularity, perhaps we should create an object that stores messages in various languages, initially empty

/library/src/messages.ts

const storage = {} // empty

export function addMessage(key: string, locale: string, message: string | Function): void
export function getMessage(key: string, locale: string): string | Function

Now, each imported locale file will simply register a message in the storage without returning anything.

/locale/email/en.ts

import { addMessage } from "/library/src/messages"

addMessage('email', 'en', 'Invalid email')
// or
addMessage('email', 'en', context => `${context.value} is an incorrect email address`)

export default {}

Furthermore, this approach enables the user to declaratively choose to what extent they are willing to expand the size of their bundle.

// i don't care about bundle size.
import 'valibot/locales';

// i prefer to be mindful of the bundle size.
import 'valibot/locales/en';
import 'valibot/locales/gr';
import 'valibot/locales/email/en';
import 'valibot/locales/email/gr';

Now, error handling should look something like this

import { getMessage } from "/library/src/messages"

function parseError(defaultMessage: string, key: string, error: string | Function, info) {
  if (error) {
     return typeof error === 'string' ? error : error(info)
  }

  const fromStorage = getMessage(key, info.locale)

  if (fromStorage) {
     return typeof fromStorage === 'string' ? fromStorage : fromStorage(info)
  }

  return defaultMessage
}

// ...
throw new ValiError([         
  {
    message: parseError('Invalid email', 'email', error, info),
    input,
    ...info,
  },
]);
// ...

This way, only the imported locales will be added to the storage, allowing users to decide the extent to which they optimize their bundle size.

Furthermore, it might be worth considering incorporating 'addMessage' as part of the API, enabling users to globally override messages. Additionally, providing the ability to use error messages locally would be beneficial if a user wishes to avoid global translations.

Well, these are just rough outlines of what could be done. However, I'm not sure if such a solution fits within the current architecture of the library.
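Put together, the registry outlined above could look roughly like this. It is a minimal sketch under assumptions: addMessage, getMessage and parseError follow the names used in this comment, but the Map-based storage, the Context shape, and the key format are hypothetical choices, not a published Valibot API.

```typescript
// Hypothetical sketch of the message registry described above.
type Context = { value: unknown; locale?: string };
type Message = string | ((context: Context) => string);

// Starts empty; only imported locale modules would add entries.
const storage = new Map<string, Message>();

function addMessage(key: string, locale: string, message: Message): void {
  storage.set(`${key}:${locale}`, message);
}

function getMessage(key: string, locale: string): Message | undefined {
  return storage.get(`${key}:${locale}`);
}

function resolve(message: Message, context: Context): string {
  return typeof message === 'string' ? message : message(context);
}

// Order: explicit error, then the registry entry for the locale, then the default.
function parseError(
  defaultMessage: string,
  key: string,
  error: Message | undefined,
  context: Context
): string {
  if (error) return resolve(error, context);
  const fromStorage = context.locale ? getMessage(key, context.locale) : undefined;
  return fromStorage ? resolve(fromStorage, context) : defaultMessage;
}

// What a locale module would do as an import side effect.
addMessage('email', 'en', 'Invalid email');
addMessage('email', 'de', (context) => `${String(context.value)} ist keine gültige E-Mail-Adresse`);
```

Since registration happens as an import side effect, the bundle contains only the locale modules that were explicitly imported.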

thundermiracle commented 1 year ago

What about accepting both a string and a function when passing error messages?

interface GenerateMessageParams {
  // custom message passed directly to the validation function
  error?: string;
  // messages keyed by validation type; a function receives context such as the value
  i18n?: Record<string, string | ((context: { value: unknown }) => string)>;
  // better to gather all validation types to one Validation type instead of string
  validation: string;
  // "default" is a reserved word, so it cannot be used as a variable name
  defaultMessage: string;
  value: unknown;
}
const generateMessage = ({ error, i18n, validation, defaultMessage, value }: GenerateMessageParams) => {
  if (error) return error;
  const message = i18n?.[validation];
  if (message) {
    return typeof message === "string" ? message : message({ value });
  }

  return defaultMessage;
}

throw new ValiError([
  {
    validation: 'ends_with',
    origin: 'value',
    message: generateMessage({ error, i18n: info.i18n, validation: 'ends_with', defaultMessage: 'Invalid end', value: input }),
    input,
    ...info,
  },
]);

It's quite like @gmaxlev's solution but without defining a global storage which is not preferable in functional programming.

fabian-hiller commented 1 year ago

Thank you very much for your feedback. I will read everything carefully soon and then make a decision and start the implementation.

samuelstroschein commented 1 year ago

We externalized Translatable. Might be usable for valibot without defining a new API.

fabian-hiller commented 1 year ago

Thank you for your contributions to this issue! In summary, the requirements for i18n are as follows:

  1. The default error messages need to be available in more languages
  2. The default error messages must be globally overridable
  3. Specific error messages must be dynamically customizable

When implementing i18n for Valibot, it is important to keep the bundle size to a minimum. For example, if only string and email are needed, it should be possible that the final bundle contains only these translations.

For point 1, I currently find @gmaxlev's proposal to be the best. I especially like the granular import of needed translations. The same procedure would also make it possible to define custom global error messages for the individual functions (point 2).

import 'valibot/i18n';           // Import any translation
import 'valibot/i18n/de';        // Import specific language
import 'valibot/i18n/de/email';  // Import specific message

Point 3 is about specific messages that are assigned selectively. I think it would be best if these could be passed as parameters of the respective functions. Besides a string, which is already supported, it should also be possible to pass a function. This gives you the flexibility to create error messages dynamically. It will also probably allow to connect to common i18n libraries.

const Schema = string([email((info) => …)]);

The last open question is how to specify the language to use. At the moment I think it is best to be able to specify it optionally via the info parameter of parse and safeParse.

const output = parse(Schema, input, { lang: 'de' });
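How a function-based message and a lang option could connect to an i18n library is sketched below. This is not Valibot's implementation: the Info type, this email function, the catalog, and translate (a stand-in for an i18n library's lookup function) are all hypothetical.

```typescript
// Hypothetical sketch: `translate` stands in for an i18n library's lookup function.
interface Info {
  lang?: string;
}

type ErrorMessage = string | ((info: Info) => string);

const catalog: Record<string, Record<string, string>> = {
  en: { 'email.invalid': 'Please enter a valid email address' },
  de: { 'email.invalid': 'Bitte gib eine gültige E-Mail-Adresse ein' },
};

// Falls back to English, then to the key itself.
function translate(key: string, lang = 'en'): string {
  return catalog[lang]?.[key] ?? catalog['en']?.[key] ?? key;
}

// The message may be a string or a function that receives the parse info.
function email(error: ErrorMessage = 'Invalid email') {
  return (input: string, info: Info = {}): string | null => {
    if (/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(input)) return null;
    return typeof error === 'string' ? error : error(info);
  };
}

// The language is chosen per call, so nothing global has to be mutated.
const validateEmail = email((info) => translate('email.invalid', info.lang));
const germanError = validateEmail('nope', { lang: 'de' });
const defaultError = validateEmail('nope');
```

Passing the language per call keeps the functions pure, which also matters on a server that handles requests in several languages at once.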

I look forward to feedback from you all. Especially on the final implementation.

fabian-hiller commented 1 year ago

With v0.17.0 the first step is now done and functions can be passed as parameters besides strings.

fabian-hiller commented 9 months ago

i18n is the next big feature I will be working on in the coming weeks.

CanRau commented 9 months ago

Not sure what the plans are exactly but ParaglideJS's (by Inlang) approach looks promising although still very early days

fabian-hiller commented 9 months ago

Thank you for the hint! I will have a look at it.

samuelstroschein commented 9 months ago

@fabian-hiller you can ping @lorissigrist if you have questions about paraglide js

fabian-hiller commented 9 months ago

Thank you! I've sent him a friend request on Discord.

fabian-hiller commented 7 months ago

I have started implementing our i18n feature. I plan to publish my changes as a draft PR in the next few hours.

samuelstroschein commented 7 months ago

@fabian-hiller are you using paraglide-js ? cc @lorissigrist

fabian-hiller commented 7 months ago

No, but it will allow our users to use Valibot with Paraglide JS for their translations. I am in exchange with Loris on Discord.

fabian-hiller commented 7 months ago

I plan to publish my changes as a draft PR in the next few hours.

I'm still working on it. I'll let you know when the first draft is ready.

fabian-hiller commented 7 months ago

The draft PR is finally ready: https://github.com/fabian-hiller/valibot/pull/397

fabian-hiller commented 7 months ago

v0.28.0 with i18n is available: https://valibot.dev/guides/internationalization/