Open ST-DDT opened 8 months ago
Sorry I don't totally understand if this is a internal change or would affect consumers.
If I had existing code like
const { faker } = require("@faker-js/faker")
const email = faker.internet.email()
Would that need to change?
Would that need to change?
Old code does not need to change
Could/as proposed:
const { fakerCoreEN, email } = require("@faker-js/faker")
const email = email(fakerCoreEN)
// This is a few KB smaller after bundling than the old code (~620KB? vs ~600KB?)
At full "capacity" (probably not part of the first implementation):
const { email } = require("@faker-js/faker/locale/en")
const email = email()
// This could be tiny (~5KB?)
Provided it doesn't need to change that's fine. Anything which would require consumers to change the syntax of every call to faker they are currently making in existing code would be a big con.
But if this is only for new code, or if you want to take advantage of tree shaking, that's OK.
i think its also important to make sure that tree-shaking actually works after making this change!
This is going to require a LOT of changes to the code; we don't want to spend a long time doing that but then find out that there is some unrelated reason that causes tree-shaking to not be possible with this architecture. Maybe we could start with a proof of concept that this actually allows for tree-shaking to work before committing to changing this, maybe just for a single module or method?
At first: thanks very much for taking the effort and proposing this draft via a GitHub issue π
I already know a little bit more behind the scenes via private conversations and faker meetings, but I also want to write this down here to be more clear and referenceable in the future:
I'm 1_000% with @matthewmayer and want to be able to stick to the current usage of faker and so this proposal / implementation is just an addition / opt-in if wanted to benefit from. Also that means that there will no breaking changes introduced by this implementation and all current tests (except of the deprecation removals for sure) will still work afterwards.
However, I might be fine if we need something as a workaround then to support tree-shaking like import { email } from '@faker-js/faker/tree-shaked'
.
Furthermore, like @matthewmayer I would really love to see a working implementation that results in <<< 600 KiB
like you proposed here, before actually rewriting the whole src code base.
At full "capacity" (probably not part of the first implementation):
const { email } = require("@faker-js/faker/locale/en") const email = email() // This could be tiny (~5KB?)
And one last wish: I would prefer if we could do all the deprecation removals first, so we have more clean test files and also less to convert.
Tracking PR is mostly here (it got slightly more a tracking issue right now π )
Team Decision
Blocked by #2628
const { fakerCoreEN, email } = require("@faker-js/faker") const email = email(fakerCoreEN) // This is a few KB smaller after bundling than the old code (~620KB? vs ~600KB?)
Regarding the new syntax for this what happens to the module names? (Ie "internet" for email?)
I dont have a final idea for this, but I thought about omitting it in the export names unless required for disambiguation.
Assigning the aliases/export names might be a manual task.
It would be absolutely cool if it is somehow possible to call these functions without passing a specific faker instance, and it falls back to a default faker instance. But not sure yet if that's possible.
I would be in favor for sampleString()
because this reads a bit more naturally π€
But I have just put 2 seconds of brain power into that π
It would be absolutely cool if it is somehow possible to call these functions without passing a specific faker instance, and it falls back to a default faker instance. But not sure yet if that's possible.
I thought about exposing all methods bound to a specific locale (as minimal dataset) e.g. from the locale/en folder.
Note: I cut quite a few related/derived proposals from this issue entirely, because it is long enough as is:
- Nano-Localized-Functions for min bundle sizes
I consider that off-topic for the discussion, as the topic is already big enough as is.
It would be absolutely cool if it is somehow possible to call these functions without passing a specific faker instance, and it falls back to a default faker instance.
But not sure yet if that's possible.
I thought about exposing all methods bound to a specific locale (as minimal dataset) e.g. from the locale/en folder.
Oh, I absolutely forgot π€¦, many of the functions do not need a localized faker instance at all and will just work more or less out of the box. So my request/idea was more related to just these non-locale functions. The localized functions can / should depend on a requested faker instance anyway.
Anyway, if something is confusing, I can also just shut up and do vacation if you want π
many of the functions do not need a localized faker instance
But they still need the randomizer from it. We can discuss where to put them, after we have the capability to actually do so.
So here my plan to implement this feature.
FakerCore
the container that stores all mutable state/data.
Is that OK for you?
As this is a big change and will likely need several preparatory pull requests before it gets to the point where you can actually use individual tree-shakable functions, perhaps it would make sense to merge these into a dedicated branch rather than merge directly into next? That way if there are unanticipated problems along the way, there is the freedom to postpone all the changes and release them together in a later release than 9.0.
Clear and concise description of the problem
Currently the faker methods are organized in modules and these are joined to the whole Faker class. Most methods in the modules reference a function in a different module, by invoking it via the embedded Faker reference. This makes it near impossible for building tools to drop unused parts of code package.
Changing the implementation to use standalone methods, that only call other standalone methods, mostly solves this issue. Old code/syntax (e.g. faker.person.firstName) must remain valid.
Suggested solution
1. Extract the functions from the module classes into their own files (the same applies to internal helper methods)
2. Change the signature of the methods, so that it replaces it the reference to the main
Faker
instance e.g. by passing it as argument.fakerCore β> Faker without any modules (only definitions, randomizer, config)
3. Call the standalone functions from the modules.
4. Optionally, "process" the function to make it easier to use with the same/a fakerCore.
Usage:
Usage nano bound standalone module function
````ts // in locales/en.ts const nanoCoreFirstName: FakerCore = { randomizer, config, locale: { person: { first_name }}}; // Contains only the required locale data for firstName export const enFirstName = firstName.withCore(nanoCoreFirstName); // by users enFirstName() // Henry ```` The nano bound standalone module function contain only the data that are needed for that specific function and nothing else. So they are tiny after treeshaking 1-10KB instead of the full 600KB for en.Pros:
firstName.isAvailable = (fakerCore) => fakerCore.rawDefinitions.person.firstName != null;
Cons:
export const
instead ofexport function
.vs
Alternative
I also considered making the
fakerCore
parameter the last parameter and optional by defaulting to e.g.fakerCoreEN
.However that has a few disadvantages:
multiple
to pass the argument on. This basically prevents using derive in combination with pass-through parameters (IIRC currently only supported by unique)Functions with pass-through parameters might result in easier to read code, but that belongs into a different proposal
````ts function multiple(source: (...args: A[], fakerCore) => T, options, fakerCore: FakerCore, ...args: A[]) { // Impossible } multiple(firstName, { count: 5}, fakerCoreEN, 'male'); ````Additional context
This is an attempt at detailing the proposal made in #1250
1250
It was made with a change like #2664 in mind to retain optimal performance.
2664
This proposal will not remove or deprecate the well known
faker.person.firstName
syntax.Adopting this proposal in combination with #2664 has side effects for methods that accept other methods as parameters, as they have to pass on the fakerCore instance.
````ts function multiple(fakerCore: FakerCore, source: (fakerCore: FakerCore) => T, options) { cosnt derived = fakerCore.derive(); return Array.from({length: options.count}, () => source(derived)); } ```` If the user does not use the derived instance, then they use more seeds from their main instance and might even use different fakerCores for the multiple function and the source function. ````ts // BAD multiple(fakerCoreEN, () => firstName(fakerCoreFR), { count: 3 }); // GOOD multiple(fakerCoreEN, (core) => firstName(core), { count: 3 }); // BETTER? multiple(fakerCoreEN, firstName, { count: 3 }); ```` This also affects other functions such as `uniqueArray` and `unique`. Aka that extends the scope of faker to some (helper) functions that call faker functions indirectly.Note: I cut quite a few related/derived proposals from this issue entirely, because it is long enough as is:
- Nano-Localized-Functions for min bundle sizes - Most of the Fakerize Meta-Framework - Code-Transformer - Most of the extension framework - Implementation details - generateComplex(fakerCore, { key: keyof T: (fakerCore) => T[key] }): T - ... I just though about these while writing the ticket, thus it does not imply that I actually plan on adding any of these. (I likely forgot a few as well, because I only wrote them down at the endπ.)Usage Example