Open charlax opened 5 years ago
This looks definitely doable, as the code that uses faker is quite contained. Would you be able to take a look at it and submit a patch?
Issue #271 suggests depending on faker more to eventually remove the fuzzy
module from the project. Removing faker goes in the opposite direction.
Maybe faker could provide extra dependencies for less common generators? Roughly, specifying some faker modules like extra dependencies.
Looking at faker modules size, it looks like localization takes a lot of disk space. Perhaps locales could be specified as extra dependencies, so that general-purpose fields remain a requirement (I have FuzzyInteger, FuzzyFloat, FuzzyDecimal, FuzzyDate, etc. in mind) but locales are extra dependencies. One would then pip install Faker[fr_FR] to get Faker to generate French data.
60K faker/utils 12K faker/providers/bank/it_IT 12K faker/providers/bank/pl_PL 12K faker/providers/bank/fr_FR 12K faker/providers/bank/nl_NL 12K faker/providers/bank/de_DE 12K faker/providers/bank/en_GB 12K faker/providers/bank/de_AT 12K faker/providers/bank/no_NO 12K faker/providers/bank 8.0K faker/providers/credit_card/en_US 20K faker/providers/credit_card 12K faker/providers/automotive/sv_SE 12K faker/providers/automotive/en_US 12K faker/providers/automotive/ar_JO 12K faker/providers/automotive/pl_PL 12K faker/providers/automotive/pt_BR 12K faker/providers/automotive/en_CA 12K faker/providers/automotive/hu_HU 12K faker/providers/automotive/ar_SA 16K faker/providers/automotive/de_DE 16K faker/providers/automotive/ru_RU 12K faker/providers/automotive/en_GB 12K faker/providers/automotive/en_NZ 12K faker/providers/automotive/id_ID 12K faker/providers/automotive/ar_PS 12K faker/providers/automotive 12K faker/providers/internet/uk_UA 12K faker/providers/internet/sv_SE 12K faker/providers/internet/it_IT 12K faker/providers/internet/en_US 12K faker/providers/internet/fr_CH 12K faker/providers/internet/pt_PT 12K faker/providers/internet/zh_CN 12K faker/providers/internet/el_GR 12K faker/providers/internet/pl_PL 12K faker/providers/internet/fr_FR 12K faker/providers/internet/cs_CZ 12K faker/providers/internet/sl_SI 12K faker/providers/internet/pt_BR 12K faker/providers/internet/fi_FI 12K faker/providers/internet/zh_TW 12K faker/providers/internet/ar_AA 12K faker/providers/internet/hu_HU 12K faker/providers/internet/ja_JP 12K faker/providers/internet/bg_BG 12K faker/providers/internet/en_AU 12K faker/providers/internet/fa_IR 12K faker/providers/internet/bs_BA 12K faker/providers/internet/ko_KR 12K faker/providers/internet/de_DE 12K faker/providers/internet/ru_RU 12K faker/providers/internet/sk_SK 12K faker/providers/internet/en_NZ 12K faker/providers/internet/id_ID 12K faker/providers/internet/hr_HR 12K faker/providers/internet/de_AT 12K faker/providers/internet/no_NO 40K 
faker/providers/internet 20K faker/providers/job/uk_UA 12K faker/providers/job/en_US 88K faker/providers/job/fr_CH 64K faker/providers/job/zh_CN 20K faker/providers/job/pl_PL 56K faker/providers/job/fr_FR 48K faker/providers/job/pt_BR 20K faker/providers/job/fi_FI 36K faker/providers/job/zh_TW 16K faker/providers/job/ar_AA 44K faker/providers/job/hu_HU 12K faker/providers/job/fa_IR 348K faker/providers/job/bs_BA 40K faker/providers/job/ko_KR 44K faker/providers/job/ru_RU 36K faker/providers/job/hy_AM 12K faker/providers/job/th_TH 28K faker/providers/job/hr_HR 64K faker/providers/job 8.0K faker/providers/file/en_US 24K faker/providers/file 8.0K faker/providers/currency/en_US 28K faker/providers/currency 28K faker/providers/address/uk_UA 24K faker/providers/address/sv_SE 28K faker/providers/address/it_IT 32K faker/providers/address/en_US 28K faker/providers/address/fr_CH 16K faker/providers/address/es 40K faker/providers/address/pt_PT 28K faker/providers/address/zh_CN 380K faker/providers/address/el_GR 24K faker/providers/address/de 40K faker/providers/address/pl_PL 36K faker/providers/address/fr_FR 68K faker/providers/address/cs_CZ 104K faker/providers/address/sl_SI 56K faker/providers/address/pt_BR 44K faker/providers/address/fi_FI 24K faker/providers/address/zh_TW 24K faker/providers/address/en_CA 32K faker/providers/address/hu_HU 48K faker/providers/address/ja_JP 140K faker/providers/address/ka_GE 24K faker/providers/address/en 20K faker/providers/address/es_MX 140K faker/providers/address/nl_BE 64K faker/providers/address/ne_NP 124K faker/providers/address/nl_NL 24K faker/providers/address/en_AU 28K faker/providers/address/fa_IR 16K faker/providers/address/es_ES 40K faker/providers/address/ko_KR 28K faker/providers/address/de_DE 88K faker/providers/address/ru_RU 260K faker/providers/address/sk_SK 56K faker/providers/address/hy_AM 24K faker/providers/address/en_GB 52K faker/providers/address/he_IL 24K faker/providers/address/en_NZ 32K 
faker/providers/address/id_ID 36K faker/providers/address/hr_HR 24K faker/providers/address/de_AT 12K faker/providers/address/no_NO 28K faker/providers/address/hi_IN 16K faker/providers/address 8.0K faker/providers/user_agent/en_US 20K faker/providers/user_agent 12K faker/providers/phone_number/uk_UA 12K faker/providers/phone_number/sv_SE 12K faker/providers/phone_number/it_IT 12K faker/providers/phone_number/en_US 12K faker/providers/phone_number/fr_CH 12K faker/providers/phone_number/tr_TR 12K faker/providers/phone_number/pt_PT 12K faker/providers/phone_number/zh_CN 12K faker/providers/phone_number/ar_JO 12K faker/providers/phone_number/el_GR 12K faker/providers/phone_number/pl_PL 12K faker/providers/phone_number/fr_FR 12K faker/providers/phone_number/cs_CZ 12K faker/providers/phone_number/sl_SI 12K faker/providers/phone_number/pt_BR 12K faker/providers/phone_number/fi_FI 12K faker/providers/phone_number/zh_TW 12K faker/providers/phone_number/dk_DK 12K faker/providers/phone_number/en_CA 12K faker/providers/phone_number/tw_GH 12K faker/providers/phone_number/hu_HU 12K faker/providers/phone_number/ja_JP 12K faker/providers/phone_number/bg_BG 12K faker/providers/phone_number/es_MX 12K faker/providers/phone_number/nl_BE 12K faker/providers/phone_number/ne_NP 12K faker/providers/phone_number/nl_NL 12K faker/providers/phone_number/en_AU 12K faker/providers/phone_number/fa_IR 12K faker/providers/phone_number/bs_BA 12K faker/providers/phone_number/es_ES 12K faker/providers/phone_number/ko_KR 12K faker/providers/phone_number/de_DE 12K faker/providers/phone_number/ru_RU 12K faker/providers/phone_number/lv_LV 12K faker/providers/phone_number/sk_SK 12K faker/providers/phone_number/hy_AM 12K faker/providers/phone_number/th_TH 24K faker/providers/phone_number/en_GB 12K faker/providers/phone_number/he_IL 12K faker/providers/phone_number/en_NZ 12K faker/providers/phone_number/lt_LT 12K faker/providers/phone_number/id_ID 12K faker/providers/phone_number/hr_HR 12K 
faker/providers/phone_number/no_NO 12K faker/providers/phone_number/hi_IN 16K faker/providers/phone_number/ar_PS 12K faker/providers/phone_number 8.0K faker/providers/python/en_US 20K faker/providers/python 12K faker/providers/ssn/uk_UA 12K faker/providers/ssn/sv_SE 12K faker/providers/ssn/it_IT 20K faker/providers/ssn/en_US 12K faker/providers/ssn/fr_CH 12K faker/providers/ssn/pt_PT 96K faker/providers/ssn/zh_CN 12K faker/providers/ssn/el_GR 12K faker/providers/ssn/pl_PL 12K faker/providers/ssn/mt_MT 12K faker/providers/ssn/fr_FR 12K faker/providers/ssn/cs_CZ 12K faker/providers/ssn/sl_SI 12K faker/providers/ssn/pt_BR 12K faker/providers/ssn/fi_FI 12K faker/providers/ssn/zh_TW 12K faker/providers/ssn/dk_DK 12K faker/providers/ssn/es_CA 12K faker/providers/ssn/et_EE 12K faker/providers/ssn/lb_LU 12K faker/providers/ssn/en_CA 16K faker/providers/ssn/hu_HU 12K faker/providers/ssn/bg_BG 12K faker/providers/ssn/en_IE 12K faker/providers/ssn/nl_BE 12K faker/providers/ssn/nl_NL 16K faker/providers/ssn/es_ES 12K faker/providers/ssn/ko_KR 12K faker/providers/ssn/de_DE 12K faker/providers/ssn/ro_RO 12K faker/providers/ssn/ru_RU 12K faker/providers/ssn/lv_LV 12K faker/providers/ssn/sk_SK 12K faker/providers/ssn/el_CY 12K faker/providers/ssn/en_GB 12K faker/providers/ssn/he_IL 12K faker/providers/ssn/lt_LT 12K faker/providers/ssn/hr_HR 12K faker/providers/ssn/de_AT 12K faker/providers/ssn/de_CH 12K faker/providers/ssn/no_NO 12K faker/providers/ssn 12K faker/providers/date_time/en_US 12K faker/providers/date_time/ar_EG 12K faker/providers/date_time/pl_PL 12K faker/providers/date_time/fr_FR 12K faker/providers/date_time/sl_SI 100K faker/providers/date_time/ar_AA 12K faker/providers/date_time/hu_HU 12K faker/providers/date_time/ko_KR 12K faker/providers/date_time/ru_RU 12K faker/providers/date_time/hy_AM 12K faker/providers/date_time/id_ID 12K faker/providers/date_time/hr_HR 140K faker/providers/date_time 48K faker/providers/lorem/en_US 24K faker/providers/lorem/zh_CN 28K 
faker/providers/lorem/el_GR 80K faker/providers/lorem/pl_PL 68K faker/providers/lorem/fr_FR 24K faker/providers/lorem/zh_TW 40K faker/providers/lorem/ar_AA 20K faker/providers/lorem/ja_JP 36K faker/providers/lorem/ru_RU 16K faker/providers/lorem/la 20K faker/providers/lorem/hy_AM 16K faker/providers/lorem/he_IL 20K faker/providers/lorem 12K faker/providers/company/sv_SE 28K faker/providers/company/it_IT 12K faker/providers/company/en_US 12K faker/providers/company/fr_CH 12K faker/providers/company/pt_PT 12K faker/providers/company/zh_CN 16K faker/providers/company/pl_PL 16K faker/providers/company/fr_FR 12K faker/providers/company/cs_CZ 12K faker/providers/company/sl_SI 16K faker/providers/company/pt_BR 12K faker/providers/company/fi_FI 16K faker/providers/company/zh_TW 12K faker/providers/company/hu_HU 12K faker/providers/company/ja_JP 12K faker/providers/company/bg_BG 32K faker/providers/company/es_MX 36K faker/providers/company/nl_NL 104K faker/providers/company/fa_IR 32K faker/providers/company/ko_KR 12K faker/providers/company/de_DE 12K faker/providers/company/ru_RU 12K faker/providers/company/sk_SK 32K faker/providers/company/hy_AM 12K faker/providers/company/id_ID 12K faker/providers/company/hr_HR 12K faker/providers/company/no_NO 36K faker/providers/company 12K faker/providers/isbn/en_US 28K faker/providers/isbn 12K faker/providers/geo/en_US 12K faker/providers/geo/el_GR 12K faker/providers/geo/de_AT 188K faker/providers/geo 8.0K faker/providers/profile/en_US 12K faker/providers/profile 8.0K faker/providers/barcode/en_US 12K faker/providers/barcode 52K faker/providers/person/uk_UA 56K faker/providers/person/sv_SE 24K faker/providers/person/it_IT 152K faker/providers/person/en_US 20K faker/providers/person/fr_CH 68K faker/providers/person/tr_TR 20K faker/providers/person/pt_PT 44K faker/providers/person/zh_CN 156K faker/providers/person/el_GR 164K faker/providers/person/pl_PL 36K faker/providers/person/fr_FR 24K faker/providers/person/cs_CZ 20K 
faker/providers/person/sl_SI 24K faker/providers/person/pt_BR 76K faker/providers/person/fi_FI 44K faker/providers/person/zh_TW 28K faker/providers/person/dk_DK 12K faker/providers/person/es_CA 60K faker/providers/person/ar_AA 36K faker/providers/person/et_EE 32K faker/providers/person/tw_GH 44K faker/providers/person/hu_HU 24K faker/providers/person/ja_JP 68K faker/providers/person/ka_GE 100K faker/providers/person/bg_BG 92K faker/providers/person/en 44K faker/providers/person/es_MX 12K faker/providers/person/ar_SA 92K faker/providers/person/ne_NP 72K faker/providers/person/nl_NL 24K faker/providers/person/fa_IR 56K faker/providers/person/es_ES 20K faker/providers/person/ko_KR 96K faker/providers/person/de_DE 36K faker/providers/person/ro_RO 80K faker/providers/person/ru_RU 24K faker/providers/person/lv_LV 76K faker/providers/person/hy_AM 72K faker/providers/person/th_TH 20K faker/providers/person/en_TH 56K faker/providers/person/en_GB 120K faker/providers/person/he_IL 96K faker/providers/person/en_NZ 16K faker/providers/person/lt_LT 44K faker/providers/person/id_ID 36K faker/providers/person/hr_HR 16K faker/providers/person/de_AT 84K faker/providers/person/de_CH 24K faker/providers/person/no_NO 20K faker/providers/person/hi_IN 12K faker/providers/person/ar_PS 16K faker/providers/person 36K faker/providers/color/uk_UA 12K faker/providers/color/en_US 24K faker/providers/color/fr_FR 32K faker/providers/color/pt_BR 12K faker/providers/color/hu_HU 16K faker/providers/color/ru_RU 28K faker/providers/color/hy_AM 24K faker/providers/color/hr_HR 24K faker/providers/color/ar_PS 24K faker/providers/color 8.0K faker/providers/misc/en_US 16K faker/providers/misc 40K faker/providers 88K faker 11M total
This is another option, but it's more work and more complicated than just having faker as an optional dependency that the user has to set up. I don't plan to use faker at all, I prefer simple fuzzy data generation, so I would still not use it...
Fuzzy data generation will be removed at some point in the future. That’s part of the reason why Faker became a required dependency, towards the goal of removing the fuzzy module entirely and replacing it with the more powerful Faker.
That is stated at the top of the fuzzy module documentation. Making Faker optional is taking a step back from that direction, because it encourages users to rely on the factory.fuzzy module and not install Faker.
I’m not thrilled by this change and am currently -0 on making it. I would like input from @jeffwidman or @rbarrois before proceeding in that direction.
Accidental closure.
Sorry, what I meant by fuzzy is: writing those fuzzy generators myself.
I understand your point of view. I think there is a lot of value in decoupling factories from "data generators" (fuzzy module, faker package) completely so that the user can choose how they want the data generated. Let's say in the future a better package than faker emerges, then the user should be free to use it. Composing those two unrelated concerns using an interface is just a more powerful design IMO. It would also allow users to customize the faker instance (setting its locale, etc.).
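As a rough illustration of the decoupling described above, the sketch below separates what a factory needs (a value of some kind) from who produces it. All names here (`ValueGenerator`, `SimpleGenerator`, `FakerAdapter`, `build_user`) are hypothetical and not part of factory_boy; the adapter only assumes it is handed an object exposing Faker-style provider methods such as `name()` and `random_int()`.

```python
import random
from typing import Any, Protocol


class ValueGenerator(Protocol):
    """Hypothetical interface: produce a value for a named kind of data."""

    def generate(self, kind: str) -> Any: ...


class SimpleGenerator:
    """Hand-rolled generator with no third-party dependency."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)

    def generate(self, kind: str) -> Any:
        if kind == "name":
            return self._rng.choice(["Alice", "Bob", "Carol"])
        if kind == "random_int":
            return self._rng.randint(0, 100)
        raise ValueError(f"unknown kind: {kind!r}")


class FakerAdapter:
    """Wraps a user-supplied Faker-like object.

    The caller builds the instance, so its locale stays under user control.
    """

    def __init__(self, fake: Any) -> None:
        self._fake = fake

    def generate(self, kind: str) -> Any:
        # Look up the provider method by name and call it without arguments.
        return getattr(self._fake, kind)()


def build_user(generator: ValueGenerator) -> dict:
    """Stand-in for a factory: it only depends on the interface."""
    return {
        "name": generator.generate("name"),
        "age": generator.generate("random_int"),
    }
```

With Faker installed, a user could pass something like `FakerAdapter(Faker("fr_FR"))` to `build_user`; without it, `SimpleGenerator()` works on its own.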
But mainly, 7.6M is just too big a package if some of your users don't use it.
Good point. Making Faker an optional dependency allows users not to install it when they don’t need it. I imagined that users would instead rely on fuzzy, but they may simply not need any fuzziness at all.
That works for me :+1:.
I agree that Faker is overkill for my project's needs. 5 different files of English names? Plus every other language? I suspect that only a small percentage of users need more than a predictable 20% of the functionality.
Isn't "Multi-language Lorem" almost a contradiction in terms?
I'll try to sum up the current situation:
- Faker takes (currently) 11MB of disk space
- For comparison, an empty virtualenv is already 14MB in size (due to the copies of pip and setuptools there).
What is the actual issue here? factory_boy is intended as a development tool. It's obviously better to reduce the size when possible, but are those 11MB a problem in development platforms?
If we want to reduce the space used by factory_boy, I see 2 options:
- Find a way to reduce faker's disk space (for instance by changing the way its provider data is stored, and maybe compressing it; or by making part of the contents optional);
- Provide a way not to install faker alongside factory_boy.
Currently, factory_boy has a hard dependency on faker; making that optional would require every user of factory_boy+faker to change from factory_boy to factory_boy[faker] in their dependencies... which I'd rather avoid.
Another option would be to publish two releases (factory_boy & factory_boy_minimal)?
All that is quite a lot of work for the project team and possibly for end users.
I would argue that removing faker is a strictly better design, applying the single responsibility principle. factory_boy already has a lot of value without using faker at all. This would add a lot of flexibility in choosing how fake values are generated.
But sure, this would definitely be a breaking change. I think this is achievable using setuptools extras, and would only require users to change the package they install without changing any code.
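For reference, the factory_boy[faker] spelling mentioned above relies on setuptools extras. A minimal sketch of what such a declaration might look like follows; the values are illustrative only and are not taken from factory_boy's actual packaging metadata.

```python
# Hypothetical setup.py fragment; not factory_boy's real packaging metadata.

# Faker would no longer be pulled in unconditionally.
INSTALL_REQUIRES = []

EXTRAS_REQUIRE = {
    # Opt-in group, installed with: pip install "factory_boy[faker]"
    "faker": ["Faker"],
}

# In a real setup.py these two values would be passed to setuptools.setup()
# as the install_requires= and extras_require= keyword arguments.
```

Existing users who rely on Faker would then have to change their requirement from factory_boy to factory_boy[faker], which is the migration cost raised in the discussion.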
I also would strongly prefer to avoid anything like factory_boy & factory_boy_minimal or using extras etc... that's a headache for everyone unless we really need it.
So for me, I see it as either completely drop faker or stick with it...
The file size is a non-issue IMO... as @rbarrois noted this is a dev library, so having an extra ~11MB is trivial.
I do understand the rationale for single responsibility, and in fact I initially leaned that way myself several years ago. But having used it, I think the developer ergonomics are much better with the dependency included... because otherwise we force those users who want deep integration between Faker and factory_boy to rewire all these things together. I myself used to do this before we included Faker, and it was annoying.
In fact, including faker is arguably moving toward single responsibility because it makes it easier to drop all the fuzzy stuff that was painful to maintain/extend.
Right now, we are essentially in a "batteries included" + "usage of batteries is optional" world so that those who want to hand-roll their own custom stuff can do that, and those who just want to use some syntactic sugar and not rewire stuff have that option as well. We are not forcing anyone to use these batteries, the only forced thing is the download, which as noted above is trivial for a development-focused library.
So I'm afraid I don't see the point of doing this.
Note for the sake of discussion that not including Faker and having to rewire all these things together are two different things.
I did a prototype of this change where the fuzzy generators used Faker if it was available or just threw an exception with a message suggesting to install the dependency if not. Unfortunately, I lost it when I accidentally removed my local factory_boy clone.
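The lost prototype is not available, so the following is only a guess at the general shape of that approach: import the optional package lazily and, if it is missing, raise an error that tells the user what to install. The helper names `load_optional` and `fake_first_name` are made up for this sketch.

```python
import importlib


def load_optional(module_name: str, pip_name: str):
    """Import an optional dependency, or fail with an installation hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this generator; "
            f"install it with: pip install {pip_name}"
        ) from exc


def fake_first_name() -> str:
    """Hypothetical generator that needs Faker only when it is called."""
    faker = load_optional("faker", "Faker")
    return faker.Faker().first_name()
```

Because the import happens inside the generator, users who never call a Faker-backed generator never need the package installed.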
The problem
faker uses 7.6M of disk as of writing. For users who aren't using its features, this is a pretty heavy cost. It includes a generator for license plates, SSN, ISBN, etc...
Proposed solution
Consider removing the faker dependency, allowing users to plug it in if they need it.
This solution would also allow users to control how the faker factory is used and specify its locale, for instance.