lk-geimfari / mimesis

Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
https://mimesis.name
MIT License

Need support of other locales. #3

Closed lk-geimfari closed 7 years ago

lk-geimfari commented 8 years ago

We need to add:

Feel free to send us a request for any other language that is not in the list.

You should use: locale_template

lokhi commented 8 years ago

I can help you with French if you want.

lk-geimfari commented 8 years ago

@lokhi Of course! If something is missing in the fr folder, please add it. If something in the data is wrong, please fix it. Thank you very much!

lokhi commented 8 years ago

Thanks I will check that ;)

auyer commented 8 years ago

I can help with Brazilian Portuguese!

lk-geimfari commented 8 years ago

@auyer That would be awesome! I added a folder for the pt-br locale. Please check what data we need. Thank you for your contribution!

Hellowlol commented 8 years ago

Much of the per-country data can be found on Wikipedia, like lists of cities, names, etc. It would be nice to use a script for this.

lk-geimfari commented 8 years ago

@Hellowlol Yeah, I use BeautifulSoup for grabbing some data from other sites. For grabbing from Wikipedia I've used a Chrome extension that helps with copying a column from a table.
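
A script for the column-copying step described above could look like this minimal sketch. It swaps in Python's standard-library `html.parser` instead of BeautifulSoup so it runs without extra packages. The `ColumnExtractor` class, the `extract_column` helper, and the sample table are hypothetical illustrations, not code from this project, and nested tables are not handled.

```python
from html.parser import HTMLParser


class ColumnExtractor(HTMLParser):
    """Collect the text of one column (0-based) from <td> cells."""

    def __init__(self, column):
        super().__init__()
        self.column = column
        self.values = []
        self._cell_index = -1   # index of the current cell within its row
        self._in_target = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cell_index = -1
        elif tag in ("td", "th"):
            self._cell_index += 1
            # Header cells (<th>) are counted but not collected.
            self._in_target = tag == "td" and self._cell_index == self.column
            self._buffer = []

    def handle_data(self, data):
        if self._in_target:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_target:
            text = "".join(self._buffer).strip()
            if text:
                self.values.append(text)
            self._in_target = False


def extract_column(html, column):
    parser = ColumnExtractor(column)
    parser.feed(html)
    return parser.values


# Hypothetical sample table, shaped like a Wikipedia list of cities.
SAMPLE_HTML = """
<table>
<tr><th>Rank</th><th>City</th></tr>
<tr><td>1</td><td><a href="#">São Paulo</a></td></tr>
<tr><td>2</td><td>Rio de Janeiro</td></tr>
</table>
"""

cities = extract_column(SAMPLE_HTML, 1)
```

The resulting list could then be pasted or dumped into the locale's city data once it has been reviewed by hand.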

jackmcmorrow commented 8 years ago

Hey @auyer! I want to help with pt-br too! Hit me up and we can work on it together!

jackmcmorrow commented 8 years ago

Here is a list of Brazilian cities with more than 100,000 inhabitants. That's 300+ names in a table.

jackmcmorrow commented 8 years ago

Oops, forgot the link: https://pt.m.wikipedia.org/wiki/Lista_de_municípios_do_Brasil_acima_de_cem_mil_habitantes

auyer commented 8 years ago

@jackmcmorrow Sure! Thanks! I'll push my changes to my fork. You can make PRs there! https://github.com/auyer/church

yebrahim commented 8 years ago

Are you looking for RTL languages too?

lk-geimfari commented 8 years ago

@yebrahim Yes, but I don't know how to work with that kind of data. So if you know how to add data for an RTL locale correctly, I support your decision.

theomessin commented 7 years ago

Want to help with Greek!

lk-geimfari commented 7 years ago

@theomessin Perfect! Please let me know when you can do it. I can add the needed folder and JSON files.

theomessin commented 7 years ago

@lk-geimfari If you could just add the folder and json files that'd be great! I'll start as soon as possible 😄

lk-geimfari commented 7 years ago

@theomessin Oh, okay. I'll add everything needed tonight. Σας ευχαριστούμε (thank you)!

lk-geimfari commented 7 years ago

@theomessin I added the folder and files for Greek.

modkaffes commented 7 years ago

@theomessin @lk-geimfari I'd like to help with Greek too! I've started work here. @theomessin, want to split files?

lk-geimfari commented 7 years ago

@modkaffes It's great! Thank you!

lk-geimfari commented 7 years ago

When you add a new language, take into account #47

cleac commented 7 years ago

@lk-geimfari Hi, I can help with the Ukrainian language :smile: I currently have some work to do, but in a week I'll have plenty of time.

lk-geimfari commented 7 years ago

@cleac It's great! Thank you!

redus commented 7 years ago

Need help with CJK languages? I could provide some help as well.

lk-geimfari commented 7 years ago

@redus If you can add complete data for one locale of your choice, that would be really awesome! Thank you!

snus-kin commented 7 years ago

I can help with English (GB). I'll give it a look in a while.

lk-geimfari commented 7 years ago

@Uncleleech It would be great. Thank you!

snus-kin commented 7 years ago

@lk-geimfari Quick question: does cities include everything (towns, villages) or just cities? I see that en-us includes everything, but in the UK the definition of a city is pretty specific.

Can I just add another list of 'towns', or will this break stuff?

EDIT: Further, what are called 'states' in the US are somewhat equivalent to 'counties' here. Not sure what to do here either.

lk-geimfari commented 7 years ago

@Uncleleech Do whatever works best for the UK: add towns, cities and villages to the single city section, as in other locales.

As for the 'states': they are called "subjects" in Russia and "provinces" in some other countries. We use the name 'states' for all locales in the JSON, so that the data is easier to work with.
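
To illustrate why one shared name helps, here is a minimal sketch. The `LOCALE_DATA` layout, the exact key spelling `"states"`, the sample values, and the `random_state` helper are all assumptions made for illustration; they are not the project's actual files or API.

```python
import json
import random

# Hypothetical per-locale JSON snippets. Every locale stores its regions
# under the same "states" key, whatever they are called locally.
LOCALE_DATA = {
    "en": json.loads('{"states": ["Texas", "Ohio", "Oregon"]}'),
    "en-gb": json.loads('{"states": ["Kent", "Essex", "Cornwall"]}'),
    "ru": json.loads('{"states": ["Татарстан", "Башкортостан"]}'),
}


def random_state(locale, rng=random):
    """Pick a region for any locale through the one shared key."""
    return rng.choice(LOCALE_DATA[locale]["states"])


county = random_state("en-gb")  # a UK county, read from "states"
```

Because the lookup never branches on the locale, adding a new language only means adding a new data file with the same keys.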

jasonwaiting-dev commented 7 years ago

I can help with Japanese & Chinese. I have a quick question: there are full-width & half-width characters in Japanese. https://en.wikipedia.org/wiki/Half-width_kana How should I deal with different character sets in elizabeth? I will create the full-width version first.

lk-geimfari commented 7 years ago

@jasonwaiting-dev You can use ko (Korean) as an example. Unfortunately, I have never worked with Chinese or Japanese.

redus commented 7 years ago

@jasonwaiting-dev I was unaware there was a difference between the two in Japanese and have no idea how often it's used.

I think the best way is to provide just the full-width version first. If the need arises, we can make a separate locale like ja-half, unless there is an algorithm for switching between the two, in which case we should make it available in specific_providers.py.

jasonwaiting-dev commented 7 years ago

@redus It's kind of strange because some old systems in Japan, especially banking systems, still use half-width characters in their databases, e.g. アイウエオ (full width) becomes ｱｲｳｴｵ (half width).

I agree that we should just provide the full-width version at the moment. I am creating the jp version now.

redus commented 7 years ago

@jasonwaiting-dev Are you making alphanumerics full-width as well? I don't know whether full-width alphanumeric characters are used or not (ＡＢＣＤ０１２３). I believe a simple dictionary from full-width chars to half-width chars should do it, without any separate locale.

jasonwaiting-dev commented 7 years ago

@redus Sorry for the late reply. Both full-width and half-width alphanumeric characters are used in Japan. I notice that half-width alphanumerics combined with full-width Japanese characters are more popular now, but I can still find a number of webpages using full-width characters only. I think I'll stick to half-width alphanumerics with Japanese characters atm.

> believe simple dictionary from full width char to half width char should do it, without any separate locale.

Yes, we can discuss how to deal with this later. I think we only have this problem in the Japanese locale.
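
The dictionary idea can be sketched roughly as follows. This is an illustration only: the function names are hypothetical, the kana table is deliberately partial, and nothing here is part of elizabeth. It relies on two facts about Unicode: full-width ASCII variants (U+FF01 to U+FF5E) sit exactly 0xFEE0 above their ASCII counterparts, and NFKC normalization folds half-width kana to full width and full-width alphanumerics to ASCII.

```python
import unicodedata

# Full-width ASCII variants U+FF01..U+FF5E map to U+0021..U+007E.
FULL_TO_HALF_ASCII = {code: code - 0xFEE0 for code in range(0xFF01, 0xFF5F)}

# Partial, illustrative kana table. Voiced kana expand to two half-width
# characters (base kana plus the half-width voiced sound mark).
FULL_TO_HALF_KANA = str.maketrans({
    "ア": "\uFF71",
    "イ": "\uFF72",
    "ウ": "\uFF73",
    "エ": "\uFF74",
    "オ": "\uFF75",
    "ガ": "\uFF76\uFF9E",
})


def to_half_width_alnum(text):
    """Convert full-width letters, digits and punctuation to ASCII."""
    return text.translate(FULL_TO_HALF_ASCII)


def to_half_width_kana(text):
    """Convert the katakana covered by the partial table to half width."""
    return text.translate(FULL_TO_HALF_KANA)


def normalize_width(text):
    """Half-width kana -> full width, full-width alphanumerics -> ASCII."""
    return unicodedata.normalize("NFKC", text)
```

`normalize_width` yields the mix mentioned above (full-width Japanese characters with half-width alphanumerics), while `to_half_width_kana` covers the legacy-system direction and would need a complete table before real use.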

jlwt90 commented 7 years ago

Hi @lk-geimfari, I can help with the Chinese translation. There are a number of variations of the Chinese language (https://en.wikipedia.org/wiki/Chinese_as_an_official_language).

I think I'll start with the People's Republic of China version.

lk-geimfari commented 7 years ago

@jlwt90 That's great! Thank you!

el commented 7 years ago

I made a PR for Turkish #122

PaulWaltersDev commented 7 years ago

Hello all, would you mind if I created a locale for English (AU)? It would be largely similar to English (GB) except for place and state names, postcodes, some business details, a few unique swear words and Aus word spellings.

I'm also French-speaking so can help in that area if you wish.

lk-geimfari commented 7 years ago

@PaulWaltersDev We already have en-au and fr. Can you check the correctness of fr and update en-au?

PaulWaltersDev commented 7 years ago

@lk-geimfari

Regarding en-au, I had a look at elizabeth/data (master branch) and saw only JSON files for en and en-gb. Neither of them has Australia-specific content. Where can I find it?

I am happy to check en-au and fr. Will let you know when it's done. :-)

lk-geimfari commented 7 years ago

@PaulWaltersDev Oh, sorry, I forgot to remove en-au from the .gitignore file. Please check it now.

PaulWaltersDev commented 7 years ago

@lk-geimfari

No probs. Got the latest version now and made additions to en-au. Will test, commit and push for your perusal.

I'm English so might also have a go at checking en-gb for any errors.

lk-geimfari commented 7 years ago

@PaulWaltersDev Okay. Thank you!

PaulWaltersDev commented 7 years ago

@lk-geimfari Just committed updates to en-au (all except datetime.json which is perfect as it is). Please see commit comments and let me know what you think.

lk-geimfari commented 7 years ago

@PaulWaltersDev Of course, but where can I look at your commits? You have only two repositories, and Elizabeth is not among them.

PaulWaltersDev commented 7 years ago

@lk-geimfari Sorry, I haven't had a lot of experience with GitHub and it shows. You can see the Elizabeth repository and changes now. I submitted a pull request with details included.

uvegla commented 7 years ago

Hi @lk-geimfari, I would like to contribute to the Hungarian locale. If that is okay, I will start with a pull request for some basic missing things like the alphabet, colors, etc., then move on to chemical elements and so on.

lk-geimfari commented 7 years ago

@uvegla That would be awesome! I added the current version of Hungarian myself. I hope it is not too awful. Please check all of it. Thanks!

zelds commented 7 years ago

Hi. Do you still need help with Ukrainian?