jrnd-io / jr

JR: streaming quality random data from the command line
https://jrnd.io
MIT License
130 stars, 26 forks

docs: Template localizations? #104

Open al-x opened 1 year ago

al-x commented 1 year ago

Can you explain how to localize the templates?

I read your blog post which seems relevant, but I'm afraid I couldn't figure out how to implement such a template localization.

For example, the user template defaults to the address style:

  "address": "{{city}}, {{street}} {{building 2}}, {{zip}}",

but I would like to use the United States address style of:

  "address": "{{building 5}} {{street}}, {{city}}, {{state}} {{zip}}",

I would contribute address localizations for US, MX, and CA if I only knew how.

ugol commented 1 year ago

Sorry Alex, I missed this. I'll be back on jr this week; I needed to set up a new dev machine and some other stuff. US is the default localisation for jr, so you don't need to do anything special. I have seen (and approved) your commits. I generated some of the data with ChatGPT, so there are some small errors here and there. Question: I see you moved Philadelphia. Did you check the corresponding zip code? It must be in the same position in its file. Regarding the address style, let me check what can be done.

al-x commented 1 year ago

I did not know the rank of the zip codes should be synchronized with the rank of the cities. I'll review the 2 files and attempt to order the zip codes to correspond with the city order (alphabetical).
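A minimal sketch of that reordering, assuming the cities and zip codes have already been read into two lists whose entries match by position. The function name `sort_aligned` and the sample values are hypothetical and not taken from the jr data files. The idea is to sort the pairs rather than each list on its own, so a zip code never gets separated from its city:

```python
def sort_aligned(cities, zips):
    """Sort cities alphabetically, moving each zip code along with its city."""
    if len(cities) != len(zips):
        raise ValueError(
            f"length mismatch: {len(cities)} cities vs {len(zips)} zips"
        )
    # Pair each city with the zip at the same index, then sort the pairs by city name.
    pairs = sorted(zip(cities, zips), key=lambda pair: pair[0].lower())
    sorted_cities = [city for city, _ in pairs]
    sorted_zips = [zip_code for _, zip_code in pairs]
    return sorted_cities, sorted_zips


# Illustrative values only, not taken from the jr data files.
cities, zips = sort_aligned(["Philadelphia", "Boston"], ["19104", "02108"])
```

Sorting either file independently would break the positional link, which is why the length check comes first.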

ugol commented 1 year ago

For MX and CA, essentially you need to create a dir in templates/data. You can start by copying, for example, the uk one into new mx and ca dirs and go from there. The files should be self-explanatory; the only gotcha is that city, zip and phone must match, i.e. a city must have its corresponding zip and phone regex in the same position.
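The copy step could be scripted roughly as follows. This is only a sketch: `scaffold_locale` is a hypothetical helper, not part of jr, and it assumes a locale is nothing more than a directory of plain data files under templates/data, as described above.

```python
import shutil
from pathlib import Path


def scaffold_locale(data_dir, source_locale, new_locale):
    """Copy an existing locale directory as the starting point for a new one."""
    src = Path(data_dir) / source_locale
    dst = Path(data_dir) / new_locale
    if not src.is_dir():
        raise FileNotFoundError(f"no such locale directory: {src}")
    if dst.exists():
        # Refuse to overwrite a locale someone has already started editing.
        raise FileExistsError(f"locale already exists: {dst}")
    shutil.copytree(src, dst)
    # Return the copied file names so the caller can see what needs editing.
    return sorted(p.name for p in dst.iterdir())
```

From the repository root, `scaffold_locale("templates/data", "uk", "mx")` would create the mx directory; the copied files still contain UK data and have to be replaced entry by entry.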

al-x commented 1 year ago

The first thing I notice is that there are 4 more zip codes than there are cities. How does that work with position/rank matching?

ugol commented 1 year ago

Mmm, yep, they are definitely wrong then. I did the Italian localisation and it should be 99.99% right; the others are AI-generated and can't be trusted atm. It's not easy to get the right zip/phone pattern for every city in the world: having the same number of lines in each file is the bare minimum :)
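That bare-minimum check is easy to automate. The sketch below assumes one entry per line and files named `city`, `zip` and `phone` inside a locale directory; the `phone` file name and both function names are assumptions for illustration, not confirmed jr conventions.

```python
from pathlib import Path


def count_entries(path):
    """Count the non-blank lines in a data file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def check_same_length(locale_dir, names=("city", "zip", "phone")):
    """Return per-file entry counts and whether they all agree."""
    counts = {name: count_entries(Path(locale_dir) / name) for name in names}
    all_equal = len(set(counts.values())) == 1
    return counts, all_equal
```

Equal counts only show that the files could line up, not that each row is correct, so a pass here still needs a manual check of the values.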

al-x commented 1 year ago

I'm about halfway through looking up the zip codes, and the organization appears to be one zip code per state capital, sorted by the alphabetic order of the full name of the state (and not the 2-letter postal abbreviation of the state - good work US Postal Service in not making those 2 sets homomorphic...). So that makes sense to me.

Still checking for data quality.

al-x commented 1 year ago

OK, I've confirmed that the 50 zip code entries in the zip file correspond one-to-one with the state capitals, and that they are currently ordered alphabetically by the full (long) state names. I'm not sure if that was your intent, but it is out of sync with the current state of the city file, which has different cities (46 instead of 50), only 8 of which are state capitals.

Let me know what you want to do and I can adjust the files to match your intention.
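For reference, a comparison like this can be reproduced with a few set operations. The sketch assumes both lists are available as plain city names; `compare_city_lists` and the sample names are hypothetical, and matching on the name alone would conflate same-named cities in different states.

```python
def compare_city_lists(cities, capitals):
    """Report which names appear in both lists and which appear in only one."""

    def normalise(names):
        # Ignore case, surrounding whitespace and blank entries.
        return {name.strip().lower() for name in names if name.strip()}

    city_set = normalise(cities)
    capital_set = normalise(capitals)
    return {
        "in_both": sorted(city_set & capital_set),
        "only_in_cities": sorted(city_set - capital_set),
        "only_in_capitals": sorted(capital_set - city_set),
    }


# Illustrative input, not the actual contents of the jr files.
report = compare_city_lists(
    ["Boston", "Chicago", "Phoenix"],
    ["Albany", "Boston", "Phoenix"],
)
```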

al-x commented 1 year ago

BTW, city looks like the 46 most populous US cities, in alphabetical order.

al-x commented 1 year ago

It's probably waaay overkill for this project, but the zip-lookup JS project organized the full 2015 US Zip Code DB (including all 50 states and 5 US territories) in 100 8kb files.

al-x commented 1 year ago

The order of state_short matches the state file's alphabetical order, so you've already dealt with that issue.

al-x commented 1 year ago

The simplest fix would be to replace the populous cities in city with the state capital cities.

ugol commented 1 year ago

Yep, but the "right" cities to use for a random generator are the others, the most populous ones. So the zips should be updated to match those, and the city file should probably contain the capitals too.
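One way to get there is to keep city and zip together in a mapping while editing, and only split them into two position-aligned lists at the end. This is a sketch under that assumption; `merge_city_data` is a hypothetical helper and the zip values are illustrative, not verified entries for the jr files.

```python
def merge_city_data(populous, capitals):
    """Merge two city -> zip mappings into aligned, alphabetised lists."""
    merged = dict(capitals)
    # A city present in both mappings is kept once; the populous entry wins.
    merged.update(populous)
    names = sorted(merged, key=str.lower)
    zips = [merged[name] for name in names]
    return names, zips


# Illustrative values only.
names, zips = merge_city_data(
    {"Chicago": "60601", "Boston": "02108"},
    {"Albany": "12207", "Boston": "02108"},
)
```

Writing `names` and `zips` out line by line would then give a city file and a zip file that match by position by construction.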