jshttp / accepts

Higher-level content negotiation
MIT License

possibility to get region from accept language #23

Open jimmywarting opened 2 years ago

jimmywarting commented 2 years ago

Hi 👋 I'm using Express, and I'm trying to figure out how to get the preferred region code (the two-letter country code). Do you have any recommendations for dealing with this?

I almost feel like the negotiator dependency should have some kind of feature to retrieve it.

dougwilson commented 2 years ago

So when a web browser sends "en-us", you are referring to the "us" part, is that right? If you call .languages() it should return the list that the web browser sends. If this is not what you are looking for, I think I need some more clarification about what you're trying to get. Maybe show the http request headers as an example and then note what information from the headers you are looking to get at.

jimmywarting commented 2 years ago

Correct. Here is the default header that Chrome currently sends on every request:

accept-language: sv-SE,sv;q=0.9,en-SE;q=0.8,en;q=0.7,en-US;q=0.6

The bits and pieces I'm interested in are SE and US. SE is also there twice, so it would be neat if I could get a unique list of regions somehow.


Currently I use a workaround based on req.acceptsLanguages(), but it feels a bit annoying to have to map over the list, split each tag at "-", filter on truthy values, keep only the unique values using new Set, and then double-check the result against a local list of country codes.

// something like:
// the ?. guards against tags without a second part (e.g. 'sv'), where split('-')[1] is undefined
const prefers = [...new Set(req.acceptsLanguages().map(lang => lang.split('-')[1]?.toUpperCase()).filter(Boolean))]
const allowed = ['SE', 'NO', 'GB', 'DK', 'US', ...rest_of_all_the_world]
const preferred = prefers.find(pref => allowed.includes(pref)) // SE

It would be neat if the language and the region were two separate methods/things.

I wish something like this existed:

req.region(['SE'])
const regions = req.regions() // ['SE', 'US']
const preferred = regions.find(region => allowed.includes(region)) // SE

☝️ feature request maybe?
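
To make the request concrete, here is a rough sketch of what such a helper could do with the raw header value. The name regionsFromAcceptLanguage is made up for illustration and is not part of accepts, negotiator, or Express. It also naively assumes the region is always the second subtag and always two letters, which does not hold for every language tag.

```javascript
// Hypothetical helper, NOT part of accepts or negotiator.
// Returns the unique two-letter region codes found in an Accept-Language
// header value, ordered by descending q-value.
function regionsFromAcceptLanguage(header) {
  const entries = header
    .split(',')
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(';')
      // The quality defaults to 1 when no q parameter is present.
      let q = 1
      for (const param of params) {
        const [key, value] = param.trim().split('=')
        if (key === 'q') q = Number.parseFloat(value)
      }
      return { tag: tag.trim(), q, index }
    })
    // Drop empty tags, q=0 entries, and entries with an unparsable q.
    .filter(entry => entry.tag && entry.q > 0)
    // Higher q first; ties keep their order from the header.
    .sort((a, b) => b.q - a.q || a.index - b.index)

  const regions = new Set()
  for (const { tag } of entries) {
    // Naive assumption: the region is the second subtag and has two letters.
    const region = tag.split('-')[1]
    if (region && /^[a-z]{2}$/i.test(region)) regions.add(region.toUpperCase())
  }
  return [...regions]
}

const header = 'sv-SE,sv;q=0.9,en-SE;q=0.8,en;q=0.7,en-US;q=0.6'
const regions = regionsFromAcceptLanguage(header)
const allowed = new Set(['SE', 'NO', 'GB', 'DK', 'US'])
const preferred = regions.find(region => allowed.has(region))
console.log(regions, preferred)
```

With the Chrome header from above this gives a de-duplicated list in preference order, so picking the first allowed region is a single find call.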

dougwilson commented 2 years ago

Gotcha. I think that should be possible. Note that the second thing after the first dash is not always the region, and the region can be a numeric code. It looks like from the spec that in order to know the region you need to have the list for certain ambiguous situations. So it seems we probably first need to make a module that has all the ISO data needed (if there is not already one), and then from there we can parse the language tag to get the region data 👍
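
As an illustration of why the second subtag is not always the region: in zh-Hant-TW the second subtag is a script, and in es-419 the region is a three-digit UN M.49 code. A purely syntactic reading of a tag could look roughly like the sketch below. extractRegion is a hypothetical name, it follows the subtag lengths from the BCP 47 grammar only, and it does not check the result against any registry or handle grandfathered and private-use tags specially.

```javascript
// Hypothetical helper: a purely syntactic reading of a BCP 47 language tag.
// It does not consult the IANA registry, so it cannot tell whether the
// returned region code is actually assigned.
function extractRegion(tag) {
  const subtags = tag.split('-')
  let i = 0

  // Primary language subtag: 2 to 8 letters.
  if (!/^[a-z]{2,8}$/i.test(subtags[i] || '')) return null
  const primaryLength = subtags[i].length
  i++

  // Up to three extended language subtags (3 letters each),
  // only allowed after a 2 or 3 letter primary language.
  if (primaryLength <= 3) {
    for (let n = 0; n < 3 && /^[a-z]{3}$/i.test(subtags[i] || ''); n++) i++
  }

  // Optional script subtag: 4 letters, e.g. "Hant".
  if (/^[a-z]{4}$/i.test(subtags[i] || '')) i++

  // Optional region subtag: 2 letters (ISO 3166-1) or 3 digits (UN M.49).
  const candidate = subtags[i] || ''
  if (/^([a-z]{2}|[0-9]{3})$/i.test(candidate)) return candidate.toUpperCase()
  return null
}

console.log(extractRegion('en-US'))      // region after the language
console.log(extractRegion('zh-Hant-TW')) // script subtag is skipped
console.log(extractRegion('es-419'))     // numeric region code
console.log(extractRegion('sv'))         // no region present
```

Validating the extracted code against real ISO or IANA data would still need a data module of the kind described above.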

jimmywarting commented 2 years ago

Sounds good 👍

If you don't have a region database/package, here is one taken from google-libphonenumber, if it is of any help:

["AC","AD","AE","AF","AG","AI","AL","AM","AO","AR","AS","AT","AU","AW","AX","AZ","BA","BB","BD","BE","BF","BG","BH","BI","BJ","BL","BM","BN","BO","BQ","BR","BS","BT","BW","BY","BZ","CA","CC","CD","CF","CG","CH","CI","CK","CL","CM","CN","CO","CR","CU","CV","CW","CX","CY","CZ","DE","DJ","DK","DM","DO","DZ","EC","EE","EG","EH","ER","ES","ET","FI","FJ","FK","FM","FO","FR","GA","GB","GD","GE","GF","GG","GH","GI","GL","GM","GN","GP","GQ","GR","GT","GU","GW","GY","HK","HN","HR","HT","HU","ID","IE","IL","IM","IN","IO","IQ","IR","IS","IT","JE","JM","JO","JP","KE","KG","KH","KI","KM","KN","KP","KR","KW","KY","KZ","LA","LB","LC","LI","LK","LR","LS","LT","LU","LV","LY","MA","MC","MD","ME","MF","MG","MH","MK","ML","MM","MN","MO","MP","MQ","MR","MS","MT","MU","MV","MW","MX","MY","MZ","NA","NC","NE","NF","NG","NI","NL","NO","NP","NR","NU","NZ","OM","PA","PE","PF","PG","PH","PK","PL","PM","PR","PS","PT","PW","PY","QA","RE","RO","RS","RU","RW","SA","SB","SC","SD","SE","SG","SH","SI","SJ","SK","SL","SM","SN","SO","SR","SS","ST","SV","SX","SY","SZ","TA","TC","TD","TG","TH","TJ","TK","TL","TM","TN","TO","TR","TT","TV","TW","TZ","UA","UG","US","UY","UZ","VA","VC","VE","VG","VI","VN","VU","WF","WS","XK","YE","YT","ZA","ZM","ZW"]

I also found this earlier: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry (it has 304 regions), which also led to this: https://github.com/mattcg/language-subtag-registry

Looking at ☝️, it seems to me that it can also figure out the region from the language when the language is country-specific. E.g. sv (Swedish): it is only spoken in Sweden (ref), so if the language in the Accept-Language header is just sv, you can be sure the region must also be SE, for Sweden.

Then I also found this: https://github.com/mattcg/language-tags <- going to start using this today
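
The fallback idea (use a default country when the tag carries no region) could be sketched as below. Both regionOrDefault and the defaultRegions table are hypothetical names, and the single sv entry is only the example from this comment, not data taken from the registry. Such a default is a guess rather than a certainty: Swedish, for instance, is also an official language in Finland.

```javascript
// Hypothetical default-region table. The entry is the example from this
// thread and an assumption, not data taken from the IANA registry.
const defaultRegions = new Map([['sv', 'SE']])

// Hypothetical helper: prefer an explicit two-letter region in the tag,
// otherwise fall back to the default region for the language, if any.
function regionOrDefault(tag, defaults = defaultRegions) {
  const [language, second] = tag.split('-')
  if (second && /^[a-z]{2}$/i.test(second)) return second.toUpperCase()
  return defaults.get(language.toLowerCase()) ?? null
}

console.log(regionOrDefault('sv'))    // falls back to the table entry
console.log(regionOrDefault('sv-FI')) // explicit region wins
console.log(regionOrDefault('en'))    // no region and no default
```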