omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License

How to divide individual ancestry background based on UKB datafield 21000? #193

Closed Y-Isaac closed 3 months ago

Y-Isaac commented 4 months ago

Hi,

According to the PolyPred manuscript, you estimated PRS performance for four ethnic backgrounds: Non British EUR, South Asian, East Asian, and African. The ancestry assignment is described in the article as "We used self-reported ancestry based on UK Biobank data field 21000 (Ethnic background). We considered Irish ancestry as a non British European ancestry".

I would like to follow the same procedure. I have access to individual-level UKB data, but I am not sure how to assign these groups based on field 21000. Could you give me some guidance? Specifically, I have the following four questions:

1. In your results, Non British EUR includes 41800 people, but 1002 "Irish" and 1003 "Any other white background" together include only 30000 people. Which other options should be classified as "Non British EUR"?

2. Does South Asian include Indian, Pakistani, and Bangladeshi?

3. "African" includes only 3000+ people in field 21000, but your results have 6500+ people.

4. Does East Asian include only Chinese?

I hope my question won't cause you any trouble. I am very grateful for your help!

Best regards, Issac

omerwe commented 3 months ago

Hi, here's the code I used. I hope it answers your questions; please let me know if not.

The first block of code maps UKB field 21000 codes to self-reported ancestry labels, and the second groups these labels into broad ancestry groups.

# Map UK Biobank field 21000 codes to self-reported ancestry labels
ethnic_dict = {}
ethnic_dict[-999] = 'Unknown'
ethnic_dict[1] = 'White'
ethnic_dict[1001] = 'British'
ethnic_dict[1002] = 'Irish'
ethnic_dict[1003] = 'Any other white background'
ethnic_dict[2] = 'Mixed'
ethnic_dict[2001] = 'White and Black Caribbean'
ethnic_dict[2002] = 'White and Black African'
ethnic_dict[2003] = 'White and Asian'
ethnic_dict[2004] = 'Any other mixed background'
ethnic_dict[3] = 'Asian or Asian British'
ethnic_dict[3001] = 'Indian'
ethnic_dict[3002] = 'Pakistani'
ethnic_dict[3003] = 'Bangladeshi'
ethnic_dict[3004] = 'Any other Asian background'
ethnic_dict[4] = 'Black or Black British'
ethnic_dict[4001] = 'Caribbean'
ethnic_dict[4002] = 'African'
ethnic_dict[4003] = 'Any other Black background'
ethnic_dict[5] = 'Chinese'
ethnic_dict[6] = 'Other'

# Group self-reported ancestry labels into broad ancestry groups
eth_dict = {}
eth_dict['Any other white background'] = 'European'
eth_dict['Indian'] = 'South-Asian'
eth_dict['Pakistani'] = 'South-Asian'
eth_dict['Bangladeshi'] = 'South-Asian'
eth_dict['Any other Asian background'] = 'South-Asian'
eth_dict['Asian or Asian British'] = 'South-Asian'
eth_dict['Caribbean'] = 'African'
eth_dict['Black or Black British'] = 'African'
eth_dict['Any other Black background'] = 'African'
eth_dict['White and Asian'] = 'Mixed'
eth_dict['White and Black African'] = 'Mixed'
eth_dict['White and Black Caribbean'] = 'Mixed'
eth_dict['Any other mixed background'] = 'Mixed'
eth_dict['Irish'] = 'European'
eth_dict['British'] = 'British'
eth_dict['White'] = 'European'
eth_dict['NS-British'] = 'European'  # no code in ethnic_dict produces this label
eth_dict['Chinese'] = 'East-Asian'

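
To show how the two mappings chain together, here is a minimal sketch that takes a field 21000 code, looks up its label, and then looks up the broad group. The names `ETHNIC_LABELS`, `BROAD_GROUPS`, `broad_ancestry`, and the toy `codes` list are hypothetical and only for illustration; the dictionary contents are copied from the code above. Note that the posted second mapping has no entry for the labels 'African', 'Mixed', 'Other', or 'Unknown', so this sketch returns `None` for those rather than guessing how they were handled in the original analysis.

```python
from collections import Counter

# Field 21000 code -> self-reported label (copied from the first mapping above).
ETHNIC_LABELS = {
    -999: "Unknown",
    1: "White", 1001: "British", 1002: "Irish",
    1003: "Any other white background",
    2: "Mixed", 2001: "White and Black Caribbean",
    2002: "White and Black African", 2003: "White and Asian",
    2004: "Any other mixed background",
    3: "Asian or Asian British", 3001: "Indian", 3002: "Pakistani",
    3003: "Bangladeshi", 3004: "Any other Asian background",
    4: "Black or Black British", 4001: "Caribbean", 4002: "African",
    4003: "Any other Black background",
    5: "Chinese", 6: "Other",
}

# Label -> broad ancestry group (copied from the second mapping above).
BROAD_GROUPS = {
    "Any other white background": "European",
    "Indian": "South-Asian", "Pakistani": "South-Asian",
    "Bangladeshi": "South-Asian",
    "Any other Asian background": "South-Asian",
    "Asian or Asian British": "South-Asian",
    "Caribbean": "African", "Black or Black British": "African",
    "Any other Black background": "African",
    "White and Asian": "Mixed", "White and Black African": "Mixed",
    "White and Black Caribbean": "Mixed",
    "Any other mixed background": "Mixed",
    "Irish": "European", "British": "British", "White": "European",
    "NS-British": "European",
    "Chinese": "East-Asian",
}


def broad_ancestry(code):
    """Return the broad group for a field 21000 code, or None if unmapped."""
    label = ETHNIC_LABELS.get(code)   # None for codes not listed above
    return BROAD_GROUPS.get(label)    # None for labels with no posted group


# Hypothetical toy input: one field 21000 code per participant.
codes = [1001, 1002, 1003, 3001, 5, 4001, 4002, 6]
groups = [broad_ancestry(c) for c in codes]

# Tally participants per broad group; unmapped codes are counted under None.
counts = Counter(groups)
for group, n in counts.items():
    print(group, n)
```

With this chaining, "Irish" (1002), "Any other white background" (1003), and the top-level "White" (1) all land in 'European', while 'British' stays separate. Anything tallied under `None` is a code the posted mappings do not cover and would need an explicit decision.
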
Y-Isaac commented 3 months ago

@omerwe Thannnnnnnnnnnnnnks!

Y-Isaac commented 3 months ago

@omerwe sorry to reopen it, what is "NS-British"?

omerwe commented 3 months ago

Hmm I don't remember honestly, looks like I didn't use it eventually (this was ~5 years ago...)

Y-Isaac commented 3 months ago

I understand, thank you very much!