ceumicrodata / economic-diplomacy

Create bias corrected KLD (p values) #24

Closed korenmiklos closed 3 years ago

korenmiklos commented 4 years ago

I did a bit of math with KLD and the multinomial. A better measure of distance between {p} (the base distribution) and {x} (the actual distribution) could be the log-likelihood function. It is similar to KLD, but has several correction terms:

$$ \ln L = \ln n! - \sum_k \ln x_k! + \sum_k x_k \ln p_k $$

@zaveczgergo please compute logL divided by n in the simulated data and plot it against log n to see if there is a size bias in this measure.
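A minimal sketch of that check, assuming Python with only the standard library (the thread itself does not say which tool ran the simulation). The probabilities in `p`, the sample sizes, and the function names are made up for illustration. It prints logL/n next to the plain KLD between the sample shares x_k/n and p_k, so the two can be compared across log n. It assumes every p_k is positive.

```python
import math
import random
from collections import Counter


def multinomial_loglik(x, p):
    """ln L = ln n! - sum_k ln x_k! + sum_k x_k ln p_k, for counts x and shares p."""
    n = sum(x)
    loglik = math.lgamma(n + 1)  # ln n!
    for x_k, p_k in zip(x, p):
        loglik -= math.lgamma(x_k + 1)  # ln x_k!
        if x_k > 0:
            loglik += x_k * math.log(p_k)
    return loglik


def kld(x, p):
    """KLD between sample shares x_k / n and base shares p_k (empty cells add 0)."""
    n = sum(x)
    return sum(
        (x_k / n) * math.log((x_k / n) / p_k)
        for x_k, p_k in zip(x, p)
        if x_k > 0
    )


def simulate_counts(n, p, rng):
    """Draw n balls into len(p) bins with probabilities p; return counts per bin."""
    counts = Counter(rng.choices(range(len(p)), weights=p, k=n))
    return [counts.get(k, 0) for k in range(len(p))]


if __name__ == "__main__":
    rng = random.Random(0)
    p = [0.5, 0.3, 0.15, 0.05]  # illustrative base distribution
    for n in (10, 100, 1000, 10000):
        x = simulate_counts(n, p, rng)
        print(n, math.log(n), multinomial_loglik(x, p) / n, kld(x, p))
```

Plotting the third column against the second would show whether logL/n drifts with sample size.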

korenmiklos commented 4 years ago

Actually, let me give it a try.

zaveczgergo commented 4 years ago

Is there a guide somewhere to interpret markdown formulas? I have not found one so far, so I am a bit unsure about the formula in the comment. I am also unsure about the notation: do I understand correctly that p is share and x is sample_share in this case? I guess n is still the number of balls.

zaveczgergo commented 4 years ago

Oh, ok, thanks, then I will have a look at it, to better understand the formula.

korenmiklos commented 4 years ago

@zaveczgergo Please save the data for this simulation in a .csv file.

partner_country,year,shipments1,shipments2,...,shipments97
RU,2017,250,120,...,1

Shipments should be integers. For each (d,o,t), first round up shipments, then sum across origin countries (in this order):

generate shipments = ceil(trade_volume / shipment_size)
collapse (sum) shipments, by(partner_country year product)
reshape wide shipments, i(partner_country year) j(product)

I can estimate the Dirichlet-multinomial from this.
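To see why the order matters (round up first, then sum), here is a small Python sketch of the same three steps on made-up rows. The origin codes, the numbers, and the `wide_shipments` helper are hypothetical; only the column names come from the Stata commands above.

```python
import math
from collections import defaultdict

# Hypothetical rows:
# (partner_country, origin, year, product, trade_volume, shipment_size)
rows = [
    ("RU", "HU", 2017, 1, 40.0, 100.0),
    ("RU", "DE", 2017, 1, 40.0, 100.0),
    ("RU", "HU", 2017, 2, 250.0, 100.0),
]


def wide_shipments(rows):
    """Round up per row, sum across origins, then reshape wide by product."""
    totals = defaultdict(int)
    for partner, _origin, year, product, volume, size in rows:
        # Step 1: integer shipments per (destination, origin, year, product)
        # Step 2: sum across origin countries
        totals[(partner, year, product)] += math.ceil(volume / size)
    wide = defaultdict(dict)
    for (partner, year, product), shipments in totals.items():
        # Step 3: one row per (partner_country, year), one column per product
        wide[(partner, year)][f"shipments{product}"] = shipments
    return dict(wide)


print(wide_shipments(rows))
# {('RU', 2017): {'shipments1': 2, 'shipments2': 3}}
```

For product 1, rounding each origin first gives 1 + 1 = 2 shipments, whereas summing the volumes first would give ceil(80 / 100) = 1.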