cid-harvard / py-ecomplexity

Python package to compute economic complexity and associated variables
MIT License

PCI normalization using wrong mean and standard deviation. #25

Closed dauletgali closed 1 year ago

dauletgali commented 1 year ago

Good day, and thank you very much for this project.

I ran into a problem: the mean and std of the PCI are not 0 and 1. Looking at the source code, I found that the function ecomplexity (file ecomplexity.py, line 209) does the following:

# Normalize variables as per STATA package
cdata.pci_t = (cdata.pci_t - cdata.eci_t.mean()) / cdata.eci_t.std()
cdata.cog_t = cdata.cog_t / cdata.eci_t.std()
cdata.eci_t = (cdata.eci_t - cdata.eci_t.mean()) / cdata.eci_t.std()

The PCI is normalized using the ECI mean and std. Is this a mistake, or is it done on purpose? For now my PCI ranges over [-10, 6], which seems very unusual. Thanks!

dauletgali commented 1 year ago

I think it should be

cdata.pci_t = (cdata.pci_t - cdata.pci_t.mean()) / cdata.pci_t.std()
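
The difference between the two normalizations can be checked on synthetic data. The following is only a minimal sketch using the standard library: the helper `zscore` and the random `eci_raw` / `pci_raw` values are hypothetical stand-ins, not part of py-ecomplexity, and the sample standard deviation is used on the assumption that it matches the default of pandas' `.std()`.

```python
import random
import statistics


def zscore(values, ref=None):
    """Standardize `values` with the mean and sample std of `ref`.

    If `ref` is omitted, `values` is standardized by its own moments.
    """
    ref = values if ref is None else ref
    mu = statistics.mean(ref)
    sigma = statistics.stdev(ref)  # sample std (ddof=1)
    return [(v - mu) / sigma for v in values]


random.seed(0)
# Hypothetical raw scores on deliberately different scales (illustration only)
eci_raw = [random.gauss(0.0, 1.0) for _ in range(200)]
pci_raw = [random.gauss(0.5, 3.0) for _ in range(500)]

pci_by_eci = zscore(pci_raw, ref=eci_raw)  # PCI scaled by ECI moments, as in the quoted code
pci_by_own = zscore(pci_raw)               # PCI scaled by its own moments, as proposed

print("by ECI moments:", statistics.mean(pci_by_eci), statistics.stdev(pci_by_eci))
print("by own moments:", statistics.mean(pci_by_own), statistics.stdev(pci_by_own))
```

Only the second variant is guaranteed to give mean 0 and std 1 for the PCI. The first keeps the PCI on the ECI's scale, so its spread depends on how the two raw distributions compare.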

SultanOrazbayev commented 1 year ago

See this issue in the Stata version: https://github.com/cid-harvard/ecomplexity/issues/3 (a related issue was closed earlier here: https://github.com/cid-harvard/py-ecomplexity/issues/19).

dauletgali commented 1 year ago

@SultanOrazbayev Thank you for the reply. Rakhmet (thank you)!