CIRAIG / OpenIO-Canada

Module to create symmetric Environmentally Extended Input-Output tables for Canada.

How can I call and use this function? Please help. Thanks #22

Closed rehananis closed 1 year ago

rehananis commented 1 year ago

import pandas as pd


def treatment_import_data(original_file_path):
    """Function used to treat the merchandise imports trade database file. The file is way too big to be
    provided to users through GitHub, so we treat the data to only keep what is relevant."""

    # load database
    merchandise_database = pd.read_csv(original_file_path)

    # drop useless columns
    merchandise_database = merchandise_database.drop(['YearMonth/AnnéeMois', 'Province', 'State/État',
                                                      'Quantity/Quantité', 'Unit of Measure/Unité de Mesure'],
                                                     axis=1)

    # drop international imports coming from Canada
    merchandise_database = merchandise_database[merchandise_database['Country/Pays'] != 'CA']

    # also drop nan countries for obvious reasons
    merchandise_database = merchandise_database.dropna(subset=['Country/Pays'])

    # set the index as country/code multi-index
    merchandise_database = merchandise_database.set_index(['Country/Pays', 'HS6'])

    # regroup data from several months into a single yearly data
    merchandise_database = merchandise_database.groupby(merchandise_database.index).sum()

    # multi-index is cleaner
    merchandise_database.index = pd.MultiIndex.from_tuples(merchandise_database.index)

    return merchandise_database

MaximeAgez commented 1 year ago

You don't have to use this function. I used it to process the imports data from the merchandise trade database and produce the files provided in the Imports_data folder. If you still want to rerun it yourself to extract other data, you can copy and paste the function and run it in your Jupyter notebook or Python kernel. You will then have to provide the path to the complete merchandise trade database file.
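
For anyone who wants to try this, here is a minimal sketch of what copying the function and calling it could look like. The function body is the one quoted above. Everything else is an assumption made for illustration: the sample rows, the file name sample_imports.csv, and the 'Value/Valeur' column are hypothetical stand-ins for the real merchandise trade database file, which you would download yourself and pass in by path.

```python
import os
import tempfile

import pandas as pd


def treatment_import_data(original_file_path):
    """Same steps as the function quoted in this issue."""
    merchandise_database = pd.read_csv(original_file_path)
    merchandise_database = merchandise_database.drop(
        ['YearMonth/AnnéeMois', 'Province', 'State/État',
         'Quantity/Quantité', 'Unit of Measure/Unité de Mesure'], axis=1)
    merchandise_database = merchandise_database[merchandise_database['Country/Pays'] != 'CA']
    merchandise_database = merchandise_database.dropna(subset=['Country/Pays'])
    merchandise_database = merchandise_database.set_index(['Country/Pays', 'HS6'])
    merchandise_database = merchandise_database.groupby(merchandise_database.index).sum()
    merchandise_database.index = pd.MultiIndex.from_tuples(merchandise_database.index)
    return merchandise_database


# Hypothetical sample data standing in for the full database file.
# The 'Value/Valeur' column name and all values are assumptions.
sample_csv = """\
YearMonth/AnnéeMois,HS6,Country/Pays,Province,State/État,Value/Valeur,Quantity/Quantité,Unit of Measure/Unité de Mesure
201701,10110,US,QC,NY,100,5,NMB
201702,10110,US,ON,MI,200,7,NMB
201701,20220,CN,BC,,50,1,KGM
201701,20220,CA,QC,,999,1,KGM
201701,30330,,QC,,10,1,KGM
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    # In real use, replace sample_path with the path to the downloaded file.
    sample_path = os.path.join(tmp_dir, "sample_imports.csv")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write(sample_csv)
    yearly_imports = treatment_import_data(sample_path)

# One row per (country, HS6 code), with monthly values summed.
print(yearly_imports)
```

In this sketch the two US rows are summed into a single yearly row, while the row with country code CA and the row with no country are dropped, which matches the steps described in the function's comments.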

rehananis commented 1 year ago

Thanks