alexanderquispe / BiLab_Summer_Python

Python course for Business Lab-PUCP
MIT License

Group 4 ass 3 2024 abigail_delacruz #146

Closed abigail-delacruz closed 8 months ago

abigail-delacruz commented 8 months ago

import os, pickle, pandas, numpy, urllib.request, pyreadstat

1. Imported three SPSS data files (REC0111.sav, RE223132.sav, RE516171.sav) with specific variable and value labels from a given path.
2. Selected specific columns for each dataset and created new dataframes (rec1_1, rec2_1, rec3_1) with updated variable and value labels.
3. Generated a new column named year for rec1_1, updated variable labels, and added a new key-value pair to the var_labels dictionary.
4. Merged the rec1_1, rec2_1, and rec3_1 dataframes on CASEID, creating a new object named endes_2019.
5. Unified new_var_labels into one object named var_labels and new_value_labels into another object named value_labels, then used them to update attributes in endes_2019.
6. Calculated statistics (min, max, sd, n_obs, n_missing) for specified columns and created a summary dataframe sorted by the number of missing rows.
7. Created a new dataframe (mean_key_vars) with mean values for specific variables grouped by year and department in endes_2019.
8. Reshaped mean_key_vars from wide to long format, resulting in a new dataframe named reshape_mean_key_vars.
9. Replicated tasks 7 and 8 in a single line of code.
10. Merged reshape_mean_key_vars with endes_2019, creating the final dataframe named final_result.
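For reviewers, the merge, summary, group-mean, and reshape steps described above can be sketched roughly as follows. This is a minimal illustration on toy data, not the submitted code: the columns `dept`, `children`, and `age_first_birth`, the inner and left join types, and the keys used in the final merge are all assumptions, since the description does not name the ENDES variables or the merge options. The SPSS import with pyreadstat is left out so the sketch runs without the data files.

```python
import pandas as pd

# Toy stand-ins for rec1_1, rec2_1, rec3_1 (column names are hypothetical).
rec1_1 = pd.DataFrame({"CASEID": ["a", "b", "c", "d"], "dept": [1, 1, 2, 2]})
rec1_1["year"] = 2019  # new column named year
rec2_1 = pd.DataFrame({"CASEID": ["a", "b", "c", "d"],
                       "children": [2.0, None, 1.0, 3.0]})
rec3_1 = pd.DataFrame({"CASEID": ["a", "b", "c", "d"],
                       "age_first_birth": [20.0, 22.0, None, 18.0]})

# Merge the three dataframes on CASEID (inner join assumed).
endes_2019 = (rec1_1.merge(rec2_1, on="CASEID", how="inner")
                    .merge(rec3_1, on="CASEID", how="inner"))

key_vars = ["children", "age_first_birth"]  # hypothetical key variables

# Summary statistics per column, sorted by number of missing rows.
# Note: pandas .std() is the sample standard deviation (ddof=1).
summary = pd.DataFrame({
    "min": endes_2019[key_vars].min(),
    "max": endes_2019[key_vars].max(),
    "sd": endes_2019[key_vars].std(),
    "n_obs": endes_2019[key_vars].count(),
    "n_missing": endes_2019[key_vars].isna().sum(),
}).sort_values("n_missing")

# Mean of the key variables by year and department (wide format).
mean_key_vars = endes_2019.groupby(["year", "dept"], as_index=False)[key_vars].mean()

# Wide to long: one row per (year, dept, variable).
reshape_mean_key_vars = mean_key_vars.melt(
    id_vars=["year", "dept"], value_vars=key_vars,
    var_name="variable", value_name="mean_value")

# The same two steps chained into a single expression.
one_line = endes_2019.groupby(["year", "dept"], as_index=False)[key_vars].mean().melt(
    id_vars=["year", "dept"], var_name="variable", value_name="mean_value")

# Attach the long-format means back to the microdata (keys and left join assumed).
final_result = endes_2019.merge(reshape_mean_key_vars, on=["year", "dept"], how="left")
```

Because the long table holds one row per variable for each year and department, the final merge repeats every respondent once per key variable; that is worth keeping in mind when counting rows in `final_result`.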