dsgelab / finregistry-data

FinRegistry data preprocessing scripts

thl_social_assistance #22

Closed by demmlerj 1 year ago

demmlerj commented 1 year ago

The data is in csv2 format (semicolon-separated fields with a decimal comma, as read by R's read.csv2). Each file needs an English name and a processing date stamp.

The quick fix and its R code are in the findata folder inside the respective processed_data folder.

# Author: J. Demmler
# Date: 09/05/2023
# 
# Description:
# manual changes before sending data to THL/Findata
# edits to be implemented in Python pipeline
#
# Problems:
# files were in csv2 format
# file names need changing
# append processing date to file name

data <- read.csv2("3214_FinRegistry_puolisontoitu_MattssonHannele07122020.csv.finreg_IDsp")
write.csv(data, "findata/social_assistance_spouse_provision_2021-06-08.csv", row.names=FALSE)

data <- read.csv2("3214_FinRegistry_toitu_MattssonHannele07122020.csv.finreg_IDsp")
write.csv(data, "findata/social_assistance_provision_2021-06-08.csv", row.names=FALSE)
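
Since the header notes that these edits are to be implemented in the Python pipeline, here is a minimal sketch of what a pandas equivalent of the R quick fix might look like. The function name `convert_csv2` and its parameters are hypothetical, not part of the existing pipeline. It assumes the input follows R's read.csv2 defaults (`;` separator, `,` decimal mark). Quoting and missing-value representation may differ from R's write.csv, which quotes strings and writes `NA`.

```python
from pathlib import Path

import pandas as pd


def convert_csv2(src: Path, dest_dir: Path, english_name: str, date_stamp: str) -> Path:
    """Re-save a csv2-style file as a standard CSV named <english_name>_<date_stamp>.csv.

    Hypothetical helper for illustration; not part of the existing pipeline.
    """
    # R's read.csv2 defaults: ';' as field separator and ',' as decimal mark.
    df = pd.read_csv(src, sep=";", decimal=",")
    dest = dest_dir / f"{english_name}_{date_stamp}.csv"
    # index=False mirrors row.names=FALSE in the R quick fix.
    df.to_csv(dest, index=False)
    return dest
```

With this sketch, the second file in the R code would be handled by something like `convert_csv2(Path("3214_FinRegistry_toitu_MattssonHannele07122020.csv.finreg_IDsp"), Path("findata"), "social_assistance_provision", "2021-06-08")`.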

demmlerj commented 1 year ago

Please also add .feather files! (Still to do in the quick fix.)

from pathlib import Path

import pandas as pd

# Folder holding the renamed CSV files
data_dir = Path("/data/processed_data/thl_social_assistance/findata")

# Write a .feather copy next to each CSV (DataFrame.to_feather requires pyarrow)
for stem in (
    "social_assistance_spouse_provision_2021-06-08",
    "social_assistance_provision_2021-06-08",
):
    df = pd.read_csv(data_dir / f"{stem}.csv")
    df.to_feather(data_dir / f"{stem}.feather")
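
Because the .feather files were still to do at the time of the quick fix, a small check for CSVs that lack a .feather counterpart may be handy. This is only a sketch using the standard library. The function name `csvs_missing_feather` is hypothetical, and it assumes each .feather file shares its CSV's base name, as in the snippet above.

```python
from pathlib import Path


def csvs_missing_feather(folder: Path) -> list[str]:
    """Return names of .csv files in `folder` with no same-named .feather file.

    Hypothetical helper for illustration; not part of the existing pipeline.
    """
    return sorted(
        csv_path.name
        for csv_path in folder.glob("*.csv")
        if not csv_path.with_suffix(".feather").exists()
    )
```

Calling `csvs_missing_feather(Path("/data/processed_data/thl_social_assistance/findata"))` would then return an empty list once both .feather files have been written.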