XiaoyueYXY / Climate-Compounds


Use correct data sets 2006-2022 #8

Closed valeriehase closed 6 months ago

valeriehase commented 6 months ago

We have two data sets we need to combine: 2_data_preperation_data_old.csv & 1_read_newdata_data_new.csv

old data set (2006-2021, N = 188,291)

  1. the GEC data set:

    • originally 1998-2018, reduced to 2006-2018 for this analysis, since 2006 is the first year for which we could resample all outlets via Factiva
    • N = 153,203
    • search terms: "global warming", "climate change", "greenhouse effect"
  2. the Flottes/Festschrift data set:

    • includes new files resampled for 2019-2021
    • N = 35,088 (ONLY new files 2019-2021)
    • search terms: "global warming", "climate change", "greenhouse effect"

--> both are in "2_data_preperation_data_old.csv", which is read in step 2 (_2_datapreperation.ipynb)

new data set (2006-2022, N = 36,648)

--> in "1_read_newdata_data_new.csv" which is created in step 1 (_1_readnewdata.ipynb)
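
For illustration, combining the two files could look roughly like the sketch below. It uses only Python's standard `csv` module and tiny in-memory stand-ins instead of the real files. The column names (`id`, `year`, `text`), the helper names `read_rows` and `combine`, and the added `source` column are assumptions for this example, not the actual schema or notebook code. Possible overlap between the old and new data is not handled here.

```python
import csv
import io


def read_rows(handle, source):
    """Read CSV rows as dicts and tag each row with the data set it came from."""
    return [{**row, "source": source} for row in csv.DictReader(handle)]


def combine(old_rows, new_rows):
    """Stack both data sets; duplicates across the two are NOT removed."""
    return old_rows + new_rows


# In-memory stand-ins for the two CSV files (hypothetical columns and values).
old_csv = io.StringIO("id,year,text\n1,2006,a\n2,2021,b\n")
new_csv = io.StringIO("id,year,text\n3,2006,c\n4,2022,d\n")

combined = combine(read_rows(old_csv, "old"), read_rows(new_csv, "new"))
years = [int(row["year"]) for row in combined]

# The two parts of the old data set should add up to the reported total.
expected_old_total = 153_203 + 35_088
assert expected_old_total == 188_291

# The combined data should cover exactly the 2006-2022 range.
assert min(years) >= 2006 and max(years) <= 2022
```

With the real files, the `io.StringIO` objects would be replaced by `open("2_data_preperation_data_old.csv", newline="", encoding="utf-8")` and the equivalent call for `1_read_newdata_data_new.csv` (the encoding is an assumption), and the row counts could be compared against the N values listed above.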

valeriehase commented 6 months ago

I replaced the "old" data file in Teams (which also included data from 1996, but only texts in which "neutral" terms occurred at least twice) with the correct one.

valeriehase commented 6 months ago

Agreed and done!