softwaresaved / international-survey

International collaboration on RSE survey. Contact: @SimonHettrick
https://www.software.ac.uk/blog/2018-03-12-what-do-we-know-about-rses-results-our-international-surveys
BSD 3-Clause "New" or "Revised" License

Date of Participation breaks in NB 1 #236

Closed StephanJanosch closed 5 years ago

StephanJanosch commented 5 years ago

The Date of Participation data used in

https://github.com/softwaresaved/international-survey/blob/master/analysis/2018/1.%20Overview%20and%20sampling.ipynb

seems to be missing from public_merged.csv. The cell that reads the 'startdate. Date started' column fails with:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/local/homebrew/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2601             try:
-> 2602                 return self._engine.get_loc(key)
   2603             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'startdate. Date started'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-14-9dc95822121d> in <module>
      7         return x
      8 
----> 9 df['Date'] = df['startdate. Date started'].apply(lambda x: convert_time(x))
     10 df_submission_per_country = df[['Country', 'Date']]#.dropna()
     11 total_per_country = df_submission_per_country.groupby(['Country'])['Date'].value_counts().to_frame()

/local/homebrew/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2915             if self.columns.nlevels > 1:
   2916                 return self._getitem_multilevel(key)
-> 2917             indexer = self.columns.get_loc(key)
   2918             if is_integer(indexer):
   2919                 indexer = [indexer]

/local/homebrew/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2602                 return self._engine.get_loc(key)
   2603             except KeyError:
-> 2604                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2605         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2606         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'startdate. Date started'

StephanJanosch commented 5 years ago

If we marked this data as private for Germany, feel free to include it in the public data.

Oliph commented 5 years ago

Fixed in commit fcd5bb0edb90e5545a854f011d54c4c6dd2a0590.
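
For anyone hitting the same KeyError with a public data file, here is a minimal sketch of a guard that could run before the failing cell. It is not taken from the fix commit, whose contents are not shown in this thread. `missing_columns`, `REQUIRED`, and the sample header are hypothetical names introduced for illustration; only the column names 'Country' and 'startdate. Date started' come from the traceback above.

```python
def missing_columns(available, required):
    """Return the required column names that are absent from `available`.

    `available` can be any iterable of column names, for example a
    pandas `df.columns` index or the header row of a CSV file.
    """
    available = set(available)
    return [name for name in required if name not in available]


# Columns used by the failing cell in the traceback.
REQUIRED = ["Country", "startdate. Date started"]

# Hypothetical header of a public file where the date column was withheld.
public_header = ["Country", "Some other column"]

missing = missing_columns(public_header, REQUIRED)
if missing:
    print(f"Skipping date analysis; columns not in public data: {missing}")
else:
    print("All required columns present.")
```

In the notebook this could be called as `missing_columns(df.columns, REQUIRED)` before the `df['Date'] = ...` assignment, so that a withheld column produces a readable message instead of a nested KeyError.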