Changes for faster runtime for get_cases - 55% speedup

To speed up the process of merging the raw JSON data with the gsheets data source, I increased the use of pandas for operations that can be vectorized.

The following changes were made to speed up get_cases:

1. extract_dsph_gsheet_data now returns a pandas DataFrame directly.
2. Standardized the targets input, which is now defined in constants.py. This means we can add new gsheets columns by editing only GSHEET_TARGET_COLUMNS.
3. supplement_data now performs the gsheets data replacements in a vectorized way with the loc method. Intersections in case_id are also now detected with sets instead of lists (more efficient).
4. get_cases now supplements the data before applying the aliasing. This makes sense because the supplemented data can then still be aliased.
5. All the elements in NONE_ALIAS will now be converted to numpy.nan instead of just a "none" string. This allows us to utilize the NaN methods in pandas.
6. Tests were updated to reflect item 1.
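
For reviewers, the supplement-then-alias order and the loc-based replacement described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `supplement` and `none_to_nan`, the column names, the sample rows, and the contents of `NONE_ALIAS` and `TARGET_COLUMNS` are all hypothetical stand-ins, and it assumes case_id is unique in both frames.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins; the real values live in the repository.
NONE_ALIAS = ["none", "None", ""]
TARGET_COLUMNS = ["status", "residence"]


def supplement(raw: pd.DataFrame, gsheet: pd.DataFrame) -> pd.DataFrame:
    """Overwrite target columns in raw with gsheet values for shared case_ids."""
    out = raw.set_index("case_id")
    sheet = gsheet.set_index("case_id")
    # Set intersection instead of scanning one list for every item of another.
    shared = sorted(set(out.index) & set(sheet.index))
    # One vectorized assignment instead of a Python loop over rows.
    out.loc[shared, TARGET_COLUMNS] = sheet.loc[shared, TARGET_COLUMNS]
    return out.reset_index()


def none_to_nan(df: pd.DataFrame) -> pd.DataFrame:
    """Turn every NONE_ALIAS entry into NaN so isna/fillna/dropna apply."""
    return df.mask(df.isin(NONE_ALIAS), np.nan)


# Made-up sample rows, only to exercise the two helpers.
raw = pd.DataFrame({
    "case_id": ["PH1", "PH2", "PH3"],
    "status": ["none", "Admitted", "Died"],
    "residence": ["Manila", "none", "Cebu"],
})
gsheet = pd.DataFrame({
    "case_id": ["PH2", "PH4"],
    "status": ["Recovered", "Admitted"],
    "residence": ["Quezon City", "Davao"],
})

merged = supplement(raw, gsheet)   # supplement first ...
cleaned = none_to_nan(merged)      # ... then normalize the "none" aliases
```

Only PH2 appears in both frames, so only that row is overwritten; PH4 is ignored because it has no counterpart in the raw data.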

Note: one test related to phcovid_network.py fails. I am still working out how this happened and hope to work with @andrewnyu to resolve it.

enzoampil / phcovid