mini-kep / datamap

Missing values datamap

simplify dataframe access - use api/frame endpoint #11

Open epogrebnyak opened 7 years ago

epogrebnyak commented 7 years ago

Use

https://github.com/mini-kep/db/blob/4258cf2b69e77bc68d1ff009976ea1aadde467ca/integration/access2.py#L97-L100

or

https://github.com/mini-kep/db/blob/4258cf2b69e77bc68d1ff009976ea1aadde467ca/integration/access.py#L77-L79

to simplify data access.

Todo:

  1. write small access code using the functions above and get rid of the dependencies on viz.py and query_all.py
  2. use access code in https://github.com/mini-kep/datamap/blob/master/minikep_missing_values.ipynb
  3. update graphs

epogrebnyak commented 7 years ago

On a recent commit, the call return pd.read_csv(url, converters={0: pd.to_datetime}, index_col=0, squeeze=True) is used for getting the dataframe. squeeze does not seem appropriate there, as we are reading a DataFrame, not a Series.
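A minimal sketch of reading the frame without squeeze, so the result stays a DataFrame even when there is a single data column. The function name read_frame, the column header "date" and the sample values are placeholders for illustration, not the project's actual code or data. Note that squeeze was deprecated in pandas 1.4 and removed in pandas 2.0, so dropping it also keeps the call working on newer versions.

```python
import io

import pandas as pd

# Placeholder CSV standing in for what the endpoint returns (values are dummies).
SAMPLE_CSV = """date,BRENT,USDRUR_CB
2017-01-09,1.0,2.0
2017-01-10,3.0,4.0
"""


def read_frame(source):
    """Read a CSV with dates in the first column into a DataFrame."""
    # No squeeze argument: a DataFrame is returned regardless of column count.
    return pd.read_csv(source, index_col=0, parse_dates=True)


df = read_frame(io.StringIO(SAMPLE_CSV))
```

Here source can be a URL, a path or a file-like object, since pd.read_csv accepts all three.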

epogrebnyak commented 7 years ago

by @zarak: The missingno library expects the time indices to be in ascending order, otherwise it doesn't display the index. See the line here:

ts_array = pd.date_range(df.index.date[0], df.index.date[-1],
                         freq=freq).values
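A small sketch of why the order matters, assuming the line above is what builds the tick positions: when the index is descending, the start date is later than the end date and pd.date_range returns an empty range. Sorting the index first avoids that. The frame below is made up for illustration.

```python
import pandas as pd

# Made-up frame whose index is in descending order.
index = pd.to_datetime(["2017-03-01", "2017-02-01", "2017-01-01"])
df = pd.DataFrame({"x": [3.0, None, 1.0]}, index=index)

# Descending index: start > end, so the generated range is empty.
unsorted_ticks = pd.date_range(df.index.date[0], df.index.date[-1], freq="MS")

# Sorting the index in ascending order restores a non-empty range.
df_sorted = df.sort_index()
sorted_ticks = pd.date_range(df_sorted.index.date[0], df_sorted.index.date[-1], freq="MS")
```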

Also, I wasn't able to set ticks for the daily frequency, as it was raising this exception.

except KeyError:
    raise KeyError("Could not divide time index into desired frequency.")

zarak commented 7 years ago

Here is the full traceback for the KeyError:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2441             try:
-> 2442                 return self._engine.get_loc(key)
   2443             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 548726400000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/datetimes.py in get_loc(self, key, method, tolerance)
   1426         try:
-> 1427             return Index.get_loc(self, key, method, tolerance)
   1428         except (KeyError, ValueError, TypeError):

~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2443             except KeyError:
-> 2444                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2445 

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 548726400000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 548726400000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2441             try:
-> 2442                 return self._engine.get_loc(key)
   2443             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('1987-05-23 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 548726400000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/datetimes.py in get_loc(self, key, method, tolerance)
   1435                 stamp = Timestamp(key, tz=self.tz)
-> 1436                 return Index.get_loc(self, stamp, method, tolerance)
   1437             except (KeyError, ValueError):

~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2443             except KeyError:
-> 2444                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2445 

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('1987-05-23 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/missingno/missingno.py in matrix(df, filter, n, p, sort, figsize, width_ratios, color, fontsize, labels, sparkline, inline, freq)
    224             for value in ts_array:
--> 225                 ts_list.append(df.index.get_loc(value))
    226         except KeyError:

~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/datetimes.py in get_loc(self, key, method, tolerance)
   1437             except (KeyError, ValueError):
-> 1438                 raise KeyError(key)
   1439 

KeyError: numpy.datetime64('1987-05-23T00:00:00.000000000')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-29-00d4247cbbd7> in <module>()
----> 1 msno.matrix(daily.drop(['USDRUR_CB', 'BRENT', 'UST_30YEAR', 'UST_20YEAR', 'UST_1MONTH'], axis=1), freq='D')

~/anaconda3/lib/python3.6/site-packages/missingno/missingno.py in matrix(df, filter, n, p, sort, figsize, width_ratios, color, fontsize, labels, sparkline, inline, freq)
    225                 ts_list.append(df.index.get_loc(value))
    226         except KeyError:
--> 227             raise KeyError("Could not divide time index into desired frequency.")
    228 
    229         ax0.set_yticks(ts_list)

KeyError: 'Could not divide time index into desired frequency.'
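One possible workaround, sketched under an assumption: the failing key, 1987-05-23, was a Saturday, which suggests the daily data has no weekend rows, so df.index.get_loc fails for calendar days generated by pd.date_range with freq='D'. Reindexing to a full calendar before plotting makes every generated date present in the index. The helper name reindex_to_calendar and the sample frame are hypothetical.

```python
import pandas as pd


def reindex_to_calendar(df, freq="D"):
    """Sort the index and add a row for every date at the given frequency."""
    out = df.sort_index()
    calendar = pd.date_range(out.index[0], out.index[-1], freq=freq)
    # Dates absent from the original index become all-NaN rows.
    return out.reindex(calendar)


# Made-up data: a Friday and the following Monday, no weekend rows.
daily = pd.DataFrame(
    {"value": [1.0, 2.0]},
    index=pd.to_datetime(["2017-01-06", "2017-01-09"]),
)

filled = reindex_to_calendar(daily)

# Every calendar day can now be located in the index.
ticks = pd.date_range(filled.index[0], filled.index[-1], freq="D")
positions = [filled.index.get_loc(ts) for ts in ticks]
```

The trade-off is that the added weekend rows show up as missing values in the chart. Whether that is acceptable depends on what the plot is meant to show.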