epogrebnyak / cbr-db

Russian banking sector statistics tools (ETL)

Move to sqlite + pandas #62

Open epogrebnyak opened 8 years ago

epogrebnyak commented 8 years ago

We need to discuss the following plan:

  1. Store all form 101 data for selected dates in a sqlite database, using the same CLI commands as before:
    • download
    • unpack
    • make csv
    • import csv
  2. Retrieve a pandas dataframe with contos (columns) by dates (index):
    • for a selected bank (regn): regn_101 = get_101_dataframe(regn), where regn is an integer
    • for a list of regns: regn_101 = get_101_dataframe(regns), where regns is a list of integers
  3. Make a balance dataframe for a selected bank or a group of banks (regn):
  4. Add tests
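
A minimal sketch of what step 2 could look like, to make the discussion concrete. Only the name get_101_dataframe and its regn / regns argument come from the plan above. The table name f101, the column names dt, regn, conto and value, the extra con argument, and all sample rows are assumptions made up for illustration, not the actual cbr-db schema or real data.

```python
import sqlite3

import pandas as pd


def get_101_dataframe(regn, con, table="f101"):
    """Return form 101 values with dates as index and contos as columns.

    `regn` is a single integer or a list of integers.
    The table and column names are hypothetical, not the real cbr-db schema.
    """
    regns = [regn] if isinstance(regn, int) else list(regn)
    placeholders = ",".join("?" for _ in regns)
    query = (
        f"SELECT dt, regn, conto, value FROM {table} "
        f"WHERE regn IN ({placeholders})"
    )
    df = pd.read_sql_query(query, con, params=regns)
    if len(regns) == 1:
        # One bank: index is the date only.
        return df.set_index(["dt", "conto"])["value"].unstack("conto")
    # Several banks: index is (date, regn).
    return df.set_index(["dt", "regn", "conto"])["value"].unstack("conto")


# In-memory database with made-up rows, standing in for the imported csv files.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE f101 (dt TEXT, regn INTEGER, conto TEXT, value INTEGER)"
)
con.executemany(
    "INSERT INTO f101 VALUES (?, ?, ?, ?)",
    [
        ("2015-01-01", 1481, "20202", 100),
        ("2015-01-01", 1481, "30102", 250),
        ("2015-02-01", 1481, "20202", 110),
        ("2015-02-01", 1481, "30102", 240),
        ("2015-01-01", 1000, "20202", 70),
    ],
)
con.commit()

regn_101 = get_101_dataframe(1481, con)
regns_101 = get_101_dataframe([1481, 1000], con)
```

With a list of regns the sketch returns a (date, regn) index, which is only one possible design; returning a dict of per-bank dataframes would also fit the plan as written.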

epogrebnyak commented 8 years ago

top-30 group composition:

top_30_regns = [2590, 1326, 436, 2748, 2289, 30, 2562, 1439, 1460, 1000,
1623, 354, 323, 912, 1978, 3016, 3251, 3292, 2272, 3349, 328, 1481, 1470,
2557, 963, 429, 2209, 1971, 316, 1]

(may be incomplete due to bank mergers and acquisitions)
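
One possible way to build a group balance from this list is to sum values over the member banks, sketched below. The function name group_balance, the long-format columns dt, regn, conto and value, regn 9999 and all sample numbers are assumptions for illustration only. Whether a plain sum is the right aggregation for a group is also still open.

```python
import pandas as pd

top_30_regns = [2590, 1326, 436, 2748, 2289, 30, 2562, 1439, 1460, 1000,
                1623, 354, 323, 912, 1978, 3016, 3251, 3292, 2272, 3349,
                328, 1481, 1470, 2557, 963, 429, 2209, 1971, 316, 1]


def group_balance(df, regns):
    """Sum `value` over the banks in `regns`; result is dates x contos.

    `df` is a long-format frame with hypothetical columns
    dt, regn, conto, value.
    """
    members = df[df["regn"].isin(regns)]
    return members.groupby(["dt", "conto"])["value"].sum().unstack("conto")


# Made-up sample rows; regn 9999 is a placeholder for a bank outside the group.
sample = pd.DataFrame(
    [
        ("2015-01-01", 1481, "20202", 100),
        ("2015-01-01", 1000, "20202", 70),
        ("2015-01-01", 9999, "20202", 5),
        ("2015-01-01", 1481, "30102", 250),
    ],
    columns=["dt", "regn", "conto", "value"],
)

top_30_balance = group_balance(sample, top_30_regns)
```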

alexanderlukanin13 commented 8 years ago

@epogrebnyak

  1. OK
  2. Please provide an example of the output dataframe (which columns will it contain?)
  3. Please provide an example of the output dataframe (which columns will it contain?)
  4. Do you want live tests for this (connecting to real sites, getting real data)?