f4bD3v / humanitas

A price prediction toolset for developing countries
BSD 3-Clause "New" or "Revised" License

Situation of retail daily dataset #23

Closed by halccw 10 years ago

halccw commented 10 years ago

This is a fairly consistent dataset.

  1. No product has subproducts
  2. The distribution of NaN values is very similar across products
  3. For every product, about 60% of the data is valid
  4. The following 9 products have "fairly good" data for all 15 regions: Atta(Wheat), Gram Dal, Onion, Rice, Salt Pack, Sugar, Tea Loose, Tur, Vanaspati

https://github.com/fabbrix/humanitas/blob/master/analysis/statistics/india-daily-retail/num_cities_0.4.csv
https://github.com/fabbrix/humanitas/blob/master/analysis/statistics/india-daily-retail/best_non_na_0.4.csv
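For anyone who wants to reproduce this kind of check, here is a minimal pandas sketch of how the valid-data fraction per product and region could be computed. It is not the script used to produce the CSV files above. The long-format layout, the column names (`product`, `region`, `date`, `price`), and the helper names are hypothetical, and the 0.4 threshold is only a guess based on the file names.

```python
import numpy as np
import pandas as pd


def valid_fraction(df):
    """Fraction of non-NaN prices per product (rows) and region (columns).

    Assumes a long-format frame with hypothetical columns
    'product', 'region' and 'price'.
    """
    return (
        df.assign(valid=df["price"].notna())
        .groupby(["product", "region"])["valid"]
        .mean()
        .unstack("region")
    )


def well_covered_products(frac, min_valid=0.4):
    """Products whose valid fraction reaches min_valid in every region.

    min_valid=0.4 is an assumed threshold, not a documented one.
    """
    ok = (frac >= min_valid).all(axis=1)
    return sorted(frac.index[ok])


# Toy data for illustration only; these are not values from the real dataset.
dates = pd.date_range("2009-01-01", periods=5, freq="D")
prices = {
    ("Rice", "Delhi"): [20.0, np.nan, 21.0, 21.5, 22.0],
    ("Rice", "Mumbai"): [np.nan, 23.0, np.nan, 24.0, 24.5],
    ("Onion", "Delhi"): [10.0, 10.5, 11.0, 11.5, 12.0],
    ("Onion", "Mumbai"): [np.nan, np.nan, np.nan, np.nan, 13.0],
}
toy = pd.DataFrame(
    [
        {"product": p, "region": r, "date": d, "price": v}
        for (p, r), values in prices.items()
        for d, v in zip(dates, values)
    ]
)

frac = valid_fraction(toy)
print(frac)
print(well_covered_products(frac))
```

In the toy data, Onion has only one valid price out of five in Mumbai, so it falls below the assumed threshold and only Rice is reported as well covered.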

halccw commented 10 years ago

Note that this dataset only contains data from 2009 to 2013.