Anteemony/sktime-storesales-forecasting
More Detailed EDA #2
Open, opened by Anteemony 4 months ago
Anteemony commented 4 months ago
more detailed EDA
[ ] #5
[ ] #6
[ ] #7
[ ] #8
- action: formulate some basic questions and answer them
  - write down the questions at the start: what are you investigating?
    - e.g., do promotions impact sales?
    - how well can sales be forecast?
    - the choice is up to you, but make sure the questions are stated at the start
  - think about ways to address these
- make sure to include the "basic" EDA elements
  - report size of data, type of data
  - report summary statistics
    - e.g., how many shops are there
    - how do shops distribute by sales
  - univariate plots of summaries by shop
  - interesting summaries: average sale, type of good
  - correlation analyses and plots for things like
    - shops vs sales
    - good type vs sales
    - sales per time unit by shop or good type
    - promotions
- put the above in markdown cells (observations) and python cells
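
As a rough starting point for the "basic" EDA elements listed above, here is a minimal pandas sketch. It runs on a small synthetic table, not the real data; the column names (`date`, `store`, `product_group`, `on_promotion`, `sales`) are assumptions made for illustration and would need to be mapped to whatever the notebook actually loads.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real sales table.
# All column names here are assumptions, not the notebook's schema.
rng = np.random.default_rng(0)
dates = pd.date_range("2017-01-01", periods=30, freq="D")
rows = [
    {"date": d, "store": s, "product_group": g}
    for d in dates
    for s in (1, 2, 3)
    for g in ("GROCERY", "DAIRY")
]
df = pd.DataFrame(rows)
df["on_promotion"] = rng.integers(0, 5, size=len(df))
df["sales"] = (
    10.0 * df["store"] + 2.0 * df["on_promotion"] + rng.normal(0, 1, len(df))
)

# Size of data and type of data
n_rows, n_cols = df.shape
dtypes = df.dtypes

# Summary statistics: overall, number of shops, number of product groups
summary = df["sales"].describe()
n_stores = df["store"].nunique()
n_groups = df["product_group"].nunique()

# How shops distribute by sales: one total per store, largest first
sales_by_store = df.groupby("store")["sales"].sum().sort_values(ascending=False)

# Interesting summary: average sale per type of good
mean_sales_by_group = df.groupby("product_group")["sales"].mean()

# Sales per time unit (weekly) by store, one column per store
weekly_sales_by_store = (
    df.groupby(["store", pd.Grouper(key="date", freq="W")])["sales"]
    .sum()
    .unstack("store")
)

# A first look at promotions: linear correlation with sales
promo_corr = df["on_promotion"].corr(df["sales"])
```

Each of these results could go in its own python cell, followed by a markdown cell stating the observation. If matplotlib is available, something like `sales_by_store.plot(kind="bar")` would give the univariate plot by shop.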
fkiraly commented 4 months ago
Adding my thoughts here, based on the current notebook:
- basic parameters of the data set should be summarized at the start
- basic summaries should also be at the start:
  - how many stores? how many product groups?
  - coverage in terms of start/end of record per store
  - summary of sales by store, then statistical summaries of those summaries, e.g., average sales in a year, variation, etc.
  - there are marked "jumps" in the mean aggregate over stores. What explains these? They could be an artefact of stores starting or ending their records at different times.
- similar analysis for product groups
- interaction analysis of product groups and stores. Which stores sell which products? How does the product composition distribute?
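
The checks suggested above could be sketched roughly as follows. This is an illustration on made-up records, with assumed column names (`date`, `store`, `product_group`, `sales`); it is not taken from the notebook. The toy data is built so that one store starts recording later, which reproduces the kind of "jump" in the mean over stores that may be a start/end artefact.

```python
import pandas as pd

# Hypothetical toy records; store 2 only starts recording in 2017.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2016-01-01", "2016-06-01", "2017-01-01", "2017-06-01",
        "2017-01-01", "2017-06-01",
    ]),
    "store": [1, 1, 1, 1, 2, 2],
    "product_group": [
        "GROCERY", "DAIRY", "GROCERY", "DAIRY", "GROCERY", "GROCERY",
    ],
    "sales": [10.0, 20.0, 30.0, 40.0, 100.0, 300.0],
})

# Coverage: first and last record per store
coverage = df.groupby("store")["date"].agg(start="min", end="max")

# Summary of sales by store (average per year), then summaries of the summaries
year = df["date"].dt.year.rename("year")
yearly_mean = df.groupby(["store", year])["sales"].mean().unstack("year")
store_summary = pd.DataFrame({
    "mean": yearly_mean.mean(axis=1),  # average of the yearly averages
    "std": yearly_mean.std(axis=1),    # variation between years
})

# Jumps in the mean over stores: compare against how many stores report per date
stores_reporting = df.groupby("date")["store"].nunique()
mean_over_stores = (
    df.groupby(["date", "store"])["sales"].sum().groupby("date").mean()
)

# Interaction: share of each product group in each store's total sales
totals = df.pivot_table(
    index="store", columns="product_group", values="sales",
    aggfunc="sum", fill_value=0,
)
composition = totals.div(totals.sum(axis=1), axis=0)
```

Putting `stores_reporting` next to `mean_over_stores` is one simple way to check whether a jump lines up with stores entering or leaving the record. The same groupby pattern, keyed on `product_group` instead of `store`, would cover the analogous analysis for product groups.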