cxcxxin / urap_tech_sp16

New mission on bundle_shares comparison #39

Open ZhaotangLuo opened 8 years ago

ZhaotangLuo commented 8 years ago

Define rel_diff = abs_diff / sales_share, where sales_share is the bundle-sales share in either the transaction-record dataset or the review dataset. Use the total sales of each item to weight the relative bundle-sales difference, i.e., replace abs_diff with rel_diff in the following formula. [image]

Also do this for the 700D.
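
A minimal sketch of the definition above, for illustration only. The formula itself was posted as an image and is not reproduced here, so this assumes that abs_diff is the absolute gap between the transaction-based and review-based bundle shares and that the weighting is a plain sales-weighted average. The function names, the `base` parameter, and the sample numbers are all hypothetical.

```python
def rel_diff(share_txn, share_rev, base="transaction"):
    """Relative bundle-share difference for one item.

    Assumption: abs_diff = |share_txn - share_rev|, and sales_share is the
    share from whichever dataset `base` names.
    """
    abs_diff = abs(share_txn - share_rev)
    sales_share = share_txn if base == "transaction" else share_rev
    if sales_share == 0:
        return None  # undefined when the denominator share is zero
    return abs_diff / sales_share


def weighted_rel_diff(items, base="transaction"):
    """Sales-weighted mean of rel_diff.

    `items` is an iterable of (total_sales, share_txn, share_rev) tuples.
    Items whose rel_diff is undefined are skipped.
    """
    num = 0.0
    den = 0.0
    for total_sales, share_txn, share_rev in items:
        d = rel_diff(share_txn, share_rev, base)
        if d is None:
            continue
        num += total_sales * d
        den += total_sales
    return num / den if den else None


# Made-up example data: (total_sales, share_txn, share_rev)
items = [
    (100, 0.50, 0.40),
    (300, 0.20, 0.25),
]
print(weighted_rel_diff(items))                 # transaction share as denominator
print(weighted_rel_diff(items, base="review"))  # review share as denominator
```

Because the choice of denominator dataset changes the result, it may be worth reporting both versions side by side.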

cxcxxin commented 8 years ago

@Quinn126 please work with @ZhaotangLuo to make sure that his code for this project is accurate and transparent, and make sure you understand the story @ZhaotangLuo is telling.

If there are multiple pieces of code, there should be a script that pipelines all of them. @Quinn126, please document here the code associated with the project, i.e., its location in Dropbox. The graph for each slide should exactly match the title in the code.

cxcxxin commented 8 years ago

@ZhaotangLuo compute the fraction of reviews without bundle numbers and report the results here. You should also carefully document the comparison results in slides for both the 700D and the 750D.

ZhaotangLuo commented 8 years ago

In the dataset reviews_750d_0205_withpage.txt (urap_programming\all_data\data_bazhuayu\accumulative_review\reviews_750d_0205_withpage.txt; I cleaned this original dataset by dropping duplicate rows and items that are NOT 750D),
542 out of 5868 (9.2%) reviews are without bundle indices.

All 542 reviews are from Taobao, of which 449 are automatically given reviews. The remaining 93 reviews are from a seller (id = 36435403857) with no bundle indices available on its website.
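
For transparency, the cleaning and counting steps described above could be sketched roughly as follows. This is not the actual project code: the column names `model` and `bundle_index`, the function name, and the sample rows are hypothetical, and the real file layout may differ.

```python
def missing_bundle_fraction(rows, model="750D"):
    """Drop exact duplicate rows, keep only `model`, and count reviews
    whose bundle index is empty or missing.

    `rows` is a list of dicts; the keys used here are assumed names.
    Returns (missing_count, total_count, fraction).
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(sorted(row.items()))  # identifies exact duplicates
        if key in seen:
            continue
        seen.add(key)
        if row["model"] != model:
            continue
        kept.append(row)
    missing = sum(1 for r in kept if not r["bundle_index"])
    fraction = missing / len(kept) if kept else None
    return missing, len(kept), fraction


# Made-up rows, only to show the shape of the input
rows = [
    {"review_id": "r1", "model": "750D", "bundle_index": "2"},
    {"review_id": "r1", "model": "750D", "bundle_index": "2"},  # duplicate
    {"review_id": "r2", "model": "750D", "bundle_index": ""},
    {"review_id": "r3", "model": "700D", "bundle_index": ""},   # not 750D
    {"review_id": "r4", "model": "750D", "bundle_index": None},
    {"review_id": "r5", "model": "750D", "bundle_index": "1"},
]
missing, total, fraction = missing_bundle_fraction(rows)
print(missing, total, fraction)

# Check of the figure reported above: 542 of 5868 reviews
print(f"{542 / 5868:.1%}")
```

The same function could be pointed at the 700D data by passing `model="700D"`.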

Quinn126 commented 8 years ago

@cxcxxin @suyanglu @ZhaotangLuo Where is the code that I'm supposed to document located?