CSVs - Githubissues

amritbhanu commented 6 years ago

@YiqiaoXu-Joe: Which preprocessing steps do you think work?

https://github.com/amritbhanu/EDM591_Hyperparameter/tree/master/csv

amritbhanu commented 6 years ago

Looks like min-max normalization is not a good choice. The other steps don't matter much; they all perform about the same.

YiqiaoXu-Joe commented 6 years ago

There is still a little difference, though. I just looked at dataset 1, and it seems mean normalization has a smaller MSE than min-max normalization.
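For reference, a minimal sketch of the two normalizations being compared. It assumes "mean normalization" means (x - mean) / (max - min); the thread does not say whether the repository uses that form or z-score standardization, so treat the second function as an assumption. The function names and the sample column are made up for illustration.

```python
from statistics import mean


def min_max_normalize(values):
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - lo) / span for x in values]


def mean_normalize(values):
    """Centre on the mean and divide by the range (assumed definition)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    mu = mean(values)
    return [(x - mu) / span for x in values]


feature = [2.0, 4.0, 6.0, 8.0, 10.0]  # made-up column, not from the CSVs
print(min_max_normalize(feature))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(mean_normalize(feature))     # [-0.5, -0.25, 0.0, 0.25, 0.5]
```

Both versions divide by the same range; the only difference in this sketch is that the second one centres the column on zero.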

YiqiaoXu-Joe commented 6 years ago

I checked datasets 2 and 3. Yep, they are almost the same.

amritbhanu commented 6 years ago

That's it: the steps that use min-max normalization perform badly. All the other steps are good.

amritbhanu commented 6 years ago

And it is only seen for the SVM learners.

YiqiaoXu-Joe commented 6 years ago

So is it safe to say we should use mean normalization instead of min-max on our datasets?

amritbhanu commented 6 years ago

Yes, I believe so, and it matters the most for SVM.

YiqiaoXu-Joe commented 6 years ago

I think it also matters for random forest.

amritbhanu commented 6 years ago

Do you think we should use stats to show why we chose that preprocessor?

Also, I see that precision, recall, and accuracy all show similar performance. Which one should we choose? Both precision and recall, I presume?
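As a reminder of what the three measures capture, here is a small sketch that computes them from true and predicted labels. The function name and the labels are invented for illustration and are not taken from the project's results; the point is only that precision and recall can diverge even when accuracy looks reasonable.

```python
def classification_scores(y_true, y_pred, positive=1):
    """Return (precision, recall, accuracy) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs)
    return precision, recall, accuracy


# Made-up labels: a cautious classifier that flags only one of three positives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]
precision, recall, accuracy = classification_scores(y_true, y_pred)
print(precision, recall, accuracy)
```

In this toy case precision is 1.0 and accuracy is 0.75 while recall is only one third, which is one reason to report precision and recall together.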

YiqiaoXu-Joe commented 6 years ago

It's clear from the performance which method is better, so I don't think stats are necessary. Yeah, I agree with using both precision and recall.

amritbhanu commented 6 years ago

Can you create charts for them? Maybe using Excel or something. I think 6 charts would be enough: one chart per dataset and measure, with all the preprocessing steps on the x-axis and the 3 learners as 3 legend entries. I don't know, maybe you can reduce it further.

Check figure 2 of https://arxiv.org/pdf/1703.00132.pdf. It is nicely presented. Can you look into that?

YiqiaoXu-Joe commented 6 years ago

I'll give it a try. I haven't made a chart as complex as Figure 2 before.

amritbhanu commented 6 years ago

In matplotlib, there is a function called subplots. You can plot 3 figures in the first row and 3 in the second. See if you can do it; otherwise, I will do it later tonight.
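A rough sketch of that 2 x 3 layout with `plt.subplots`, assuming one row per measure and one column per dataset. Every name and score here is a placeholder (the thread names only SVM and random forest, so the third learner and the step labels are hypothetical), and this is not the script that produced the project's figure.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

# Placeholder labels; swap in the real preprocessing steps and learners.
steps = ["min-max", "mean", "step C", "step D"]
learners = ["SVM", "random forest", "learner 3"]
measures = ["precision", "recall"]
datasets = ["dataset 1", "dataset 2", "dataset 3"]


def placeholder_score(row, col, k, j):
    """Deterministic dummy value in [0.5, 0.85] so there is something to plot."""
    return 0.5 + ((row * 7 + col * 5 + k * 3 + j * 2) % 8) / 20


# 2 rows (measures) x 3 columns (datasets) gives the 6 charts discussed above.
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(12, 6), sharey=True)

for row, measure in enumerate(measures):
    for col, dataset in enumerate(datasets):
        ax = axes[row, col]
        for k, learner in enumerate(learners):
            scores = [placeholder_score(row, col, k, j) for j in range(len(steps))]
            ax.plot(range(len(steps)), scores, marker="o", label=learner)
        ax.set_title(f"{dataset}: {measure}")
        ax.set_xticks(range(len(steps)))
        ax.set_xticklabels(steps, rotation=45, ha="right")

axes[0, 0].legend()  # one legend is enough since every panel shares the learners
fig.tight_layout()
fig.savefig("graph_sketch.png")
```

Sharing the y-axis keeps the six panels directly comparable; drop `sharey=True` if the measures end up on very different scales.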

YiqiaoXu-Joe commented 6 years ago

OK, I was just thinking of Excel.

YiqiaoXu-Joe commented 6 years ago

I made some draft figures for dataset 1. Obviously they need changes, but since we only have a handful of points to compare, do you think the Figure 2 style from the paper is a good fit for us?

amritbhanu commented 6 years ago

Made the figure - https://github.com/amritbhanu/EDM591_Hyperparameter/blob/master/results/new/graph.png

Add this. I will make a similar one for regression and we will add that too. Does anything in the figure look bad or need changes?

amritbhanu commented 6 years ago

Joe, where is the paper on ShareLaTeX? I don't think I have access to it.

YiqiaoXu-Joe commented 6 years ago

That's weird, I can see you have access. Here's the link: https://www.sharelatex.com/1741192253stjbfvnfgqst

amritbhanu commented 6 years ago

Got it. Did you add any figures to the document?

YiqiaoXu-Joe commented 6 years ago

Not yet, I'm writing the draft of my other paper. I'll try to do that tomorrow. Are you available sometime tomorrow?

amritbhanu commented 6 years ago

OK, I have some work tomorrow. Can we catch up on Sunday, if that works?

YiqiaoXu-Joe commented 6 years ago

No problem.

amritbhanu / EDM591_Hyperparameter

CSVs #16