EHWUSF / HS68_2018_Project_1


Information Retrieval Module #13

Closed haleyhowe closed 5 years ago

haleyhowe commented 6 years ago

I propose a program that gives "suggestions" about what type of data analysis the user could perform. I think this would be useful as general guidance to help someone get started on data analysis: an outline of the type of analysis to perform and the suggested steps. For example, it would prompt the user for the type of dependent variable they are trying to predict (e.g., qualitative -> classification, quantitative -> linear regression, no dependent variable (unsupervised learning) -> clustering methods, etc.). Once they determine the type of variable they want to predict, the program would list all of the types of analysis they could perform. We could make it a sort of library that produces output based on the conditions the user specifies. Again, this is more of an if/then algorithm that simply returns suggestions, each with some information about the suggestion. Let me know if this is too broad. We could get good information from "An Introduction to Statistical Learning", which we used in HS614. This would simply be a program that lists all of the "options" you have for machine learning/predictive analysis.
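To make the if/then idea concrete, here is a minimal sketch of how the lookup could work. Everything in it is an assumption for illustration: the `SUGGESTIONS` table, the `suggest` function, the category keys, and the example method lists are hypothetical and are not part of the existing package.

```python
# Hypothetical lookup table: keys and method lists are illustrative only.
SUGGESTIONS = {
    "qualitative": {
        "task": "classification",
        "methods": ["logistic regression", "linear discriminant analysis",
                    "k-nearest neighbors"],
    },
    "quantitative": {
        "task": "regression",
        "methods": ["linear regression", "ridge regression", "lasso"],
    },
    "none": {
        "task": "unsupervised learning",
        "methods": ["k-means clustering", "hierarchical clustering",
                    "principal component analysis"],
    },
}


def suggest(target_type):
    """Return the suggestion entry for a dependent-variable type."""
    key = target_type.strip().lower()
    if key not in SUGGESTIONS:
        raise ValueError(f"unknown target type: {target_type!r}")
    return SUGGESTIONS[key]


if __name__ == "__main__":
    entry = suggest("Qualitative")
    print("Suggested task:", entry["task"])
    for method in entry["methods"]:
        print(" -", method)
```

Keeping the rules in a dictionary rather than a chain of `if` statements would let contributors add new analysis types by editing data instead of code.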

haleyhowe commented 6 years ago

Also, maybe this could be open source, so that people could add information on any type of model or analysis?

hhan14 commented 6 years ago

I like the idea and think it can be useful to users regardless of their level of data analysis skill. For the implementation, you could group or categorize the "most applied" tools by type, purpose, and size of data analysis. But I think one of the challenging parts would be deciding how to narrow down those groups and how detailed the suggestions should be.

nitieaj commented 6 years ago

It could come in handy when decisions about initial model choices need to be made. One possible implementation: match the user's input strings in a conditional function.
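A rough sketch of what matching user input strings might look like, using only the standard library. The `CHOICES` list and the `match_choice` function are hypothetical names, and the 0.8 similarity cutoff is an arbitrary assumption that would need tuning.

```python
import difflib

# Hypothetical set of answers the prompt accepts.
CHOICES = ["qualitative", "quantitative", "unsupervised"]


def match_choice(user_input, choices=CHOICES):
    """Map free-typed input to one of the known choices, or None."""
    text = user_input.strip().lower()
    if not text:
        return None
    # Accept an unambiguous prefix such as "quali".
    prefixed = [c for c in choices if c.startswith(text)]
    if len(prefixed) == 1:
        return prefixed[0]
    # Otherwise fall back to fuzzy matching to tolerate small typos.
    close = difflib.get_close_matches(text, choices, n=1, cutoff=0.8)
    return close[0] if close else None


if __name__ == "__main__":
    for raw in ["Quali", "quantitive", "xyz"]:
        print(repr(raw), "->", match_choice(raw))
```

Returning `None` for unrecognized or ambiguous input lets the caller re-prompt instead of guessing.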

kamehta2 commented 6 years ago

This is a good idea, but the project is more about the linear regression model. I don't know whether we can add this kind of information model to our project.

NikitaThomas commented 5 years ago

Since this is a linear regression-focused package, I think we could still incorporate your idea. It could be very helpful to have prompts as the user goes through each module, explaining the cleaning process and feature selection. It could even give the basic stats of the original data set before cleaning, and give a prompt after cleaning that includes the number of NA values present. This idea could be implemented within each module to make the package more user-friendly.
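As a rough illustration of the before/after prompt, here is a sketch that counts rows and missing values around a cleaning step. It assumes the data is a list of dictionaries with `None` marking NA values. The function names (`summarize`, `drop_incomplete`, `report`) and the sample columns are hypothetical, and the real modules may store data differently.

```python
def summarize(rows, columns):
    """Return the row count and per-column missing-value counts."""
    missing = {
        col: sum(1 for row in rows if row.get(col) is None)
        for col in columns
    }
    return {"rows": len(rows), "missing": missing}


def drop_incomplete(rows, columns):
    """Stand-in cleaning step: keep only rows with no missing values."""
    return [
        row for row in rows
        if all(row.get(col) is not None for col in columns)
    ]


def report(label, summary):
    """Format a one-line prompt for the user."""
    total = sum(summary["missing"].values())
    return f"{label}: {summary['rows']} rows, {total} missing values"


if __name__ == "__main__":
    columns = ["age", "bmi"]  # made-up example columns
    data = [
        {"age": 30, "bmi": 22.5},
        {"age": None, "bmi": 27.1},
        {"age": 41, "bmi": None},
    ]
    print(report("Before cleaning", summarize(data, columns)))
    cleaned = drop_incomplete(data, columns)
    print(report("After cleaning", summarize(cleaned, columns)))
```

Each module could call something like `report` before and after its own step, so the user sees what changed.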