Hello guys.
I have a 12GB .tgz file that contains a number of .csv.gz files. I want to use this data to train a machine learning model that classifies users into categories. Before tackling the whole archive, I want to train on just one of the .csv files inside it (for learning purposes). That file is 108MB, and its data looks something like this:
The prediction should be a number that represents the user's category. Which ML algorithm would you suggest? I am not sure how I should proceed.
I have learned SVM, Naive Bayes, KNN, and Decision Tree before, but the datasets were easy, with only two possible outputs, such as cancer (1) or not cancer (0). How should I approach a dataset like this one?
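For the first step (looking at a single .csv.gz without unpacking the whole 12GB archive), here is a rough sketch of what I have in mind, using only the Python standard library. The function name `read_one_csv_gz` and the `max_rows` parameter are my own placeholders, and I am assuming the inner files are ordinary comma-separated text; this is not a tested pipeline for my real data.

```python
import csv
import gzip
import io
import itertools
import tarfile


def read_one_csv_gz(tgz_path, max_rows=5):
    """Return (member_name, rows) for the first .csv.gz member of a .tgz archive.

    Hypothetical helper: assumes the member is comma-separated text.
    """
    with tarfile.open(tgz_path, mode="r:gz") as archive:
        # Members are visited one at a time and we return at the first match,
        # so nothing is extracted to disk and the rest of the archive is skipped.
        for member in archive:
            if not (member.isfile() and member.name.endswith(".csv.gz")):
                continue
            raw = archive.extractfile(member)  # file-like object over the member's bytes
            # The member is itself gzip-compressed, so decompress it as text.
            with gzip.open(raw, mode="rt", newline="") as text:
                reader = csv.reader(text)
                rows = list(itertools.islice(reader, max_rows))
            return member.name, rows
    return None, []
```

Calling something like `read_one_csv_gz("data.tgz", max_rows=10)` (the file name is a placeholder) should give me the header and the first few rows, which is enough to check the columns before choosing a model.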
Thanks.