TrialsOnTrails / Group-8-Assignment


try random forest #8

Open TrialsOnTrails opened 3 months ago

MATTHEWHAHA commented 3 months ago

ok

TrialsOnTrails commented 3 months ago

Please write your code as functions and call them from the main function. I have already put it in the Rmd file.

MATTHEWHAHA commented 3 months ago

Hi, you are so "ging", you know a lot of models that it seems the prof hasn't taught 😂😂 I read his code...

MATTHEWHAHA commented 3 months ago

I am doing the random forest part, but my code has some errors and I need your help or guidance. May I know when you are free? Should we do one more model?

MATTHEWHAHA commented 3 months ago

I tried to run the following code in R:

    library(randomForest)
    library(readr)
    install.packages("rmarkdown")
    install.packages("quanteda")
    library(quanteda)
    install.packages("data.table")
    library(data.table)
    reviews <- fread("/Desktop/blogData_train.csv")
    Datatest=fread("/Desktop/blogData_test.csv")
    Datatrain=fread("~/Desktop/blogData_train duplicate removed.csv")
    # assume last column 'C281' is the target variable
    target_column='C281'
    Datatrain$target_column<- as.factor(Datatrain$target_column)
    model=randomForest(target_column,data=Datatrain)

Then I get this error:

Error in if (n == 0) stop("data (x) has 0 rows") : argument is of length zero
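A likely cause of this error, sketched under assumptions: `randomForest(target_column, data=Datatrain)` passes the string `'C281'` as the first argument, so the function treats that string as the data, finds no rows, and fails. Separately, `Datatrain$target_column` looks for a column literally named `target_column` instead of the column whose name is stored in that variable. The snippet below uses a small made-up data frame in place of the real CSV; the column names `C1`, `C2` and `C281` are illustrative only, and if the file has no header row, `fread` names the columns `V1`, `V2`, ... so it is worth checking `names(Datatrain)` first.

```r
library(randomForest)

# Synthetic stand-in for the training data. The real file is not shown in
# the thread, so these column names (including "C281") are assumptions.
set.seed(1)
train <- data.frame(C1 = rnorm(200), C2 = rnorm(200))
train$C281 <- 3 * train$C1 + rnorm(200)

target_column <- "C281"

# train$target_column would return NULL here, because `$` looks for a column
# literally called "target_column". Use [[ ]] to look up by the stored name.
y <- train[[target_column]]

# Build the formula "C281 ~ ." from the column name instead of passing the
# name itself as the first argument to randomForest().
rf_formula <- as.formula(paste(target_column, "~ ."))

# A numeric response gives a regression forest, which fits an MSE-scored task.
# Converting the target with as.factor() would switch to classification.
model <- randomForest(rf_formula, data = train, ntree = 100)
pred <- predict(model, newdata = train)
```

The formula interface is used here because it keeps the target name in one variable; passing `x` and `y` separately would also work.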

TrialsOnTrails commented 3 months ago
  1. Use the R Markdown file for coding. I saw you updated the HTML file, but that one is nothing but a presentation of the output.
  2. To use the R Markdown file, download the entire zip file to your local environment, then set your working directory accordingly, including changing the data-loading paths, etc.
  3. I have built most of the functions you might need in the R Markdown file. Read the documentation and put your code in the right place.
  4. Please contribute to the assignment intelligently. Use ChatGPT or Google to debug or learn by yourself first before asking me for help. Get your hands dirty. Read the online R documentation to learn how to use a function or how to write an argument in R.
  5. You should train a model that can run, output the predictions in the correct format, submit it at least once to Kaggle, then improve the model again. Finally, you should update everything in the R Markdown file, formatted and with easy-to-read documentation notes.
  6. All the models I tried were taught in the lectures or in previous courses. You should have learned them already if you did not skip 5 classes in a row. Well, right now you have one more chance to learn something.
  7. If you have not made a satisfactory contribution by the end of this Saturday, I will take over and work on this issue.
TrialsOnTrails commented 3 months ago

By "satisfactory", I mean (1) well-formatted code and easy-to-read documentation notes in the R Markdown file; (2) an MSE on Kaggle no worse than 589 (this number comes from the simplest decision tree, so a random forest should not be worse than that).
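Before submitting, the MSE can be checked locally on held-out rows. The helper below is a minimal sketch; the name `mse` and the toy numbers are made up for illustration and are not results from the assignment data.

```r
# Hypothetical helper: mean squared error between actual and predicted values.
mse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  mean((actual - predicted)^2)
}

# Toy check with made-up numbers: squared errors are 1, 0, 4, 0.
actual    <- c(0, 2, 5, 1)
predicted <- c(1, 2, 3, 1)
toy_mse <- mse(actual, predicted)  # (1 + 0 + 4 + 0) / 4 = 1.25
```

A local score from a held-out split will not match the Kaggle score exactly, but it gives a rough idea of whether a model beats the 589 baseline.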

TrialsOnTrails commented 3 months ago

I have reserved a place in the Rmd for you to put your code. See the contents at the beginning of the R Markdown file; you will find "3.3.8 Random Forest".

MATTHEWHAHA commented 3 months ago

hi, can u help me double check? thanks

TrialsOnTrails commented 3 months ago

I ran your code but it seems there are still some bugs. Please put some effort into debugging the code.

Please also try to improve the performance of the output

MATTHEWHAHA commented 3 months ago

Yup, are you free tonight to have a chat before the deadline? We just submit the R Markdown file...

MATTHEWHAHA commented 3 months ago

It feels strange, my model has only 115 comments. Then I ran the Gaussian Naive Bayes part, and it is the same..