ModelOriented / modelStudio

📍 Interactive Studio for Explanatory Model Analysis
https://doi.org/10.1007/s10618-023-00924-w
GNU General Public License v3.0

Error in R - Error in eval(predvars, data, env) : object 'Breach' not found #97

Closed ananya231284 closed 3 years ago

ananya231284 commented 3 years ago

I am trying to run a basic multiple linear regression. I have a dataset in which I am evaluating the number of data breach incidents across different states in the US. I am using R and calling the lm function after importing the dataset. My goal is to get the summary of my regression model.

My dependent variable is Breach and the others are independent variables. These are the few lines of code I am using:

dat <- read.csv("C:/Users/anany/Documents/dataset_breachcount.csv", header = TRUE, sep = ";", dec = ",")
dat = read.table("C:/Users/anany/Documents/dataset_breachcount.csv", header = TRUE, sep = ";", dec = ",")
model.breachcount = lm(Breach ~ State_Law + is_MED + is_EDU_GOV + is_BSOR + is_BSF + is_Mal + is_ACC + Population_Rank + Imp_StateLaw, data = dat)
summary(model.breachcount)

I am getting the error:

Error in eval(predvars, data, env) : object 'Breach' not found

I have used the same lm function in my other regression models and was successful. I am not able to identify what is causing the problem here. I am new to R and may be missing something.

Please find the dataset I am using. I have attached the file in .txt format because GitHub does not allow me to attach a .csv file, but I am importing the .csv file in my R program. Is there a way I can upload the .csv file? I am attaching a Google Drive link to the .csv file: https://drive.google.com/file/d/1ZzbWyQ7xvyeD5bgd3L1GOdsI0F9ULeT4/view?usp=sharing

This is the .txt file: datasetbreachcount.txt. Also, when I convert the .csv file into a .txt file, the formatting is not preserved correctly. As I have more than 1700 rows of data, it would be better if I could share the file in .csv format, as that is the original format.

hbaniecki commented 3 years ago

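I believe the data is not being imported properly. The .txt file is separated with \t rather than ;, so with sep = ";" the whole header line is read as a single column name and no column called Breach exists. After converting the .txt to .csv, this code worked for me: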

dat <- read.csv("datasetbreachcount.csv", header = TRUE, sep = "\t", dec = ",")
model.breachcount = lm(Breach ~ State_Law + is_MED + is_EDU_GOV +
                         is_BSOR + is_BSF + is_Mal + is_ACC + Population_Rank + Imp_StateLaw, data = dat)
summary(model.breachcount)
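If it is unclear which delimiter a file uses, one option is to look at the raw header line before choosing sep. The following is only a minimal sketch: the sample lines stand in for the real file, and the helper name count_delims is made up for illustration.

```r
# Hypothetical sample standing in for the first lines of the real file.
# For an actual file, something like readLines("datasetbreachcount.csv", n = 1)
# would give the header line instead.
raw_lines <- c("Breach\tState_Law\tis_MED",
               "3\t1\t0",
               "5\t0\t1")

# Hypothetical helper: count how often each candidate delimiter
# occurs in the header line.
count_delims <- function(header, candidates = c(";", ",", "\t")) {
  chars <- strsplit(header, "")[[1]]  # split the line into single characters
  vapply(candidates, function(d) sum(chars == d), integer(1))
}

delim_counts <- count_delims(raw_lines[1])

# The most frequent candidate is a reasonable first guess for `sep`.
best_sep <- names(delim_counts)[which.max(delim_counts)]
```

This is a heuristic only; a delimiter that also appears inside quoted fields could mislead it, so the result is worth confirming by eye.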

You can always use colnames(dat) to ensure that the Breach column is available.
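As a rough illustration of that check, the sketch below reads a small made-up tab-separated sample (not the real dataset; only three of the columns are included) once with the wrong separator and once with the right one, and stops early if Breach is missing.

```r
# Hypothetical tab-separated sample standing in for the real dataset.
txt <- "Breach\tState_Law\tis_MED\n3\t1\t0\n5\t0\t1\n4\t1\t1\n2\t0\t0"

# Wrong separator: no ";" is found, so each line becomes one field
# and the data frame ends up with a single column.
wrong <- read.csv(text = txt, header = TRUE, sep = ";")
colnames(wrong)

# Right separator: the columns are split as intended.
right <- read.csv(text = txt, header = TRUE, sep = "\t")
colnames(right)

# Guard before fitting: fail with a clear message if the column is missing.
if (!"Breach" %in% colnames(right)) {
  stop("Column 'Breach' not found; check the `sep` argument used for import.")
}

fit <- lm(Breach ~ State_Law + is_MED, data = right)
```

Calling lm on the data frame read with the wrong separator would fail with the same "object 'Breach' not found" error, since no column of that name exists there.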

Please raise issues here only when they concern the modelStudio package or other packages from the DrWhy.AI family. For general problems with R, I would suggest the R for Data Science book and Stack Overflow.