Ales-G closed 2 years ago
Yeah, this too looks similar to the other error. Anytime you see an error with obsID involved, it usually has to do with how the IDs are set up. It's on my list to have a data check function be called right before estimating the model to make sure all the inputs are correct. It's hard to know what's causing this without seeing the data.
Btw, you may also want to try using xlogit. It's a Python package with a very similar UI to logitr (and it's actually even faster!). It doesn't have WTP space models yet, though that's in the works. It requires the same data structure, so if you use it and don't get any errors, that suggests there's a bug in logitr. If you get errors, then it's probably an error in the data somewhere.
I think I found out what was wrong. It was my mistake.
I had a few data entry issues. In particular, I had some tasks where both profiles had choice==0. It is something silly. Maybe it would be worth adding a preliminary check to the function that raises an error and helps identify similar problems.
But thanks for your help!
Ah okay well glad we figured it out. Yes I have it on my todo list to add more checks to validate the data so that these sorts of errors can be caught more easily. The current error messages you end up getting due to a data error are not helpful for debugging.
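A check of the kind discussed here could be sketched in base R as follows. This is only an illustration, not logitr's actual validation code: the helper name find_bad_choice_sets and the toy column names are hypothetical. It flags every choice set whose 0/1 outcome column does not sum to exactly one, which would catch a task where both profiles have choice==0.

```r
# Hypothetical helper (not part of logitr): return the obsID values of
# choice sets that do not contain exactly one chosen alternative.
find_bad_choice_sets <- function(data, outcome, obsID) {
  # Sum the 0/1 outcome within each choice set
  chosen <- tapply(data[[outcome]], data[[obsID]], sum)
  # A valid choice set sums to 1; missing outcomes are flagged as well
  names(chosen)[is.na(chosen) | chosen != 1]
}

# Toy data: task 2 has no chosen alternative, task 3 has two
toy <- data.frame(
  obsID  = c(1, 1, 2, 2, 3, 3),
  choice = c(1, 0, 0, 0, 1, 1)
)
bad <- find_bad_choice_sets(toy, outcome = "choice", obsID = "obsID")
print(bad)
```

Running something like this on the full data before calling logitr() would point directly at the offending tasks instead of leaving them to surface as an indexing error during estimation.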
Dear Professor Helveston, I am also a big fan of your great logitr code - especially because of the WTP space estimation! Thanks a lot for your work!
I worked with the code back at the beginning of May and successfully ran it in preference and WTP space, with and without clustered standard errors. Today, I came back to my analyses and wanted to re-run the exact same code with the exact same dataset as I did at the beginning of May. I just updated the package to your new logitr 0.7.0 version. However, now I receive the same error Ales-G mentioned in this comment: Error in X_chosen[data$obsID, ] : subscript out of bounds
As I said, I haven't changed the dataset or the code since the beginning of May (except for changing the argument name price to scalePar and taking the modelSpace argument out). Even with the simplest model in preference space without clusters, I receive the error. I also checked my dataset again, and I always have four observations per obsID, as there should be.
Could it be that anything changed in the code that causes the error since the 0.7.0 update?
I would very much appreciate your help and any ideas! Thanks a lot in advance!
Hi @HeniCha, thanks for your message. Yes, there is a chance I introduced a new bug here with some of the changes I made in the latest version since May. I attempted to make the package more robust by running some checks on the obsID variable prior to estimating the model to avoid issues like this, but it seems it is still persisting. It is difficult to identify the source of the issue without the data to test against. I have not been able to replicate this issue using the data that comes with the package.
Could you post a portion of the data somewhere and some code here so we can have a reproducible example of the issue? You can keep it as simple as possible, just one or two attributes and only a sample of the data. Just enough so that you can get the error when you run it.
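One way to build such a sample without breaking up choice sets is to keep every row for a handful of respondents. The sketch below uses base R only; the helper name subset_respondents, the toy data, and the output file name are hypothetical, with column names chosen to mirror a respondent ID and a choice-set ID.

```r
# Hypothetical helper: keep all rows belonging to a random sample of
# respondents so that every choice set in the subset stays complete.
subset_respondents <- function(data, panelID, n, seed = 1) {
  set.seed(seed)
  ids <- unique(data[[panelID]])
  # sample.int() avoids the surprising behavior of sample() on length-1 input
  keep <- ids[sample.int(length(ids), min(n, length(ids)))]
  data[data[[panelID]] %in% keep, , drop = FALSE]
}

# Toy data: 5 respondents, 2 choice sets each, 2 alternatives per set
toy <- data.frame(
  respid   = rep(1:5, each = 4),
  gid      = rep(1:10, each = 2),
  response = rep(c(1, 0), times = 10)
)
small <- subset_respondents(toy, panelID = "respid", n = 2)

# The subset could then be shared, for example as a CSV file:
# write.csv(small, "example_subset.csv", row.names = FALSE)
```

Sampling whole respondents rather than individual rows matters here, because a subset with partial choice sets could trigger a different error from the one being reported.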
Thanks so much for your quick reply! Please find attached the code:
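library(logitr)

# Load the long-format choice data (one row per alternative)
sample <- read.csv("Merged_long_adj.csv")
attributes <- c("Dlabel2", "Dorigin", "asc")

# Fix the seed so the random draws are reproducible
set.seed(111)

# Mixed logit in preference space with clustered standard errors
pref_base <- logitr(
  data      = sample,
  outcome   = "response",
  obsID     = "gid",
  panelID   = "respid",
  clusterID = "respid",
  numDraws  = 1000,
  pars      = c("price", attributes),
  randPars  = c(Dlabel2 = "n", Dorigin = "n", asc = "n")
)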
The example uses clustered standard errors, but even if I take panelID and clusterID out, I receive the error.
Thanks so much!
Okay, I think I found the error in this. I had previously overwritten whatever was provided in the data for the obsID variable with a sequentially increasing series of numbers. So the user could provide really any identifier for the obsID (even characters) and it would get overwritten. Somehow, in adding a few tests for the obsID variable, I lost this line of code. I just added it back in this commit, which fixes this issue.
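To illustrate why the recoding matters: the error message (X_chosen[data$obsID, ]) suggests that the obsID values are used as row indices, so any ID larger than the number of rows would be out of bounds. The sketch below is not the code used inside logitr. The helper name recode_sequential and the toy values are hypothetical, and X_chosen here is just a small stand-in matrix.

```r
# Recode arbitrary identifiers as 1, 2, 3, ... in order of first appearance.
# This is a sketch of the idea, not the code used inside logitr.
recode_sequential <- function(ids) {
  match(ids, unique(ids))
}

raw_ids <- c("a7", "a7", "b2", "b2", "k9", "k9")
new_ids <- recode_sequential(raw_ids)
print(new_ids)

# Stand-in matrix with one row per choice set
X_chosen <- matrix(c(0.5, 1.2, 0.9), ncol = 1)

# Indexing by the recoded IDs always stays within bounds
expanded <- X_chosen[new_ids, , drop = FALSE]

# Indexing with raw numeric IDs larger than the number of rows fails
res <- try(X_chosen[c(101, 101, 102), , drop = FALSE], silent = TRUE)
print(inherits(res, "try-error"))
```

With the recoding in place, the identifiers supplied by the user only need to distinguish choice sets; their actual values no longer matter.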
However, in debugging this I realized that I have yet another bug when computing the standard errors with clustering. This one is a super small bug that I just fixed with this commit.
If you install from GitHub using remotes::install_github("jhelvy/logitr"), everything should hopefully run smoothly.
Thanks a lot! It now works perfectly and I can successfully run the code again.
Thanks so much for quickly fixing the errors and for your prompt reply!
That's great! @Ales-G any chance this also fixes the issue you were having?
I think this issue is now addressed with recent bug fixes in v0.7.2.
Hello, it is me again! Apologies for all these comments, but I really like your program and I am using it on a number of datasets, so I keep running into errors.
I am estimating a basic model without clustered standard errors, on a dataset of >3500 observations.
Again, I have looked at my tskID variable, and it looks correct. Any idea what may be causing this error? Thanks a lot for all of your help and support! You are really making a great contribution to me and to the community in general!