IBMPredictiveAnalytics / R_Essentials_Statistics

Download R Essentials required for SPSS Statistics
GNU General Public License v2.0

Zero inflated extension in SPSS 25 on mac not working #11

Open marlone-henderson opened 4 years ago

marlone-henderson commented 4 years ago

Hello. I have SPSS 25 on Mac and added R 3.3 extension, so that I could add the zero inflated extension. Both seem to be successfully added into my SPSS menu. However, when I run the zero inflated negative binomial analyses, nothing happens. There's a brief moment when I see "zero inflated" flashed at the bottom where "IBM Statistics Processor is ready" is displayed, but then no output is created.

Please advise.

JKPeck commented 4 years ago

Check to see that the R Essentials is working and that the R library used by this command is installed. Run this from a syntax window:

begin program r.
print(sessionInfo())
library(pscl)
end program.

marlone-henderson commented 4 years ago

Hello. Thank you for replying to my question.

I ran the syntax command you suggested and nothing happened.

-- You have responsibilities, in short, to use your talents for the benefit of the society which helped develop those talents.

~ John F. Kennedy

marlone-henderson commented 4 years ago

Hi. Actually, I think the issue may have to do with my configuration. I found these instructions, but I don't see this path on my computer. I don't see a Frameworks folder.

JKPeck commented 4 years ago

Your R Essentials appears not to be working. Try installing the Essentials after starting Statistics as Admin. If that doesn't work, make sure that you have installed fixpack 2 for V25.

marlone-henderson commented 4 years ago

Hi. Okay, I just installed the fixpack 2 for 25. What do you recommend that I do next?

JKPeck commented 4 years ago

Use Admin mode and try the Essentials install again and

marlone-henderson commented 4 years ago

I ran that syntax command you recommended after installing the fixpack 2 and received this message in the output:

JKPeck commented 4 years ago

What message?

marlone-henderson commented 4 years ago

Hi. This message appeared after I ran the syntax command:

Warning messages:
1: In doTryCatch(return(expr), name, parentenv, handler) :
  unable to load shared object '/Library/Frameworks/R.framework/Resources/modules//R_X11.so':
  dlopen(/Library/Frameworks/R.framework/Resources/modules//R_X11.so, 6): Library not loaded: /opt/X11/lib/libSM.6.dylib
  Referenced from: /Library/Frameworks/R.framework/Resources/modules//R_X11.so
  Reason: image not found
2: In redirection() : Unable to open connection to X11 display.

R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/en_US.ISO8859-1/en_US.UTF-8/en_US.ISO8859-1

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] spss250_25.0.0.0

Error in library(pscl) : there is no package called ‘pscl’

JKPeck commented 4 years ago

Ah. R requires an X11 package to display output, but some Mac installations do not include it :-(

See this link to install it. You might have to install the Essentials again afterwards. https://support.apple.com/en-us/HT201341
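
One way to confirm whether R can see X11, before and after installing it, is base R's `capabilities()`. This is only a sketch: the variable name `has_x11` is illustrative, and when run from SPSS the lines would go between `begin program r.` and `end program.` as in the earlier check.

```r
# capabilities("X11") reports whether this R session can use an X11 display.
# On a Mac without X11 installed it is expected to be FALSE.
has_x11 <- unname(capabilities("X11"))

if (isTRUE(has_x11)) {
  message("X11 is available to R.")
} else {
  message("X11 is not available; install X11 using the link above and retry.")
}
```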

marlone-henderson commented 4 years ago

Okay. Will this allow me to eventually configure R 3.3 in SPSS? I haven't been able to find the location of R 3.3 following the instructions I found online.

JKPeck commented 4 years ago

See this link for the Essentials. They have been reorganizing the websites, so this is a little hard to find at the moment. https://community.ibm.com/community/user/datascience/viewdocument/get-essentials-for-spss?CommunityKey=886b6874-0fb1-402c-8243-c70ef8179a99&tab=librarydocuments

marlone-henderson commented 4 years ago

Hi. Yes, these are the instructions I found. However, this is where I keep running into problems. According to these instructions, the R default directory for OSX is /Library/Frameworks/R.framework/Versions/3.3/Resources.

However, when I click on the R 3.3 configuration, browse, and try to follow the instructions, I don't see a "Frameworks" folder in the Library folder. I've scoured the internet trying to see if anyone has encountered a similar problem, but I haven't seen a solution (or at least a solution that I understand). I re-ran your command after installing that X11 package, and this is what I received:

R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/en_US.ISO8859-1/en_US.UTF-8/en_US.ISO8859-1

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] spss250_25.0.0.0

Error in library(pscl) : there is no package called ‘pscl’

I suspect this has to do with the fact that R 3.3 isn't configured. Any ideas about how I can successfully configure it?

JKPeck commented 4 years ago

I'm not a Mac person, but the X11 error is gone, which is progress, so try running this to install the missing pscl package:

begin program r.
install.packages("pscl")
end program.

marlone-henderson commented 4 years ago

Let me say, thank you for your assistance with all of this, especially on a Saturday morning.

I ran that command and then received this message:

Error: End of procedure

Installing package into ‘/Users/mdh2449/Library/Application Support/IBM/SPSS/Statistics/25/extensions’
(as ‘lib’ is unspecified)
--- Please select a CRAN mirror for use in this session ---
Error in contrib.url(repos, "source") :
  trying to use CRAN without setting a mirror

marlone-henderson commented 4 years ago

Oh wait. Let me try that again. I didn't realize I needed to select a specific CRAN mirror location.
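
For anyone who hits the same mirror prompt: passing `repos` to `install.packages()` avoids the interactive mirror selection. The sketch below assumes the `https://cloud.r-project.org` mirror (any CRAN mirror URL can be substituted), and `ensure_package` is a hypothetical helper name, not part of R or the Essentials. When run from SPSS, the lines would go between `begin program r.` and `end program.`.

```r
# Assumption: the cloud mirror below; replace with any CRAN mirror URL.
cran_mirror <- "https://cloud.r-project.org"

# Hypothetical helper: install a package only if it cannot already be loaded.
# Supplying repos explicitly skips the "select a CRAN mirror" prompt.
ensure_package <- function(pkg, repos = cran_mirror) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # TRUE if the package can be loaded after the attempt, FALSE otherwise
  requireNamespace(pkg, quietly = TRUE)
}

# Usage (needs network access the first time):
# ensure_package("pscl")
```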

marlone-henderson commented 4 years ago

Okay, I chose a USA (Texas) location and then got this message:

There is a binary version available but the source version is later:
     binary source needs_compilation
pscl  1.5.2  1.5.5              TRUE

Do you want to install from sources the package which needs compilation?
trying URL 'https://cran.revolutionanalytics.com/bin/macosx/mavericks/contrib/3.3/pscl_1.5.2.tgz'
Content type 'application/octet-stream' length 3340855 bytes (3.2 MB)

downloaded 3.2 MB

The downloaded binary packages are in /tmp/Rtmpk3yBbc/downloaded_packages

marlone-henderson commented 4 years ago

And I reran that syntax command you first sent me and got this:

R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/en_US.ISO8859-1/en_US.UTF-8/en_US.ISO8859-1

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] spss250_25.0.0.0

loaded via a namespace (and not attached):
[1] tools_3.3.3 tcltk_3.3.3

Classes and Methods for R developed in the
Political Science Computational Laboratory
Department of Political Science
Stanford University
Simon Jackman
hurdle and zeroinfl functions by Achim Zeileis

Does this mean it's installed?

JKPeck commented 4 years ago

Yes, probably. Try STATS ZEROINFL now.
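
To check this more directly than reading the load banner, something like the following could be run. It is a sketch only: `package_status` is a hypothetical helper name, and it uses just base R's `requireNamespace()` and `utils::packageVersion()`.

```r
# Hypothetical helper: report whether a package is installed and which version.
package_status <- function(pkg) {
  installed <- requireNamespace(pkg, quietly = TRUE)
  version <- if (installed) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
  list(package = pkg, installed = installed, version = version)
}

# Shows installed = TRUE and a version string if pscl is available.
str(package_status("pscl"))
```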

marlone-henderson commented 4 years ago

Hi. Yes, it appears to be running.

Thank you :)

JKPeck commented 4 years ago

Great. Now the hard part is up to you.

marlone-henderson commented 4 years ago

Hi. Sorry to ping you again about this. The zero-inflated negative binomial was running fine with two of my datasets, but then I tried it again on a much larger dataset and received this message:

system is computationally singular: reciprocal condition number = 3.8865e-23

Error: End of procedure.

Have you ever heard of this kind of error before?

marlone-henderson commented 4 years ago

I tried simplifying my analysis and rerunning the zero-inflated model, and I actually got a different error message:

Warning message: In sqrt(diag(object$vcov)) : NaNs produced

I found someone who encountered a similar problem here: https://stats.stackexchange.com/questions/209211/zero-inflated-poisson-regression-warning-message-in-sqrtdiagobjectvcov

But I don't see a solution.
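
For context, the "computationally singular" wording matches the error base R's `solve()` raises when a matrix is too ill-conditioned to invert, and `sqrt()` returns NaN (with the "NaNs produced" warning) for negative inputs. The toy sketch below uses made-up data, not the dataset from this thread, to show how redundant predictors drive the reciprocal condition number toward zero. Whether that is what happened here is an assumption.

```r
# Made-up design matrix: x3 is an exact linear combination of x1 and x2,
# so the columns are collinear and the cross-product matrix is singular.
set.seed(1)
x1 <- rnorm(50)
x2 <- rnorm(50)
x3 <- x1 + x2
X  <- cbind(intercept = 1, x1, x2, x3)

# Reciprocal condition number: values near 0 mean numerically singular.
rc <- rcond(crossprod(X))

# solve() rejects matrices whose rcond falls below its tol argument,
# which defaults to .Machine$double.eps.
below_tol <- rc < .Machine$double.eps

# Dropping the redundant column gives a well-conditioned matrix again.
rc_reduced <- rcond(crossprod(X[, c("intercept", "x1", "x2")]))

print(c(full = rc, reduced = rc_reduced))
```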

JKPeck commented 4 years ago

The post you found is on the right track. R packages tend to not be too good at explaining numerical problems that arise in the estimation process. You probably have complete separation in your dataset. That means some variable or variables perfectly or nearly perfectly explain the zero vs nonzero values as well as predicting the nonzeros.

Try running a simple logistic model with a dichotomized dependent variable (0/nonzero) (GENLIN or LOGISTIC). You might compare that with a simple Poisson regression.

You might also consider a negative binomial model such as GENLIN can produce, since that allows for a degree of zero inflation.

I would have to know more about your model and the data to offer additional advice.
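
A quick first look at separation can also be done in R by cross-tabulating each categorical predictor against the dichotomized outcome. This is a minimal sketch on made-up data; `amount`, `group`, and `zero_table` are placeholder names, and an empty cell is only a hint, not a proof, of separation.

```r
# Made-up data for illustration only.
amount <- c(0, 0, 0, 0, 2, 5, 1, 7)
group  <- c("a", "a", "a", "a", "b", "b", "b", "b")

# Hypothetical helper: cross-tabulate a predictor against zero vs nonzero.
zero_table <- function(y, x) {
  nonzero <- factor(y > 0, levels = c(FALSE, TRUE),
                    labels = c("zero", "nonzero"))
  table(predictor = x, outcome = nonzero)
}

tab <- zero_table(amount, group)
print(tab)

# An empty cell means some predictor level has only zeros or only nonzeros,
# the pattern associated with (quasi-)complete separation.
any_empty_cell <- any(tab == 0)
```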

marlone-henderson commented 4 years ago

The negative binomial in GENLIN does run. That was initially the analysis I was using, but then I received some feedback that a zero-inflated negative binomial might be more appropriate. Unfortunately, the person who suggested that is not available to assist me. Hence, I am trying to figure all of this out myself. Are you able to open attachments? If so, I'm happy to send my data as a CSV file. I'll include some notes about the columns.

JKPeck commented 4 years ago

Yes, I can take attachments. A sav file would be preferable to csv. What was the motivation for preferring a zero-inflated negative binomial model over a plain negative binomial model?

-- Jon K Peck jkpeck@gmail.com

marlone-henderson commented 4 years ago

Well, I was following up with the editor of a journal that I'm considering submitting the manuscript to. I had planned to use negative binomial regression but was unsure how to calculate confidence intervals and effect sizes from that analysis. She eventually checked with one of her editors who handles stats questions, and he suggested that negative binomial wasn't appropriate. But he didn't offer a better analysis and instead suggested that I hire a stats consultant. At the same time, I had emailed William Gardner (because he had published a paper on ways to handle non-normal count data). After a few emails he wrote, "If you have lots of zeros, a negative binomial might not be the right distribution. Check out zero-inflated negative binomial regression." Honestly, I was hoping that things didn't change depending on the analysis. It looks like that is the case for my Experiments 1 and 2, but when I got to my third experiment, I ran into the issue I emailed about.

I'm attaching the Sav file. Ideally, I was hoping to run a model in which I had IV1 (construal), IV2 (contribution), the interaction (CC), with history (centered) included as a covariate, predicting my DV (amount). I'm predicting a significant interaction effect, and would follow up the interaction with specific comparisons while controlling for the history. The regular negative binomial shows that, but I'm stuck when it comes to the zero inflated version.

P.S. I had to create the CC variable because the zero-inflated extension didn't have a way to automatically test the interaction between my IV1 and IV2.
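
For reference, the centering and the hand-built product term described above would look roughly like this in R. The column names follow my description; the values are made up, and the sketch assumes both IVs are numeric (for example, 0/1 coded).

```r
# Made-up values; only the column names come from the description above.
d <- data.frame(
  construal    = c(0, 0, 1, 1, 0, 1),
  contribution = c(0, 1, 0, 1, 1, 0),
  history      = c(2, 5, 3, 8, 1, 5)
)

# Center the covariate at its mean.
d$history_c <- d$history - mean(d$history)

# Product term for the interaction (assumes numeric, e.g. 0/1, coding).
d$CC <- d$construal * d$contribution

print(d)
```

In plain R model formulas, `construal * contribution` expands to both main effects plus their interaction, so the product column is only needed when the front end cannot build it.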

JKPeck commented 4 years ago

There was no attachment.

On Sat, Apr 18, 2020 at 2:43 PM marlone-henderson notifications@github.com wrote:

Well, I was following up with the editor of a journal that I'm considering submitting the manuscript to. I had planned to use negative binomial regression but was unsure how to calculate confidence intervals and effect sizes from that analyses. She eventually checked with one of her editors who handles stats questions, and he suggested that negative binomial wasn't appropriate. But he didn't offer a better analysis and instead suggested that I hire a stats consultant. At the same time, I had emailed William Gardner (because he had published a paper on ways to handle non-normal count data). But after a few emails he wrote "If you have lots of zeros, a negative binomial might not be the right distribution. Check out zero-inflated negative binomial regression." Honestly, I was hoping that things didn't change depending on the analysis. It looks like that is the case for my Experiment 1 and 2, but when I got to my third experiment, I ran into the issue I emailed about.

I'm attaching the Sav file. Ideally, I was hoping to run a model in which I had IV1 (construal), IV2 (contribution), the interaction (CC), with history (centered) included as a covariate, predicting my DV (amount). I'm predicting a significant interaction effect, and would follow up the interaction with specific comparisons while controlling for the history. The regular negative binomial shows that, but I'm stuck when it comes to the zero inflated version.

P.S. I had to create the CC variable because zero inflated didn't have a way to automatically test the interaction between my IV1 and IV2.

On Sat, Apr 18, 2020 at 4:12 PM Jon Peck notifications@github.com wrote:

Yes, I can take attachments. A sav file would be preferable to csv. What was the motivation for preferring a zero-inflated negative binomial model over a plain negative binomial model?

On Sat, Apr 18, 2020 at 1:56 PM marlone-henderson < notifications@github.com> wrote:

The negative binomial in GENLIN does run. That was initially the analysis I was using, but then I received some feedback that a zero inflated negative binomial might be more appropriate. Unfortunately the person who suggested that is not available to assist me. Hence, me trying to figure all of this out. Are you able to open attachments? If so, I'm happy to send my data as a CSV file. I'll include some notes about the columns.

On Sat, Apr 18, 2020 at 3:19 PM Jon Peck notifications@github.com wrote:

The post you found is on the right track. R packages tend to not be too good at explaining numerical problems that arise in the estimation process. You probably have complete separation in your dataset. That means some variable or variables perfectly or nearly perfectly explain the zero vs nonzero values as well as predicting the nonzeros.

Try running a simple logistic model with a dichotomized dependent variable (0/nonzero) (GENLIN or LOGISTIC). You might compare that with a simple Poisson regression.

You might also consider a negative binomial model such as GENLIN can produce, since that allows for a degree of zero inflation.

I would have to know more about your model and the data to offer additional advice.


-- Jon K Peck jkpeck@gmail.com

marlone-henderson commented 4 years ago

I tried dragging the file into the box but it says it does not support that file type. Let me try to put it in a zip file.

Experiment 3 stuck.sav.zip

marlone-henderson commented 4 years ago

Did that work?

JKPeck commented 4 years ago

So is this your syntax?

STATS ZEROINFL MODELSOURCE=ESTIMATE DEPENDENT=amount COUNTMODEL=CC construal contribution histC SAMEREGRESSORS=YES COUNTDIST=POISSON ZEROLINK=LOGIT /OPTIONS STARTVALUES=GENLIN OPTMETHOD=BFGS MAXITER=1000 TOL=0.0000000001 /SAVE WORKSPACEACTION=CLEAR.

I won't be able to look at this today, but my first question is whether you have the measurement levels set correctly. construal and contribution are set to nominal, so they will be treated as factors, and CC is set to scale. Since CC is the interaction of construal and contribution, that seems strange.

That could have a big effect on the results.
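A minimal sketch of how the measurement levels could be checked and made consistent from a syntax window. It assumes, without confirmation from the data file, that construal and contribution are coded -1/1 and that CC is their product; under that assumption, treating all three as scale is one way to keep CC consistent with its components. Whether scale or nominal is right depends on how the variables were actually coded.

```spss
* Sketch only. Assumes construal and contribution are coded -1/1 and CC is their product.
* Show the current measurement level of each predictor.
DISPLAY DICTIONARY /VARIABLES=construal contribution CC histC.

* Treat all three terms as scale so CC stays consistent with its components.
VARIABLE LEVEL construal contribution CC (SCALE).
```

After changing the levels, rerun the STATS ZEROINFL command and compare the coefficient table with the earlier run.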

marlone-henderson commented 4 years ago

wow! yes, that's my model and now it's working. Good catch :)

Glad I sent you the data file :)


marlone-henderson commented 4 years ago

Well, I spoke too soon. When I run that model, it runs fine. However, when I select contribution = -1 and just run the model with construal and history, it doesn't run. I receive this error message:

non-finite value supplied by optim

Error: End of procedure.

I was trying to follow up the significant interaction with specific comparisons.


marlone-henderson commented 4 years ago

Actually, no, that's not the model. Sorry, I just noticed the syntax differences. You have Poisson, but I'm trying to run zero inflated negative binomial. Here's my syntax.

STATS ZEROINFL MODELSOURCE=ESTIMATE DEPENDENT=amount COUNTMODEL=construal contribution CC histC SAMEREGRESSORS=YES COUNTDIST=NEGBIN ZEROLINK=LOGIT /OPTIONS STARTVALUES=GENLIN OPTMETHOD=BFGS MAXITER=1000 TOL=0.0000000001 /SAVE WORKSPACEACTION=CLEAR.

When I try that, it doesn't run.


JKPeck commented 4 years ago

97.7% of your amount values are 0. That makes me think that a zero-inflated model is not going to capture what is going on in these data. I tried dichotomizing the amount variable and running a regular logistic regression, treating the independent variables as factors, but nothing was significant.
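A rough sketch of that check in SPSS syntax, for anyone who wants to reproduce it. The variable name any_amount is hypothetical (it is not in the posted file), and the predictor list is taken from the STATS ZEROINFL syntax earlier in the thread; the exact specification that was run may have differed.

```spss
* Sketch only. any_amount is a hypothetical name, not a variable in the posted file.
* Flag nonzero amounts: 1 if amount > 0, 0 if amount = 0.
COMPUTE any_amount = (amount > 0).
EXECUTE.

* Share of zero vs. nonzero cases.
FREQUENCIES VARIABLES=any_amount.

* Logistic regression on the dichotomized outcome, with the two IVs as factors.
LOGISTIC REGRESSION VARIABLES any_amount
  /CATEGORICAL=construal contribution
  /METHOD=ENTER construal contribution histC.
```

The frequency table shows how few nonzero cases are available to estimate the count part of a zero-inflated model, which is the concern raised above.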

marlone-henderson commented 4 years ago

Gotcha. Okay, thank you.
