jhelvy / cbcTools

An R package with tools for designing choice based conjoint (cbc) survey experiments and conducting power analyses
https://jhelvy.github.io/cbcTools/

cbc_power - "NA" generated for 'estimated coefficient' and 'standard errors' in Output #6

Closed STEVEAP28 closed 1 year ago

STEVEAP28 commented 1 year ago

John --

When running cbc_power() with the example from https://rdrr.io/github/jhelvy/cbcTools/man/cbc_power.html, I get the following output, with NA for the estimated coefficients and standard errors:

# A simple conjoint experiment about apples

library(cbcTools)

# Generate all possible profiles
profiles <- cbc_profiles(
  price     = c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5),
  type      = c("Fuji", "Gala", "Honeycrisp"),
  freshness = c('Poor', 'Average', 'Excellent')
)

# Make a randomized survey design
design <- cbc_design(
  profiles = profiles,
  n_resp   = 300, # Number of respondents
  n_alts   = 3, # Number of alternatives per question
  n_q      = 6 # Number of questions per respondent
)

# Simulate random choices
data <- cbc_choices(
  design = design,
  obsID  = "obsID"
)

# Conduct a power analysis
power <- cbc_power(
  data    = data,
  pars    = c("price", "type", "freshness"),
  outcome = "choice",
  obsID   = "obsID",
  nbreaks = 10,
  n_q     = 6
)
#> Estimating models using 3 cores... done!

head(power)
#>   sampleSize               coef est se
#> 1         30              price  NA NA
#> 2         30           typeGala  NA NA
#> 3         30     typeHoneycrisp  NA NA
#> 4         30   freshnessAverage  NA NA
#> 5         30 freshnessExcellent  NA NA
#> 6         60              price  NA NA


I re-ran this several times -- but still get this "NA" Result. Am I missing something?

jhelvy commented 1 year ago

You sometimes can get NA values with small sample sizes. That's my first guess of what's happening since you used head() to only view the first few rows of the result.
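
One quick way to test the small-sample explanation is to count missing estimates at every sample size instead of relying on head(). The sketch below is only an illustration: it uses a made-up data frame with the same columns as the cbc_power() output shown in this thread (sampleSize, coef, est, se), and `count_na_by_size` is a hypothetical helper, not part of cbcTools.

```r
# Toy stand-in for a cbc_power() result (all values are made up)
power_toy <- data.frame(
  sampleSize = rep(c(30, 60, 90), each = 2),
  coef       = rep(c("price", "typeGala"), times = 3),
  est        = c(NA, NA, 0.02, 0.11, 0.04, 0.06),
  se         = c(NA, NA, 0.05, 0.16, 0.04, 0.13)
)

# Hypothetical helper: number of missing estimates at each sample size
count_na_by_size <- function(power) {
  tapply(is.na(power$est), power$sampleSize, sum)
}

na_counts <- count_na_by_size(power_toy)
print(na_counts)

# TRUE only when every estimate is missing; that pattern is hard to
# explain with small samples alone
all_missing <- all(is.na(power_toy$est))
print(all_missing)
```

If the NAs appear only in the smallest sample sizes, the small-sample explanation is plausible; if they fill every row, the cause is more likely to lie elsewhere.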

In the example that you referenced here, I cannot replicate the NA result. Here's what I get with that example:

set.seed(1234)
library(cbcTools)

# A simple conjoint experiment about apples

# Generate all possible profiles
profiles <- cbc_profiles(
  price     = c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5),
  type      = c("Fuji", "Gala", "Honeycrisp"),
  freshness = c('Poor', 'Average', 'Excellent')
)

# Make a randomized survey design
design <- cbc_design(
  profiles = profiles,
  n_resp   = 300, # Number of respondents
  n_alts   = 3, # Number of alternatives per question
  n_q      = 6 # Number of questions per respondent
)

# Simulate random choices
data <- cbc_choices(
  design = design,
  obsID  = "obsID"
)

# Conduct a power analysis
power <- cbc_power(
  data    = data,
  pars    = c("price", "type", "freshness"),
  outcome = "choice",
  obsID   = "obsID",
  nbreaks = 10,
  n_q     = 6
)

Now preview the results:

head(power)
#>   sampleSize               coef         est         se
#> 1         30              price -0.07194700 0.07609126
#> 2         30           typeGala  0.34172607 0.22791822
#> 3         30     typeHoneycrisp  0.50663546 0.22807907
#> 4         30   freshnessAverage  0.01794571 0.22327851
#> 5         30 freshnessExcellent  0.13248210 0.22751630
#> 6         60              price  0.02152378 0.05237815
tail(power)
#>    sampleSize               coef         est         se
#> 45        270 freshnessExcellent -0.17115823 0.07369265
#> 46        300              price  0.03658552 0.02240770
#> 47        300           typeGala  0.05830221 0.07098261
#> 48        300     typeHoneycrisp  0.04437270 0.07072744
#> 49        300   freshnessAverage -0.20545772 0.06966451
#> 50        300 freshnessExcellent -0.17473701 0.07007619
STEVEAP28 commented 1 year ago

John --

I copied and ran your code based on your initial reply. Unfortunately, I am still getting "NA" in the Estimates and Standard Errors (See below):

set.seed(1234)
library(cbcTools)
#> Version:  0.2.0
#> Author:   John Paul Helveston (George Washington University)
#>
#> Consider submitting praise at https://github.com/jhelvy/cbcTools/issues/3.
#>
#> Please cite the package in your publications, see: citation("cbcTools")

# A simple conjoint experiment about apples

# Generate all possible profiles
profiles <- cbc_profiles(
  price     = c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5),
  type      = c("Fuji", "Gala", "Honeycrisp"),
  freshness = c('Poor', 'Average', 'Excellent')
)

# Make a randomized survey design
design <- cbc_design(
  profiles = profiles,
  n_resp   = 300, # Number of respondents
  n_alts   = 3, # Number of alternatives per question
  n_q      = 6 # Number of questions per respondent
)

# Simulate random choices
data <- cbc_choices(
  design = design,
  obsID  = "obsID"
)

# Conduct a power analysis
power <- cbc_power(
  data    = data,
  pars    = c("price", "type", "freshness"),
  outcome = "choice",
  obsID   = "obsID",
  nbreaks = 10,
  n_q     = 6
)
#> Estimating models using 3 cores... done!

head(power)
#>   sampleSize               coef est se
#> 1         30              price  NA NA
#> 2         30           typeGala  NA NA
#> 3         30     typeHoneycrisp  NA NA
#> 4         30   freshnessAverage  NA NA
#> 5         30 freshnessExcellent  NA NA
#> 6         60              price  NA NA

power
#>    sampleSize               coef est se
#>  1         30              price  NA NA
#>  2         30           typeGala  NA NA
#>  3         30     typeHoneycrisp  NA NA
#>  4         30   freshnessAverage  NA NA
#>  5         30 freshnessExcellent  NA NA
#>  6         60              price  NA NA
#>  7         60           typeGala  NA NA
#>  8         60     typeHoneycrisp  NA NA
#>  9         60   freshnessAverage  NA NA
#> 10         60 freshnessExcellent  NA NA
#> 11         90              price  NA NA
#> 12         90           typeGala  NA NA
#> 13         90     typeHoneycrisp  NA NA
#> 14         90   freshnessAverage  NA NA
#> 15         90 freshnessExcellent  NA NA
#> 16        120              price  NA NA
#> 17        120           typeGala  NA NA
#> 18        120     typeHoneycrisp  NA NA
#> 19        120   freshnessAverage  NA NA
#> 20        120 freshnessExcellent  NA NA
#> 21        150              price  NA NA
#> 22        150           typeGala  NA NA
#> 23        150     typeHoneycrisp  NA NA
#> 24        150   freshnessAverage  NA NA
#> 25        150 freshnessExcellent  NA NA
#> 26        180              price  NA NA
#> 27        180           typeGala  NA NA
#> 28        180     typeHoneycrisp  NA NA
#> 29        180   freshnessAverage  NA NA
#> 30        180 freshnessExcellent  NA NA
#> 31        210              price  NA NA
#> 32        210           typeGala  NA NA
#> 33        210     typeHoneycrisp  NA NA
#> 34        210   freshnessAverage  NA NA
#> 35        210 freshnessExcellent  NA NA
#> 36        240              price  NA NA
#> 37        240           typeGala  NA NA
#> 38        240     typeHoneycrisp  NA NA
#> 39        240   freshnessAverage  NA NA
#> 40        240 freshnessExcellent  NA NA
#> 41        270              price  NA NA
#> 42        270           typeGala  NA NA
#> 43        270     typeHoneycrisp  NA NA
#> 44        270   freshnessAverage  NA NA
#> 45        270 freshnessExcellent  NA NA
#> 46        300              price  NA NA
#> 47        300           typeGala  NA NA
#> 48        300     typeHoneycrisp  NA NA
#> 49        300   freshnessAverage  NA NA
#> 50        300 freshnessExcellent  NA NA

Any other ideas to work around or through this? This is a useful function that I would like to use.

Look forward to your help.

Regards -- Steve

jhelvy commented 1 year ago

Steve, I've replicated the code in a Google Colab notebook here: https://colab.research.google.com/drive/13LIuoyRexTzqsqcsNyriD03bB3VgFVsj?usp=sharing

You can see that with a fresh installation of cbcTools (v0.2.0), the example runs without producing NA values.

Can you copy-paste that exact code from the colab notebook and post the results? There must be an issue somewhere, perhaps just a simple typo? Can you confirm that you have installed v0.2.0 from CRAN?

STEVEAP28 commented 1 year ago

Hi John ---

Yes, I did a fresh installation of cbcTools (v0.2.0). I have uninstalled and reinstalled it about three times. Below is the output from the installation step alone.

Maybe you can see something there that is causing this NA issue, as I am still getting that result (see the results after the install step below as well).

install.packages('cbcTools')
#> Installing package into ‘C:/Users/Steve/Documents/R/win-library/4.0’
#> (as ‘lib’ is unspecified)
#> installing the source package ‘cbcTools’
#>
#> trying URL 'https://cran.rstudio.com/src/contrib/cbcTools_0.2.0.tar.gz'
#> Content type 'application/x-gzip' length 1531222 bytes (1.5 MB)
#> downloaded 1.5 MB
#>
#> The downloaded source packages are in
#>   ‘C:\Users\Steve\AppData\Local\Temp\RtmpsdYCLM\downloaded_packages’

set.seed(1234)
library(cbcTools)
#> Version:  0.2.0
#> Author:   John Paul Helveston (George Washington University)
#>
#> Consider submitting praise at https://github.com/jhelvy/cbcTools/issues/3.
#>
#> Please cite the package in your publications, see: citation("cbcTools")


Below are the colab notebook code and Results:

# A simple conjoint experiment about apples

# Generate all possible profiles
profiles <- cbc_profiles(
  price     = c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5),
  type      = c("Fuji", "Gala", "Honeycrisp"),
  freshness = c('Poor', 'Average', 'Excellent')
)

# Make a randomized survey design
design <- cbc_design(
  profiles = profiles,
  n_resp   = 300, # Number of respondents
  n_alts   = 3, # Number of alternatives per question
  n_q      = 6 # Number of questions per respondent
)

# Simulate random choices
data <- cbc_choices(
  design = design,
  obsID  = "obsID"
)

# Conduct a power analysis
power <- cbc_power(
  data    = data,
  pars    = c("price", "type", "freshness"),
  outcome = "choice",
  obsID   = "obsID",
  nbreaks = 10,
  n_q     = 6
)
#> Estimating models using 3 cores... done!

head(power)
#>   sampleSize               coef est se
#> 1         30              price  NA NA
#> 2         30           typeGala  NA NA
#> 3         30     typeHoneycrisp  NA NA
#> 4         30   freshnessAverage  NA NA
#> 5         30 freshnessExcellent  NA NA
#> 6         60              price  NA NA
tail(power)
#>    sampleSize               coef est se
#> 45        270 freshnessExcellent  NA NA
#> 46        300              price  NA NA
#> 47        300           typeGala  NA NA
#> 48        300     typeHoneycrisp  NA NA
#> 49        300   freshnessAverage  NA NA
#> 50        300 freshnessExcellent  NA NA


Any other thoughts or ideas?

Regards --

Steve P.

STEVEAP28 commented 1 year ago

John --

I just submitted a post about this via GitHub.

Regards -- Steve P.


jhelvy commented 1 year ago

I'm at a bit of a loss here. I've run the same code on multiple machines and never get NA values.

Can you run sessionInfo() on your machine and copy over what is printed out? My only guess is that you have a version issue somewhere, perhaps some dependency that isn't updated?

This shouldn't be a problem as cbcTools should update all needed packages when installing, but perhaps I have an error in there somewhere.
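
A compact complement to the full sessionInfo() printout is a table of installed versions for just the packages of interest. The sketch below relies only on base R. `installed_versions` is a hypothetical helper, and the package names passed to it are examples to be replaced with whichever dependencies are under suspicion.

```r
# Hypothetical helper: installed version of each package, NA if not installed
installed_versions <- function(pkgs) {
  ip <- utils::installed.packages()
  versions <- ip[match(pkgs, rownames(ip)), "Version"]
  names(versions) <- pkgs
  versions
}

# Example package names; replace with the dependencies you want to inspect
pkgs <- c("cbcTools", "logitr", "dplyr", "tibble", "rlang")
print(installed_versions(pkgs))
```

Running the same call on a machine that works and on one that does not gives two short vectors that are easy to compare side by side.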

STEVEAP28 commented 1 year ago

John --

Below is the sessionInfo() Output. Maybe something there (I hope).

sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 7 x64 (build 7601) Service Pack 1
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> other attached packages:
#> [1] cbcTools_0.2.0
#>
#> loaded via a namespace (and not attached):
#>  [1] dplyr_1.0.1      crayon_1.3.4     grid_4.0.2       R6_2.4.1         lifecycle_1.0.0
#>  [6] gtable_0.3.0     magrittr_1.5     scales_1.1.1     ggplot2_3.3.5    pillar_1.4.6
#> [11] rlang_0.4.10     rstudioapi_0.13  generics_0.0.2   vctrs_0.3.2      ellipsis_0.3.1
#> [16] logitr_1.0.1     tools_4.0.2      glue_1.4.1       purrr_0.3.4      munsell_0.5.0
#> [21] parallel_4.0.2   compiler_4.0.2   pkgconfig_2.0.3  colorspace_1.4-1 tidyselect_1.1.0
#> [26] tibble_3.0.3


jhelvy commented 1 year ago

Okay this all looks good. The only thing slightly out of date is the R version, but it shouldn't be an issue. Maybe try updating to the latest R? I can't imagine that would change anything though.

My only other suggestion at this point is to perhaps screen record yourself walking through this example and post it on youtube. Maybe I'll catch something watching it run on your machine.

But really I'm stuck if I can't get the error to occur. The fact that it runs without NAs on the colab notebook suggests that the issue is probably something on your machine specifically (e.g. an outdated package). But even that looks okay based on the list I'm seeing here.

STEVEAP28 commented 1 year ago

John --

I think I may have figured it out with one of your suggestions -- an Outdated Package.

I looked at the package DEPENDENCIES and examined to see which ones needed to be Updated and then did so accordingly.
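
For anyone repeating this step, the dependency list can be read from the local library and passed to update.packages(). This is a sketch under assumptions, not the exact procedure used in this thread: `installed_deps` is a hypothetical helper, and the update call is wrapped in interactive() because it needs a CRAN connection.

```r
# Hypothetical helper: recursive dependencies of an installed package,
# resolved from the local library (no network access needed)
installed_deps <- function(pkg) {
  db <- utils::installed.packages()
  if (!pkg %in% rownames(db)) {
    return(character(0))  # package is not installed locally
  }
  deps <- tools::package_dependencies(pkg, db = db, recursive = TRUE)[[pkg]]
  # Keep only dependencies that are present in the local library
  intersect(deps, rownames(db))
}

deps <- installed_deps("cbcTools")
print(deps)

# Updating requires a CRAN connection, so only do it in an interactive session
if (interactive() && length(deps) > 0) {
  update.packages(oldPkgs = deps, ask = FALSE)
}
```

Restart R after updating so that the new versions are the ones loaded.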

However, the one difference I see is that my estimates and standard errors do not match those in the Colab notebook, even though I set the same seed (1234).

Thanks for your assistance with your cbcTools package.

Regards -- Steve P.


jhelvy commented 1 year ago

Fantastic! Really glad we figured out the dependency issue. That said, this is still concerning as the packages should be updated upon installing the package with install.packages('cbcTools'). So I'll have to take a look at what's going on there.

As for the differences, I'll take a look as well to see if I can figure out the source of the issue. Can you post your SE results here?

STEVEAP28 commented 1 year ago

John --

Yes -- good news!!

Below is the Output for the entire run of the code in question that you wanted to see.

Regards -- Steve P.

> set.seed(1234)
> library(cbcTools)
Registered S3 methods overwritten by 'tibble':
  method     from
  format.tbl pillar
  print.tbl  pillar
Version:  0.2.0
Author:   John Paul Helveston (George Washington University)

Consider submitting praise at https://github.com/jhelvy/cbcTools/issues/3.

Please cite the package in your publications, see: citation("cbcTools")
Warning messages:
1: replacing previous import ‘ellipsis::check_dots_unnamed’ by ‘rlang::check_dots_unnamed’ when loading ‘tibble’
2: replacing previous import ‘ellipsis::check_dots_used’ by ‘rlang::check_dots_used’ when loading ‘tibble’
3: replacing previous import ‘ellipsis::check_dots_empty’ by ‘rlang::check_dots_empty’ when loading ‘tibble’
>
> # A simple conjoint experiment about apples
>
> # Generate all possible profiles
> profiles <- cbc_profiles(
+   price     = c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5),
+   type      = c("Fuji", "Gala", "Honeycrisp"),
+   freshness = c('Poor', 'Average', 'Excellent')
+ )
>
> # Make a randomized survey design
> design <- cbc_design(
+   profiles = profiles,
+   n_resp   = 300, # Number of respondents
+   n_alts   = 3, # Number of alternatives per question
+   n_q      = 6 # Number of questions per respondent
+ )
>
> # Simulate random choices
> data <- cbc_choices(
+   design = design,
+   obsID  = "obsID"
+ )
>
>
> # Conduct a power analysis
> power <- cbc_power(
+   data    = data,
+   pars    = c("price", "type", "freshness"),
+   outcome = "choice",
+   obsID   = "obsID",
+   nbreaks = 10,
+   n_q     = 6
+ )
Estimating models using 3 cores...done!
>
> head(power)
  sampleSize               coef         est         se
1         30              price  0.05156809 0.06744069
2         30           typeGala  0.07141222 0.24363835
3         30     typeHoneycrisp  0.28785009 0.22567518
4         30   freshnessAverage -0.13833523 0.22695078
5         30 freshnessExcellent -0.43825846 0.23193178
6         60              price  0.06345388 0.04915863
> tail(power)
   sampleSize               coef         est         se
45        270 freshnessExcellent -0.07822489 0.07474956
46        300              price  0.02910679 0.02198652
47        300           typeGala -0.04227355 0.07151511
48        300     typeHoneycrisp -0.01387631 0.07098512
49        300   freshnessAverage  0.03019946 0.07031702
50        300 freshnessExcellent -0.05680868 0.07079312
>


jhelvy commented 1 year ago

More mysteries to solve... once again tricky, as I am getting the same values on my local machine as those in the Colab notebook (https://colab.research.google.com/drive/13LIuoyRexTzqsqcsNyriD03bB3VgFVsj?usp=sharing).

So once again I suspect there is a very small difference between our system configurations leading to slight differences. Given how small the differences are though, I suspect this is the result of a difference in the random number generator. It may be that the same seed produces slight differences depending on the version of R you're using, or perhaps some other package. All that said, I'm not too concerned here as these results are quite similar.
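
Whether two machines draw the same random numbers from the same seed depends on the generator settings as well as the seed itself. The sketch below uses only base R: it prints the active settings with RNGkind() and shows that pinning them explicitly makes a draw repeatable. `draw_with_fixed_rng` is a hypothetical helper. This does not establish that RNG settings caused the difference seen here; differing package versions and the parallel model estimation are other possible sources.

```r
# Show the generator settings currently in use:
# uniform generator, normal generator, and sample() method
print(RNGkind())

# Hypothetical helper: draw with the seed and all three RNG settings pinned
draw_with_fixed_rng <- function(seed, n = 5) {
  set.seed(
    seed,
    kind        = "Mersenne-Twister",
    normal.kind = "Inversion",
    sample.kind = "Rejection"  # default since R 3.6.0; older versions used "Rounding"
  )
  sample(100, n)
}

first  <- draw_with_fixed_rng(1234)
second <- draw_with_fixed_rng(1234)

# Identical settings and seed give identical draws
print(identical(first, second))
```

Comparing the RNGkind() output from both machines is a cheap first check before looking at anything more involved.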

I do, however, need to update the package DESCRIPTION file (https://github.com/jhelvy/cbcTools/blob/main/DESCRIPTION) to make sure the appropriate dependencies are in place. Do you remember which specific packages you updated? If so, I can require a minimum version of each so that when users install {cbcTools}, the appropriate dependency versions are installed with it.
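
Until minimum versions are declared in DESCRIPTION, a user can check them by hand. The sketch below compares installed versions against a set of minimums using base R's packageVersion(). `meets_minimum` is a hypothetical helper, and the version numbers are placeholders for illustration, not requirements stated anywhere in this thread.

```r
# Hypothetical helper: TRUE if pkg is installed at version >= min_version
meets_minimum <- function(pkg, min_version) {
  installed <- tryCatch(
    utils::packageVersion(pkg),
    error = function(e) NULL  # package is not installed
  )
  !is.null(installed) && installed >= package_version(min_version)
}

# Placeholder minimums for illustration only
minimums <- c(logitr = "1.0.0", dplyr = "1.0.0")

ok <- vapply(
  names(minimums),
  function(pkg) meets_minimum(pkg, minimums[[pkg]]),
  logical(1)
)
print(ok)
```

In a DESCRIPTION file the same constraint is written as a version requirement on the Imports entry, for example `logitr (>= 1.0.0)`, again with a placeholder number.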

STEVEAP28 commented 1 year ago

John -- Yes, perhaps the seed generates different random numbers depending on the version of R. There are always new adventures when working with open-source tools, I guess.

Anyway -- the packages I updated are the ones I bolded below.

Regardless, users may need to make sure that some, if not all, of these dependencies are up to date.

Please note -- I also installed logitr the other day.

Thanks again. Let me know if you need anything else.

Regards -- Steve P.


Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

also installing the dependencies ‘zoo’, ‘sandwich’, ‘httpuv’, ‘xtable’, ‘fontawesome’, ‘sourcetools’, ‘later’, ‘promises’, ‘rbibutils’, ‘mvtnorm’, ‘gmm’, ‘Formula’, ‘shiny’, ‘Rcpp’, ‘Rdpack’, ‘tmvtnorm’, ‘dfidx’, ‘RcppArmadillo’, ‘nloptr’, ‘rngWELL’, ‘fastDummies’, ‘idefix’, ‘logitr’, ‘randtoolbox’
